[jira] [Commented] (LUCENE-8722) Simplify relate logic on EdgeTree

2019-03-12 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791339#comment-16791339
 ] 

Lucene/Solr QA commented on LUCENE-8722:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  0m 28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
32s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-8722 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12962097/LUCENE-8722.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon 
Sep 24 17:14:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / d8f2a02 |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018 |
| Default Java | 1.8.0_191 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-LUCENE-Build/173/testReport/ |
| modules | C: lucene/core U: lucene/core |
| Console output | 
https://builds.apache.org/job/PreCommit-LUCENE-Build/173/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Simplify relate logic on EdgeTree
> -
>
> Key: LUCENE-8722
> URL: https://issues.apache.org/jira/browse/LUCENE-8722
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ignacio Vera
>Priority: Trivial
> Attachments: LUCENE-8722.patch
>
>
> Currently Edge tree contains three methods for {{relate}}: relate, 
> internalComponentRelate and componentRelate.
> {{internalComponentRelate}} does not bring any benefit and it is trivial to 
> merge the logic it contains into the {{relate}} method.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-03-12 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791318#comment-16791318
 ] 

Shalin Shekhar Mangar edited comment on SOLR-13320 at 3/13/19 5:36 AM:
---

If the definition of duplicate is just having the same id then that can also be 
done today using optimistic concurrency. Use a negative value for the version. 
See 
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-OptimisticConcurrency

If duplicate depends on the content of the document then you need to use the 
SignatureUpdateProcessorFactory
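
As a rough sketch of the id-based approach with SolrJ (the URL, collection, and field names below are made up; the negative {{_version_}} behavior is the one described in the linked guide):

{code:java}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class CreateOnlyExample {
  public static void main(String[] args) throws Exception {
    try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-1");
      doc.addField("title_s", "first version");
      // A negative _version_ tells Solr the document must NOT already exist;
      // if a document with this id is present, the add fails with a version
      // conflict instead of overwriting it.
      doc.addField("_version_", -1L);
      client.add("mycollection", doc);
      client.commit("mycollection");
    }
  }
}
{code}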


was (Author: shalinmangar):
If the definition of duplicate is just having the same id then that can also be 
done today using optimistic concurrency. Use {{_version_}} with a negative 
value. See 
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-OptimisticConcurrency

If duplicate depends on the content of the document then you need to use the 
SignatureUpdateProcessorFactory

> add a param ignoreDuplicates=true to updates to not overwrite existing docs
> ---
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore duplicate documents and drop them if 
> an option  {{ignoreDuplicates=true}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-03-12 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791318#comment-16791318
 ] 

Shalin Shekhar Mangar commented on SOLR-13320:
--

If the definition of duplicate is just having the same id then that can also be 
done today using optimistic concurrency. Use `_version_` with a negative value. 
See 
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-OptimisticConcurrency

If duplicate depends on the content of the document then you need to use the 
SignatureUpdateProcessorFactory

> add a param ignoreDuplicates=true to updates to not overwrite existing docs
> ---
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore duplicate documents and drop them if 
> an option  {{ignoreDuplicates=true}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13131) Category Routed Aliases

2019-03-12 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791320#comment-16791320
 ] 

Gus Heck commented on SOLR-13131:
-

Pushed to master, but it's getting late here; I will do 8x in the morning, 
assuming tests pass.

> Category Routed Aliases
> ---
>
> Key: SOLR-13131
> URL: https://issues.apache.org/jira/browse/SOLR-13131
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: master (9.0)
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
> Attachments: indexingWithCRA.png, indexingwithoutCRA.png, 
> indexintWithoutCRA2.png
>
>
> This ticket is to add a second type of routed alias in addition to the 
> current time routed aliases. The new type of alias will allow data driven 
> creation of collections based on the values of a field and automated 
> organization of these collections under an alias that allows the collections 
> to also be searched as a whole.
> The use case in mind at present is an IOT device type segregation, but I 
> could also see this leading to the ability to direct updates to tenant 
> specific hardware (in cooperation with autoscaling). 
> This ticket also looks forward to (but does not include) the creation of a 
> Dimensionally Routed Alias which would allow organizing time routed data also 
> segregated by device
> Further design details to be added in comments.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-03-12 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791318#comment-16791318
 ] 

Shalin Shekhar Mangar edited comment on SOLR-13320 at 3/13/19 5:35 AM:
---

If the definition of duplicate is just having the same id then that can also be 
done today using optimistic concurrency. Use {{_version_}} with a negative 
value. See 
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-OptimisticConcurrency

If duplicate depends on the content of the document then you need to use the 
SignatureUpdateProcessorFactory


was (Author: shalinmangar):
If the definition of duplicate is just having the same id then that can also be 
done today using optimistic concurrency. Use `_version_` with a negative value. 
See 
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-OptimisticConcurrency

If duplicate depends on the content of the document then you need to use the 
SignatureUpdateProcessorFactory

> add a param ignoreDuplicates=true to updates to not overwrite existing docs
> ---
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore duplicate documents and drop them if 
> an option  {{ignoreDuplicates=true}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13131) Category Routed Aliases

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791314#comment-16791314
 ] 

ASF subversion and git services commented on SOLR-13131:


Commit d8f2a02fdb11a484425f9fddfa7061711d2f0034 in lucene-solr's branch 
refs/heads/master from Gus Heck
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d8f2a02 ]

SOLR-13131 Category Routed Aliases


> Category Routed Aliases
> ---
>
> Key: SOLR-13131
> URL: https://issues.apache.org/jira/browse/SOLR-13131
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: master (9.0)
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
> Attachments: indexingWithCRA.png, indexingwithoutCRA.png, 
> indexintWithoutCRA2.png
>
>
> This ticket is to add a second type of routed alias in addition to the 
> current time routed aliases. The new type of alias will allow data driven 
> creation of collections based on the values of a field and automated 
> organization of these collections under an alias that allows the collections 
> to also be searched as a whole.
> The use case in mind at present is an IOT device type segregation, but I 
> could also see this leading to the ability to direct updates to tenant 
> specific hardware (in cooperation with autoscaling). 
> This ticket also looks forward to (but does not include) the creation of a 
> Dimensionally Routed Alias which would allow organizing time routed data also 
> segregated by device
> Further design details to be added in comments.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13320) add a param overwrite=false to updates to not overwrite existing docs

2019-03-12 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-13320:
--
Description: Updates should have an option to ignore duplicate documents 
and drop them if an option  {{ignoreDuplicates=true}} is specified  (was: 
Updates should have an option to ignore overwrites if an option  
{{overwrite=false}} is specified)

> add a param overwrite=false to updates to not overwrite existing docs
> -
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore duplicate documents and drop them if 
> an option  {{ignoreDuplicates=true}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13320) add a param overwrite=false to updates to not overwrite existing docs

2019-03-12 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791312#comment-16791312
 ] 

Noble Paul commented on SOLR-13320:
---

Sorry, it's actually a different functionality.

Basically, I want to ignore any duplicate incoming documents. I'll change the 
param name.

> add a param overwrite=false to updates to not overwrite existing docs
> -
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore overwrites if an option  
> {{overwrite=false}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-03-12 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-13320:
--
Summary: add a param ignoreDuplicates=true to updates to not overwrite 
existing docs  (was: add a param overwrite=false to updates to not overwrite 
existing docs)

> add a param ignoreDuplicates=true to updates to not overwrite existing docs
> ---
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore duplicate documents and drop them if 
> an option  {{ignoreDuplicates=true}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13320) add a param overwrite=false to updates to not overwrite existing docs

2019-03-12 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791306#comment-16791306
 ] 

Shalin Shekhar Mangar commented on SOLR-13320:
--

All of Solr's update request handlers already support an overwrite parameter 
which defaults to true.

See 
https://lucene.apache.org/solr/guide/7_5/uploading-data-with-index-handlers.html#uploading-data-with-index-handlers
 and search for "overwrite"
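
For reference, a minimal SolrJ sketch of passing that parameter on an update (the URL, collection, and field names are illustrative):

{code:java}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class OverwriteFalseExample {
  public static void main(String[] args) throws Exception {
    try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-1");

      UpdateRequest req = new UpdateRequest();
      req.add(doc);
      // overwrite defaults to true; with false, Solr does not replace an
      // existing document that has the same uniqueKey.
      req.setParam("overwrite", "false");
      req.process(client, "mycollection");
      client.commit("mycollection");
    }
  }
}
{code}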

> add a param overwrite=false to updates to not overwrite existing docs
> -
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore overwrites if an option  
> {{overwrite=false}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] moshebla commented on issue #549: WIP:SOLR-13129

2019-03-12 Thread GitBox
moshebla commented on issue #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#issuecomment-472283044
 
 
   Sorry I've been MIA.
   It'd be nice if you could add the finishing touches to the warning, since I 
am newer to SOLR and am not quite as aware of the other implications this could 
cause.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13242) RegexReplaceProcessorFactory not making accurate replacement

2019-03-12 Thread Edwin Yeo Zheng Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791256#comment-16791256
 ] 

Edwin Yeo Zheng Lin commented on SOLR-13242:


We have managed to resolve the issue by changing the \s to \W. The reason could 
be that some of the characters are other white-space characters instead of just 
a plain space. Using \s will only remove the plain spaces and not those other 
white-space characters, but using \W will remove them as well.
 
We have used this config, and it works.
 
  
    content
    (\n\W*)\{2,}
    brbr
    true
  
  
    content
    (\n\W*)\{1,}
    br
    true
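
The XML tags in the config above were stripped by the mail archive. Based on the parameter names documented for RegexReplaceProcessorFactory, the working configuration presumably looked roughly like the following; the <br> replacement values are inferred from the surviving "brbr"/"br" text.

{code:xml}
<processor class="solr.RegexReplaceProcessorFactory">
  <str name="fieldName">content</str>
  <str name="pattern">(\n\W*){2,}</str>
  <!-- replacement is the literal text "<br><br>"; it must be XML-escaped in solrconfig.xml -->
  <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
  <bool name="literalReplacement">true</bool>
</processor>
<processor class="solr.RegexReplaceProcessorFactory">
  <str name="fieldName">content</str>
  <str name="pattern">(\n\W*){1,}</str>
  <str name="replacement">&lt;br&gt;</str>
  <bool name="literalReplacement">true</bool>
</processor>
{code}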
  
 
 

> RegexReplaceProcessorFactory not making accurate replacement
> 
>
> Key: SOLR-13242
> URL: https://issues.apache.org/jira/browse/SOLR-13242
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.6, 7.7, 7.7.1
>Reporter: Edwin Yeo Zheng Lin
>Priority: Major
>  Labels: regex, solr
>
> We are using the RegexReplaceProcessorFactory, and have tried with all of the 
> following configurations in solrconfig.xml:
>  
> 
>     content
>     (\s*\r?\n)\{2,}
>     
>     true
>   
> 
>     content
>     ([ \s]*\r?\n)\{2,}
>     
>     true
>   
>  
>     content
>     (\s*\n)\{2,}
>     
>     true
>   
>  
>     content
>     (\n\s*)\{2,}
>     
>     true
>   
>  
> The regex pattern of (\s*\r?\n)\{2,}, ([ \s]*\r?\n)\{2,}, (\s*\n)\{2,} and 
> (\n\s*)\{2,} are working perfectly in [regex101.com|http://regex101.com/], in 
> which all the \n will be replaced by only two 
> However, in Solr, there are cases (in Example 2 and 3 below) that has four 
>  in a row. This should not be the case, as we have already set it to 
> replace by two  regardless of how many \n are there in a row.
>  
>  
> *Example 1: The sentence that the above regex pattern is working correctly* 
> *Original content in EML [file:*|file://%2A/]  
> Dear Sir, 
>  
> I am terminating 
> *Original content:*    Dear Sir,  \n\n \n \n\n I am terminating
> *Index content:*     Dear Sir,  I am terminating 
>  
> *Example 2: The sentence that the above regex pattern is partially working 
> (as you can see, instead of 2 , there are 4 )*
> *Original content in EML [file:*|file://%2A/]
> _exalted_
> _Psalm 89:17_
>  
> 3 Choa Chu Kang Avenue 4    
> *Original content:* exalted  \n \n\n   Psalm 89:17   \n\n   \n\n  3 Choa Chu 
> Kang Avenue 4, Singapore
> *Index content:* exalted  Psalm 89:17     3 Choa Chu 
> Kang Avenue 4, Singapore
>  
> *Example 3: The sentence that the above regex pattern is partially working 
> (as you can see, instead of 2 , there are 4 )*
> *Original content in EML [file:*|file://%2A/]
> [http://www.concordpri.moe.edu.sg/]
>  
>  
>  
>  
> On Tue, Dec 18, 2018 at 10:07 AM    
> *Original content:* [http://www.concordpri.moe.edu.sg/]   \n\n   \n\n \n \n\n 
> \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n\n \n\n\n  On Tue, Dec 18, 2018 
> at 10:07 AM 
> *Index content:* [http://www.concordpri.moe.edu.sg/]     On 
> Tue, Dec 18, 2018 at 10:07 AM



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-8.0 - Build # 28 - Unstable

2019-03-12 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-8.0/28/

1 tests failed.
FAILED:  org.apache.solr.cloud.OverseerTest.testShardLeaderChange

Error Message:
Captured an uncaught exception in thread: Thread[id=235, 
name=OverseerCollectionConfigSetProcessor-74336431539879938-127.0.0.1:42131_solr-n_00,
 state=RUNNABLE, group=Overseer collection creation process.]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=235, 
name=OverseerCollectionConfigSetProcessor-74336431539879938-127.0.0.1:42131_solr-n_00,
 state=RUNNABLE, group=Overseer collection creation process.]
at 
__randomizedtesting.SeedInfo.seed([A63CB12CBE10F5F0:786F36DBA4880001]:0)
Caused by: org.apache.solr.common.AlreadyClosedException
at __randomizedtesting.SeedInfo.seed([A63CB12CBE10F5F0]:0)
at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:69)
at 
org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:358)
at 
org.apache.solr.cloud.OverseerTaskProcessor.amILeader(OverseerTaskProcessor.java:416)
at 
org.apache.solr.cloud.OverseerTaskProcessor.run(OverseerTaskProcessor.java:156)
at java.lang.Thread.run(Thread.java:748)




Build Log:
[...truncated 12689 lines...]
   [junit4] Suite: org.apache.solr.cloud.OverseerTest
   [junit4]   2> Creating dataDir: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-8.0/solr/build/solr-core/test/J2/temp/solr.cloud.OverseerTest_A63CB12CBE10F5F0-001/init-core-data-001
   [junit4]   2> 129197 WARN  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.SolrTestCaseJ4 
startTrackingSearchers: numOpens=9 numCloses=9
   [junit4]   2> 129356 INFO  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.SolrTestCaseJ4 
Using PointFields (NUMERIC_POINTS_SYSPROP=true) 
w/NUMERIC_DOCVALUES_SYSPROP=false
   [junit4]   2> 129357 INFO  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.SolrTestCaseJ4 
Randomized ssl (true) and clientAuth (false) via: 
@org.apache.solr.util.RandomizeSSL(reason=, ssl=NaN, value=NaN, clientAuth=NaN)
   [junit4]   2> 129502 INFO  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.SolrTestCaseJ4 
SecureRandom sanity checks: test.solr.allowed.securerandom=null & 
java.security.egd=file:/dev/./urandom
   [junit4]   2> 129760 INFO  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.c.ZkTestServer 
STARTING ZK TEST SERVER
   [junit4]   2> 129858 INFO  (ZkTestServer Run Thread) [] 
o.a.s.c.ZkTestServer client port:0.0.0.0/0.0.0.0:0
   [junit4]   2> 129859 INFO  (ZkTestServer Run Thread) [] 
o.a.s.c.ZkTestServer Starting server
   [junit4]   2> 130559 INFO  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.c.ZkTestServer 
start zk server on port:42131
   [junit4]   2> 130559 INFO  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.c.ZkTestServer 
parse host and port list: 127.0.0.1:42131
   [junit4]   2> 130560 INFO  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.c.ZkTestServer 
connecting to 127.0.0.1 42131
   [junit4]   2> 131681 INFO  (zkConnectionManagerCallback-41-thread-1) [] 
o.a.s.c.c.ConnectionManager zkClient has connected
   [junit4]   2> 132127 INFO  (zkConnectionManagerCallback-43-thread-1) [] 
o.a.s.c.c.ConnectionManager zkClient has connected
   [junit4]   2> 132131 INFO  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.SolrTestCaseJ4 
initCore
   [junit4]   2> 132131 INFO  
(SUITE-OverseerTest-seed#[A63CB12CBE10F5F0]-worker) [] o.a.s.SolrTestCaseJ4 
initCore end
   [junit4]   2> 132168 INFO  
(TEST-OverseerTest.testShardLeaderChange-seed#[A63CB12CBE10F5F0]) [] 
o.a.s.SolrTestCaseJ4 ###Starting testShardLeaderChange
   [junit4]   2> 133392 INFO  (zkConnectionManagerCallback-45-thread-1) [] 
o.a.s.c.c.ConnectionManager zkClient has connected
   [junit4]   2> 133670 WARN  (Thread-43) [] o.a.s.c.s.i.Http2SolrClient 
Create Http2SolrClient with HTTP/1.1 transport since Java 8 or lower versions 
does not support SSL + HTTP/2
   [junit4]   2> 134073 WARN  
(TEST-OverseerTest.testShardLeaderChange-seed#[A63CB12CBE10F5F0]) [] 
o.a.s.c.s.i.Http2SolrClient Create Http2SolrClient with HTTP/1.1 transport 
since Java 8 or lower versions does not support SSL + HTTP/2
   [junit4]   2> 134074 WARN  (Thread-43) [] o.e.j.u.s.S.config No Client 
EndPointIdentificationAlgorithm configured for 
SslContextFactory@2e841338[provider=null,keyStore=null,trustStore=null]
   [junit4]   2> 134142 WARN  
(TEST-OverseerTest.testShardLeaderChange-seed#[A63CB12CBE10F5F0]) [] 
o.e.j.u.s.S.config No Client EndPointIdentificationAlgorithm configured for 
SslContextFactory@58bb2be1[provider=null,keyStore=null,trustStore=null]
   [junit4]   2> 134226 INFO  (zkConnectionManagerCallback-57-thread-1) [] 
o.a.s.c.c.ConnectionManager zkClient has 

[JENKINS] Lucene-Solr-master-Windows (32bit/jdk1.8.0_172) - Build # 7773 - Unstable!

2019-03-12 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/7773/
Java: 32bit/jdk1.8.0_172 -server -XX:+UseSerialGC

1 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.client.solrj.impl.CloudHttp2SolrClientTest

Error Message:
ObjectTracker found 4 object(s) that were not released!!! 
[MockDirectoryWrapper, MockDirectoryWrapper, MockDirectoryWrapper, SolrCore] 
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MockDirectoryWrapper  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:348)
  at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:99)  at 
org.apache.solr.core.SolrCore.initIndex(SolrCore.java:779)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:976)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:883)  at 
org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1192)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1102)  at 
org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:92)
  at 
org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:360)
  at 
org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:396)
  at 
org.apache.solr.handler.admin.CoreAdminHandler.lambda$handleRequestBody$0(CoreAdminHandler.java:188)
  at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
 at java.lang.Thread.run(Thread.java:748)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MockDirectoryWrapper  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:348)
  at 
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:509)  
at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:351) 
 at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:422) 
 at 
org.apache.solr.handler.ReplicationHandler.lambda$setupPolling$13(ReplicationHandler.java:1191)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)  
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)  at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
  at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
 at java.lang.Thread.run(Thread.java:748)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MockDirectoryWrapper  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:348)
  at 
org.apache.solr.core.SolrCore.initSnapshotMetaDataManager(SolrCore.java:517)  
at org.apache.solr.core.SolrCore.(SolrCore.java:968)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:883)  at 
org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1192)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1102)  at 
org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:92)
  at 
org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:360)
  at 
org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:396)
  at 
org.apache.solr.handler.admin.CoreAdminHandler.lambda$handleRequestBody$0(CoreAdminHandler.java:188)
  at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
 at java.lang.Thread.run(Thread.java:748)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.solr.core.SolrCore  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at org.apache.solr.core.SolrCore.(SolrCore.java:1063)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:883)  at 

[jira] [Updated] (SOLR-13320) add a param overwrite=false to updates to not overwrite existing docs

2019-03-12 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-13320:
--
Issue Type: New Feature  (was: Improvement)

> add a param overwrite=false to updates to not overwrite existing docs
> -
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore overwrites if an option  
> {{overwrite=false}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-13320) add a param overwrite=false to updates to not overwrite existing docs

2019-03-12 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul reassigned SOLR-13320:
-

Assignee: Noble Paul

> add a param overwrite=false to updates to not overwrite existing docs
> -
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore overwrites if an option  
> {{overwrite=false}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-13320) add a param overwrite=false to updates to not overwrite existing docs

2019-03-12 Thread Noble Paul (JIRA)
Noble Paul created SOLR-13320:
-

 Summary: add a param overwrite=false to updates to not overwrite 
existing docs
 Key: SOLR-13320
 URL: https://issues.apache.org/jira/browse/SOLR-13320
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul


Updates should have an option to ignore overwrites if an option  
{{overwrite=false}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-13319) HashBasedRouter prints out the whole state,json instead of collection name

2019-03-12 Thread Noble Paul (JIRA)
Noble Paul created SOLR-13319:
-

 Summary: HashBasedRouter prints out the whole state,json instead 
of collection name
 Key: SOLR-13319
 URL: https://issues.apache.org/jira/browse/SOLR-13319
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul
Assignee: Noble Paul


{code}
  protected Slice hashToSlice(int hash, DocCollection collection) {
    final Slice[] slices = collection.getActiveSlicesArr();
    for (Slice slice : slices) {
      Range range = slice.getRange();
      if (range != null && range.includes(hash)) return slice;
    }
    throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
        "No active slice servicing hash code " + Integer.toHexString(hash) + " in " + collection);
  }
{code}

The exception prints out the collection object instead of just the collection name.
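
A minimal sketch of the likely fix, assuming the intent is simply to include the name (DocCollection.getName()) in the message rather than the whole cluster-state object:

{code:java}
throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
    "No active slice servicing hash code " + Integer.toHexString(hash)
        + " in " + collection.getName());  // name only, not the full state
{code}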



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8585) Create jump-tables for DocValues at index-time

2019-03-12 Thread Toke Eskildsen (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791049#comment-16791049
 ] 

Toke Eskildsen commented on LUCENE-8585:


I need to check my mail filters to discover why I did not see this before now. 
I apologize and thank [~romseygeek] and [~jpountz] for handling it.

> Create jump-tables for DocValues at index-time
> --
>
> Key: LUCENE-8585
> URL: https://issues.apache.org/jira/browse/LUCENE-8585
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 8.0
>Reporter: Toke Eskildsen
>Priority: Minor
>  Labels: performance
> Fix For: 8.0
>
> Attachments: LUCENE-8585.patch, LUCENE-8585.patch, 
> make_patch_lucene8585.sh
>
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> As noted in LUCENE-7589, lookup of DocValues should use jump-tables to avoid 
> long iterative walks. This is implemented in LUCENE-8374 at search-time 
> (first request for DocValues from a field in a segment), with the benefit of 
> working without changes to existing Lucene 7 indexes and the downside of 
> introducing a startup time penalty and a memory overhead.
> As discussed in LUCENE-8374, the codec should be updated to create these 
> jump-tables at index time. This eliminates the segment-open time & memory 
> penalties, with the potential downside of increasing index-time for DocValues.
> The three elements of LUCENE-8374 should be transferable to index-time 
> without much alteration of the core structures:
>  * {{IndexedDISI}} block offset and index skips: A {{long}} (64 bits) for 
> every 65536 documents, containing the offset of the block in 33 bits and the 
> index (number of set bits) up to the block in 31 bits.
>  It can be built sequentially and should be stored as a simple sequence of 
> consecutive longs for caching of lookups.
>  As it is fairly small, relative to document count, it might be better to 
> simply memory cache it?
>  * {{IndexedDISI}} DENSE (> 4095, < 65536 set bits) blocks: A {{short}} (16 
> bits) for every 8 {{longs}} (512 bits) for a total of 256 bytes/DENSE_block. 
> Each {{short}} represents the number of set bits up to right before the 
> corresponding sub-block of 512 docIDs.
>  The \{{shorts}} can be computed sequentially or when the DENSE block is 
> flushed (probably the easiest). They should be stored as a simple sequence of 
> consecutive shorts for caching of lookups, one logically independent sequence 
> for each DENSE block. The logical position would be one sequence at the start 
> of every DENSE block.
>  Whether it is best to read all the 16 {{shorts}} up front when a DENSE block 
> is accessed or whether it is best to only read any individual {{short}} when 
> needed is not clear at this point.
>  * Variable Bits Per Value: A {{long}} (64 bits) for every 16384 numeric 
> values. Each {{long}} holds the offset to the corresponding block of values.
>  The offsets can be computed sequentially and should be stored as a simple 
> sequence of consecutive {{longs}} for caching of lookups.
>  The vBPV-offsets have the largest space overhead of the 3 jump-tables and a 
> lot of the 64 bits in each long are not used for most indexes. They could be 
> represented as a simple {{PackedInts}} sequence or {{MonotonicLongValues}}, 
> with the downsides of a potential lookup-time overhead and the need for doing 
> the compression after all offsets have been determined.
> I have no experience with the codec-parts responsible for creating 
> index-structures. I'm quite willing to take a stab at this, although I 
> probably won't do much about it before January 2019. Should anyone else wish 
> to adopt this JIRA-issue or co-work on it, I'll be happy to share.
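
As a rough illustration of the first jump-table described in the quoted description, here is a hypothetical packing of a 33-bit block offset and a 31-bit cumulative index into one long; the actual bit layout chosen for IndexedDISI may differ.

{code:java}
// Hypothetical layout: high 33 bits = block offset, low 31 bits = set-bit count.
static long pack(long blockOffset, int index) {
  assert blockOffset >= 0 && blockOffset < (1L << 33);
  assert index >= 0;                     // a non-negative int fits in 31 bits
  return (blockOffset << 31) | index;
}

static long blockOffset(long entry) {
  return entry >>> 31;                   // upper 33 bits
}

static int index(long entry) {
  return (int) (entry & 0x7FFF_FFFFL);   // lower 31 bits
}
{code}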



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-EA] Lucene-Solr-8.x-Linux (64bit/jdk-12-ea+shipilev-fastdebug) - Build # 255 - Failure!

2019-03-12 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Linux/255/
Java: 64bit/jdk-12-ea+shipilev-fastdebug -XX:+UseCompressedOops 
-XX:+UseParallelGC

1 tests failed.
FAILED:  
org.apache.lucene.index.TestConcurrentMergeScheduler.testFlushExceptions

Error Message:


Stack Trace:
java.lang.AssertionError
at 
__randomizedtesting.SeedInfo.seed([F36A6A232719BDA1:470039F92EF92B95]:0)
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.lucene.index.TestConcurrentMergeScheduler.testFlushExceptions(TestConcurrentMergeScheduler.java:128)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.base/java.lang.Thread.run(Thread.java:835)




Build Log:
[...truncated 598 lines...]
   [junit4] Suite: org.apache.lucene.index.TestConcurrentMergeScheduler
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestConcurrentMergeScheduler -Dtests.method=testFlushExceptions 
-Dtests.seed=F36A6A232719BDA1 -Dtests.multiplier=3 -Dtests.slow=true 
-Dtests.locale=en-BM -Dtests.timezone=America/Argentina/Buenos_Aires 
-Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 0.24s J2 | TestConcurrentMergeScheduler.testFlushExceptions 
<<<
   [junit4]> Throwable #1: java.lang.AssertionError
   [junit4]>at 

[jira] [Commented] (SOLR-12122) nodes expression should support multiValued walk target

2019-03-12 Thread Jonathan Nightingale (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791014#comment-16791014
 ] 

Jonathan Nightingale commented on SOLR-12122:
-

I'm running into this limitation now myself, not only on the walk but also the 
gather part of the nodes stream definition. I put it up here:

[https://stackoverflow.com/questions/55130208/using-solr-graph-to-traverse-n-relationships]

Is there a path forward for this, or is this not something Solr plans to 
support? (I'd think it's important for most graph traversal solutions.)

Or is there a workaround using some other stream functions?

> nodes expression should support multiValued walk target
> ---
>
> Key: SOLR-12122
> URL: https://issues.apache.org/jira/browse/SOLR-12122
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Reporter: David Smiley
>Priority: Major
>
> The {{nodes}} streaming expression has a {{walk}} argument that articulates a 
> pair of Solr fields of the form {{traversalFrom->traversalTo}}.  It assumed 
> that they are *not* multiValued.  It _appears_ not difficult to add 
> multiValued support to traversalTo; that's what this issue is about.
> See 
> http://lucene.472066.n3.nabble.com/Using-multi-valued-field-in-solr-cloud-Graph-Traversal-Query-td4324379.html
> Note: {{gatherNodes}} appears to be the older name which is still supported. 
> It's more commonly known as {{nodes}}.  graph-traversal.adoc documents it
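
For context, the single-valued usage the walk argument supports today looks roughly like the example from the graph-traversal documentation (collection and field names are illustrative); this issue is about letting the traversalTo side be multiValued:

{code}
nodes(emails,
      walk="johndoe@apache.org->from",
      gather="to")
{code}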



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-13318) JsonFacetingResponse classes should record provide access to count fields as longs

2019-03-12 Thread Jason Gerlowski (JIRA)
Jason Gerlowski created SOLR-13318:
--

 Summary: JsonFacetingResponse classes should record  provide 
access to count fields as longs
 Key: SOLR-13318
 URL: https://issues.apache.org/jira/browse/SOLR-13318
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Affects Versions: 7.7.1
Reporter: Jason Gerlowski
Assignee: Jason Gerlowski


JsonFacetingResponse and its series of dependent classes hold a variety of 
count fields for bucket counts and various optional properties ({{allBuckets}}, 
{{numBuckets}}, etc.).  Currently, some of the code that parses these values 
out of the originating NamedList either stores or casts the values as ints.  
When doc counts are low this works fine.  But when the doc counts become larger 
and stray into "long" territory, SolrJ is liable to blow up with 
ClassCastExceptions.

A user on the list reported one of these with the partial stack trace:

{code}
Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to 
java.lang.Integer
  at 
org.apache.solr.client.solrj.response.json.NestableJsonFacet.(NestableJsonFacet.java:52)
  at 
org.apache.solr.client.solrj.response.QueryResponse.extractJsonFacetingInfo(QueryResponse.java:200)
  at 
org.apache.solr.client.solrj.response.QueryResponse.getJsonFacetingResponse(QueryResponse.java:571)
{code}

We should fix this so that these classes can be used without incident for any 
doc counts.
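
A minimal sketch of the kind of defensive parsing that avoids the cast (a hypothetical helper; the actual keys and fields used by NestableJsonFacet may differ):

{code:java}
import org.apache.solr.common.util.NamedList;

// Read a bucket count as a Number and widen to long so that both Integer
// and Long values coming back from the server are accepted.
static long readCount(NamedList<?> bucket, String key) {
  Object raw = bucket.get(key);
  return raw == null ? 0L : ((Number) raw).longValue();
}
{code}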



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12891) Injection Dangers in Streaming Expressions

2019-03-12 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790959#comment-16790959
 ] 

Gus Heck commented on SOLR-12891:
-

Certainly, thanks for catching that. I wish precommit would flag such 
things. Will fix.

> Injection Dangers in Streaming Expressions
> --
>
> Key: SOLR-12891
> URL: https://issues.apache.org/jira/browse/SOLR-12891
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Affects Versions: 7.5, 8.0
>Reporter: Gus Heck
>Priority: Minor
>  Labels: security
> Fix For: master (9.0), 8.1
>
> Attachments: SOLR-12891.patch, SOLR-12891.patch, SOLR-12891.patch, 
> SOLR-12891.patch, SOLR12819example.java
>
>
> I just spent some time fiddling with streaming expressions for fun, reading 
> Erick Erickson's blog 
> ([https://lucidworks.com/2017/12/06/streaming-expressions-in-solrj/)] and the 
> example given in the ref guide 
> ([https://lucene.apache.org/solr/guide/7_5/streaming-expressions.html#streaming-requests-and-responses)]
>  and it occurred to me that we are recommending string concatenation into an 
> expression language with the power to harm the server, or other network 
> services visible from the server. I'm starting this Jira as a security issue 
> to avoid creating a public impression of insecurity, feel free to undo that 
> if I have guessed wrong. I haven't developed an exploit example, but it would 
> go something like this:
>  # Some portion of an expression is built including user supplied data using 
> the techniques we're recommending in the ref guide
>  # Malicious user constructs input data that breaks out of the expression 
> (SOLR-10894 is relevant here), probably somewhere inside a let() expression 
> where one could simply define an additional variable taking the value of a 
> malicious expression...
>  # update() expression is executed to add/overwrite data, jdbc() makes a JDBC 
> connection to a database visible to the server, or the malicious expression 
> executes some very expensive expression for DOS effect.
> Technically this is of course the fault of the end user who allowed unchecked 
> input into programmatic execution, but when I think about how to check the 
> input I realize that the only way to be sure is to construct for myself a 
> notion of exactly how the parser behaves and then determine what needs to be 
> escaped. To do this I need to dig into the expression parser code...
> How to escape input is also already unclear as shown by SOLR-10894
> There's another important wrinkle that would easily be missed by someone 
> trying to construct their own escaping/protection system relating to 
> parameter substitution as discussed here: SOLR-8458 
> I think the solution to this is that SolrJ API should be enhanced to provide 
> an escaping utility at a minimum and possibly a "prepared expression" similar 
> to SQL prepared statements and call this issue to attention in the ref guide 
> once these tools are available... 
> Additionally, templating features might be a useful addition to help folks 
> manage large expressions and facilitate re-use of patterns... such templating 
> should also have this issue in mind when/if they are added.
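
To make the concatenation risk concrete, here is a hypothetical sketch of the pattern the ref-guide examples encourage (names are illustrative; the point of this issue is that no escaping utility exists yet to make this safe):

{code:java}
// 'userTerm' stands for untrusted user input, e.g. a request parameter.
// It is spliced directly into a streaming expression string; a crafted value
// can close the quoted query and append further expressions such as update()
// or jdbc(), which is the injection risk described above.
static String buildExpression(String userTerm) {
  return "search(collection1, q=\"name:" + userTerm + "\", fl=\"id\", sort=\"id asc\")";
}
{code}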



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8723) Bad interaction bewteen WordDelimiterGraphFilter, StopFilter and FlattenGraphFilter

2019-03-12 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/LUCENE-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolás Lichtmaier updated LUCENE-8723:
---
Description: 
I was debugging an issue (missing tokens after analysis) and when I enabled 
Java assertions I uncovered a bug when using WordDelimiterGraphFilter + 
StopFilter + FlattenGraphFilter.

I could reproduce the issue in a small piece of code. This code gives an 
assertion failure when assertions are enabled (-ea java option):

{code:java}
    Builder builder = CustomAnalyzer.builder();
    builder.withTokenizer(StandardTokenizerFactory.class);
    builder.addTokenFilter(WordDelimiterGraphFilterFactory.class, 
"preserveOriginal", "1");
    builder.addTokenFilter(StopFilterFactory.class);
    builder.addTokenFilter(FlattenGraphFilterFactory.class);
    Analyzer analyzer = builder.build();
     
    TokenStream ts = analyzer.tokenStream("*", new StringReader("x7in"));
    ts.reset();
    while(ts.incrementToken())
        ;
{code}

This gives:

{code}
Exception in thread "main" java.lang.AssertionError: 2
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:195)
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:258)
     at com.wolfram.textsearch.AnalyzerError.main(AnalyzerError.java:32)
{code}

Maybe removing stop words after WordDelimiterGraphFilter is wrong, I don't 
know. However, it is the only way to process stop-words generated by that 
filter. In any case, it should not eat tokens or produce assertions.

  was:
I was debugging an issue (missing tokens after analysis) and when I enabled 
Java assertions I uncovered a bug when using WordDelimiterGraphFilter + 
StopFilter + FlattenGraphFilter.

I could reproduce the issue in a small piece of code. This code gives an 
assertion failure when assertions are enabled (-ea java option):

{code:java}
    Builder builder = CustomAnalyzer.builder();
    builder.withTokenizer(StandardTokenizerFactory.class);
    builder.addTokenFilter(WordDelimiterGraphFilterFactory.class, 
"preserveOriginal", "1");
    builder.addTokenFilter(StopFilterFactory.class);
    builder.addTokenFilter(FlattenGraphFilterFactory.class);
    Analyzer analyzer = builder.build();}}
     
    TokenStream ts = analyzer.tokenStream("*", new StringReader("x7in"));
    ts.reset();
    while(ts.incrementToken())
        ;
{code}

This gives:

{code}
Exception in thread "main" java.lang.AssertionError: 2
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:195)
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:258)
     at com.wolfram.textsearch.AnalyzerError.main(AnalyzerError.java:32)
{code}

Maybe removing stop words after WordDelimiterGraphFilter is wrong, I don't 
know. However is the only way to process stop-words generated by that filter. 
In any case, it should not eat tokens or produce assertions. 


> Bad interaction bewteen WordDelimiterGraphFilter, StopFilter and 
> FlattenGraphFilter
> ---
>
> Key: LUCENE-8723
> URL: https://issues.apache.org/jira/browse/LUCENE-8723
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Affects Versions: 7.7.1
>Reporter: Nicolás Lichtmaier
>Priority: Major
>
> I was debugging an issue (missing tokens after analysis) and when I enabled 
> Java assertions I uncovered a bug when using WordDelimiterGraphFilter + 
> StopFilter + FlattenGraphFilter.
> I could reproduce the issue in a small piece of code. This code gives an 
> assertion failure when assertions are enabled (-ea java option):
> {code:java}
>     Builder builder = CustomAnalyzer.builder();
>     builder.withTokenizer(StandardTokenizerFactory.class);
>     builder.addTokenFilter(WordDelimiterGraphFilterFactory.class, 
> "preserveOriginal", "1");
>     builder.addTokenFilter(StopFilterFactory.class);
>     builder.addTokenFilter(FlattenGraphFilterFactory.class);
>     Analyzer analyzer = builder.build();
>      
>     TokenStream ts = analyzer.tokenStream("*", new StringReader("x7in"));
>     ts.reset();
>     while(ts.incrementToken())
>         ;
> {code}
> This gives:
> {code}
> Exception in thread "main" java.lang.AssertionError: 2
>      at 
> org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:195)
>      at 
> org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:258)
>      at com.wolfram.textsearch.AnalyzerError.main(AnalyzerError.java:32)
> {code}
> Maybe removing stop words after WordDelimiterGraphFilter is wrong, I don't 
> know. However is the only way to process stop-words generated by that filter. 
> In 

[jira] [Updated] (LUCENE-8723) Bad interaction bewteen WordDelimiterGraphFilter, StopFilter and FlattenGraphFilter

2019-03-12 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/LUCENE-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolás Lichtmaier updated LUCENE-8723:
---
Description: 
I was debugging an issue (missing tokens after analysis) and when I enabled 
Java assertions I uncovered a bug when using WordDelimiterGraphFilter + 
StopFilter + FlattenGraphFilter.

I could reproduce the issue in a small piece of code. This code gives an 
assertion failure when assertions are enabled (-ea java option):

{code:java}
    Builder builder = CustomAnalyzer.builder();
    builder.withTokenizer(StandardTokenizerFactory.class);
    builder.addTokenFilter(WordDelimiterGraphFilterFactory.class, "preserveOriginal", "1");
    builder.addTokenFilter(StopFilterFactory.class);
    builder.addTokenFilter(FlattenGraphFilterFactory.class);
    Analyzer analyzer = builder.build();
     
    TokenStream ts = analyzer.tokenStream("*", new StringReader("x7in"));
    ts.reset();
    while(ts.incrementToken())
        ;
{code}

This gives:

{code}
Exception in thread "main" java.lang.AssertionError: 2
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:195)
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:258)
     at com.wolfram.textsearch.AnalyzerError.main(AnalyzerError.java:32)
{code}

Maybe removing stop words after WordDelimiterGraphFilter is wrong, I don't 
know. However, it is the only way to process stop-words generated by that filter. 
In any case, it should not eat tokens or trip assertions. 

  was:
I was debugging an issue (missing tokens after analysis) and when I enabled 
Java assertions I uncovered a bug when using WordDelimiterGraphFilter + 
StopFilter + FlattenGraphFilter.

I could reproduce the issue in a small piece of code. This code gives an 
assertion failure when assertions are enabled (-ea java option):

{code:java}
    Builder builder = CustomAnalyzer.builder();
    builder.withTokenizer(StandardTokenizerFactory.class);
    builder.addTokenFilter(WordDelimiterGraphFilterFactory.class, 
"preserveOriginal", "1");
    builder.addTokenFilter(StopFilterFactory.class);
             
    builder.addTokenFilter(FlattenGraphFilterFactory.class);}}
    Analyzer analyzer = builder.build();}}
     
    TokenStream ts = analyzer.tokenStream("*", new StringReader("x7in"));
    ts.reset();
    while(ts.incrementToken())
        ;
{code}

This gives:

{code}
Exception in thread "main" java.lang.AssertionError: 2
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:195)
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:258)
     at com.wolfram.textsearch.AnalyzerError.main(AnalyzerError.java:32)
{code}

Maybe removing stop words after WordDelimiterGraphFilter is wrong, I don't 
know. However is the only way to process stop-words generated by that filter. 
In any case, it should not eat tokens or produce assertions. 


> Bad interaction between WordDelimiterGraphFilter, StopFilter and 
> FlattenGraphFilter
> ---
>
> Key: LUCENE-8723
> URL: https://issues.apache.org/jira/browse/LUCENE-8723
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Affects Versions: 7.7.1
>Reporter: Nicolás Lichtmaier
>Priority: Major
>
> I was debugging an issue (missing tokens after analysis) and when I enabled 
> Java assertions I uncovered a bug when using WordDelimiterGraphFilter + 
> StopFilter + FlattenGraphFilter.
> I could reproduce the issue in a small piece of code. This code gives an 
> assertion failure when assertions are enabled (-ea java option):
> {code:java}
>     Builder builder = CustomAnalyzer.builder();
>     builder.withTokenizer(StandardTokenizerFactory.class);
>     builder.addTokenFilter(WordDelimiterGraphFilterFactory.class, 
> "preserveOriginal", "1");
>     builder.addTokenFilter(StopFilterFactory.class);
>     builder.addTokenFilter(FlattenGraphFilterFactory.class);
>     Analyzer analyzer = builder.build();
>      
>     TokenStream ts = analyzer.tokenStream("*", new StringReader("x7in"));
>     ts.reset();
>     while(ts.incrementToken())
>         ;
> {code}
> This gives:
> {code}
> Exception in thread "main" java.lang.AssertionError: 2
>      at 
> org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:195)
>      at 
> org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:258)
>      at com.wolfram.textsearch.AnalyzerError.main(AnalyzerError.java:32)
> {code}
> Maybe removing stop words after WordDelimiterGraphFilter is wrong, I don't 
> know. However, it is the only way to process stop-words generated by 

[jira] [Commented] (LUCENE-8692) IndexWriter.getTragicException() may not reflect all corrupting exceptions (notably: NoSuchFileException)

2019-03-12 Thread Simon Willnauer (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790895#comment-16790895
 ] 

Simon Willnauer commented on LUCENE-8692:
-

> rollback gives you a way to close IndexWriter without doing a commit, which 
> seems useful.  If you removed that, what would users do instead?

Can't we extend close to close without a commit? I mean we can keep rollback 
but be more strict about exceptions during commit and friends?
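
As a rough sketch of the existing semantics under discussion (assuming {{dir}} 
is a Directory and {{analyzer}} an Analyzer that are already set up; 
{{rollback()}} and {{setCommitOnClose()}} are existing APIs, the stricter 
exception handling is not):

{code:java}
// Sketch only: the two existing ways to end an IndexWriter without committing.
IndexWriterConfig cfg = new IndexWriterConfig(analyzer);
cfg.setCommitOnClose(false);      // with this, close() no longer commits pending changes
IndexWriter writer = new IndexWriter(dir, cfg);
writer.addDocument(new Document());
writer.rollback();                // discards the uncommitted document and closes the writer
{code}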

> IndexWriter.getTragicException() may not reflect all corrupting exceptions 
> (notably: NoSuchFileException)
> -
>
> Key: LUCENE-8692
> URL: https://issues.apache.org/jira/browse/LUCENE-8692
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Major
> Attachments: LUCENE-8692.patch, LUCENE-8692.patch, LUCENE-8692.patch, 
> LUCENE-8692_test.patch
>
>
> Backstory...
> Solr has a "LeaderTragicEventTest" which uses MockDirectoryWrapper's 
> {{corruptFiles}} to introduce corruption into the "leader" node's index and 
> then assert that this solr node gives up its leadership of the shard and 
> another replica takes over.
> This can currently fail sporadically (but usually reproducibly - see 
> SOLR-13237) due to the leader not giving up its leadership even after the 
> corruption causes an update/commit to fail. Solr's leadership code makes this 
> decision after encountering an exception from the IndexWriter based on whether 
> {{IndexWriter.getTragicException()}} is (non-)null.
> 
> While investigating this, I created an isolated Lucene-Core equivalent test 
> that demonstrates the same basic situation:
>  * Gradually cause corruption on an index until (otherwise) valid execution 
> of IW.add() + IW.commit() calls throws an exception to the IW client.
>  * assert that if an exception is thrown to the IW client, 
> {{getTragicException()}} is now non-null.
> It's fairly easy to make my new test fail reproducibly – in every situation 
> I've seen the underlying exception is a {{NoSuchFileException}} (ie: the 
> randomly introduced corruption was to delete some file).
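
A minimal sketch of the assertion described above (assuming {{writer}} is an 
IndexWriter over a Directory that has had corruption introduced; this is an 
illustration, not the attached test):

{code:java}
// Sketch of the invariant being asserted: if IndexWriter surfaces an exception
// to the caller because of index corruption, getTragicException() should be set.
try {
  writer.addDocument(new Document());
  writer.commit();
} catch (Exception e) {
  assertNotNull("expected a tragic exception, got only: " + e,
      writer.getTragicException());
}
{code}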



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12891) Injection Dangers in Streaming Expressions

2019-03-12 Thread Christine Poerschke (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790884#comment-16790884
 ] 

Christine Poerschke commented on SOLR-12891:


Hi [~gus_heck], is the 
[https://github.com/apache/lucene-solr/blob/9edc557f4526ffbbf35daea06972eb2c595e692b/solr/solrj/src/java/org/apache/solr/client/solrj/io/stream/expr/InjectionDefense.java#L85]
 {{System.out.println}} potentially unintended? [~erickerickson] mentioned on 
the dev list about looking into some "bytes to stdout and stderr" limits being 
exceeded in tests and I speculatively searched my inbox for "System.out" and 
thus found that line 85.

> Injection Dangers in Streaming Expressions
> --
>
> Key: SOLR-12891
> URL: https://issues.apache.org/jira/browse/SOLR-12891
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Affects Versions: 7.5, 8.0
>Reporter: Gus Heck
>Priority: Minor
>  Labels: security
> Fix For: master (9.0), 8.1
>
> Attachments: SOLR-12891.patch, SOLR-12891.patch, SOLR-12891.patch, 
> SOLR-12891.patch, SOLR12819example.java
>
>
> I just spent some time fiddling with streaming expressions for fun, reading 
> Erick Erickson's blog 
> ([https://lucidworks.com/2017/12/06/streaming-expressions-in-solrj/)] and the 
> example given in the ref guide 
> ([https://lucene.apache.org/solr/guide/7_5/streaming-expressions.html#streaming-requests-and-responses)]
>  and it occurred to me that we are recommending string concatenation into an 
> expression language with the power to harm the server, or other network 
> services visible from the server. I'm starting this Jira as a security issue 
> to avoid creating a public impression of insecurity, feel free to undo that 
> if I have guessed wrong. I haven't developed an exploit example, but it would 
> go something like this:
>  # Some portion of an expression is built including user supplied data using 
> the techniques we're recommending in the ref guide
>  # Malicious user constructs input data that breaks out of the expression 
> (SOLR-10894 is relevant here), probably somewhere inside a let() expression 
> where one could simply define an additional variable taking the value of a 
> malicious expression...
>  # update() expression is executed to add/overwrite data, jdbc() makes a JDBC 
> connection to a database visible to the server, or the malicious expression 
> executes some very expensive expression for DOS effect.
> Technically this is of course the fault of the end user who allowed unchecked 
> input into programmatic execution, but when I think about how to check the 
> input I realize that the only way to be sure is to construct for myself a 
> notion of exactly how the parser behaves and then determine what needs to be 
> escaped. To do this I need to dig into the expression parser code...
> How to escape input is also already unclear as shown by SOLR-10894
> There's another important wrinkle that would easily be missed by someone 
> trying to construct their own escaping/protection system relating to 
> parameter substitution as discussed here: SOLR-8458 
> I think the solution to this is that SolrJ API should be enhanced to provide 
> an escaping utility at a minimum and possibly a "prepared expression" similar 
> to SQL prepared statements and call this issue to attention in the ref guide 
> once these tools are available... 
> Additionally, templating features might be a useful addition to help folks 
> manage large expressions and facilitate re-use of patterns... such templating 
> should also have this issue in mind when/if they are added.
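
A hypothetical sketch of the escaping/prepared-expression idea (the 
{{PreparedExpression}} class and its methods are invented for illustration; 
they are not an existing SolrJ API):

{code:java}
// Hypothetical illustration only; PreparedExpression/prepare/bind do NOT exist in SolrJ.
// Today's recommended pattern concatenates user input straight into the expression:
String risky = "search(books, q=\"title:" + userInput + "\", fl=\"id\", sort=\"id asc\")";

// A prepared-expression style API could bind values after parsing instead,
// so quoting/escaping is handled by the library rather than the caller:
PreparedExpression expr = PreparedExpression.prepare(
    "search(books, q=\"title:?\", fl=\"id\", sort=\"id asc\")");
expr.bind(1, userInput);
{code}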



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-8723) Bad interaction between WordDelimiterGraphFilter, StopFilter and FlattenGraphFilter

2019-03-12 Thread JIRA
Nicolás Lichtmaier created LUCENE-8723:
--

 Summary: Bad interaction between WordDelimiterGraphFilter, 
StopFilter and FlattenGraphFilter
 Key: LUCENE-8723
 URL: https://issues.apache.org/jira/browse/LUCENE-8723
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/analysis
Affects Versions: 7.7.1
Reporter: Nicolás Lichtmaier


I was debugging an issue (missing tokens after analysis) and when I enabled 
Java assertions I uncovered a bug when using WordDelimiterGraphFilter + 
StopFilter + FlattenGraphFilter.

I could reproduce the issue in a small piece of code. This code gives an 
assertion failure when assertions are enabled (-ea java option):

{code:java}
    Builder builder = CustomAnalyzer.builder();
    builder.withTokenizer(StandardTokenizerFactory.class);
    builder.addTokenFilter(WordDelimiterGraphFilterFactory.class, "preserveOriginal", "1");
    builder.addTokenFilter(StopFilterFactory.class);
    builder.addTokenFilter(FlattenGraphFilterFactory.class);
    Analyzer analyzer = builder.build();
     
    TokenStream ts = analyzer.tokenStream("*", new StringReader("x7in"));
    ts.reset();
    while(ts.incrementToken())
        ;
{code}

This gives:

{code}
Exception in thread "main" java.lang.AssertionError: 2
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:195)
     at 
org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:258)
     at com.wolfram.textsearch.AnalyzerError.main(AnalyzerError.java:32)
{code}

Maybe removing stop words after WordDelimiterGraphFilter is wrong, I don't 
know. However, it is the only way to process stop-words generated by that filter. 
In any case, it should not eat tokens or trip assertions. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8542) Provide the LeafSlice to CollectorManager.newCollector to save memory on small index slices

2019-03-12 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790882#comment-16790882
 ] 

Adrien Grand commented on LUCENE-8542:
--

Right, I get how it can help with small slices, but at the same time I see 
small slices as something that should be avoided in order to limit context 
switching, so I don't think we should design for small slices.

I think the JDK's PriorityQueue would be fine most of the time, except in the 
worst case that every new hit is competitive, which happens when more recently 
indexed documents are more likely to be competitive (e.g. when sorting by 
decreasing timestamp, which I see often), since a single reordering via 
updateTop would need to be replaced with one pop and one push. Could we make 
our own PriorityQueue growable instead? I'm not too concerned about performance 
given that only add() would be affected, not top() and updateTop(), which are 
the ones that matter for performance.
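
A minimal sketch of the growable add() being suggested, for a plain array-backed 
binary heap (illustrative only; not Lucene's org.apache.lucene.util.PriorityQueue):

{code:java}
// Illustrative growth strategy for add(); top()/updateTop() are untouched.
final class GrowableHeap {
  private long[] heap = new long[16];
  private int size;

  void add(long value) {
    if (size == heap.length) {
      // amortized growth, so add() stays cheap on average
      heap = java.util.Arrays.copyOf(heap, heap.length + (heap.length >> 1));
    }
    int i = size++;
    heap[i] = value;
    while (i > 0) {            // sift up to restore the min-heap property
      int parent = (i - 1) >>> 1;
      if (heap[parent] <= heap[i]) {
        break;
      }
      long tmp = heap[parent]; heap[parent] = heap[i]; heap[i] = tmp;
      i = parent;
    }
  }
}
{code}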

> Provide the LeafSlice to CollectorManager.newCollector to save memory on 
> small index slices
> ---
>
> Key: LUCENE-8542
> URL: https://issues.apache.org/jira/browse/LUCENE-8542
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Christoph Kaser
>Priority: Minor
> Attachments: LUCENE-8542.patch
>
>
> I have an index consisting of 44 million documents spread across 60 segments. 
> When I run a query against this index with a huge number of results requested 
> (e.g. 5 million), this query uses more than 5 GB of heap if the IndexSearcher 
> was configured to use an ExecutorService.
> (I know this kind of query is fairly unusual and it would be better to use 
> paging and searchAfter, but our architecture does not allow this at the 
> moment.)
> The reason for the huge memory requirement is that the search [will create a 
> TopScoreDocCollector for each 
> segment|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L404],
>  each one with numHits = 5 million. This is fine for the large segments, but 
> many of those segments are fairly small and only contain several thousand 
> documents. This wastes a huge amount of memory for queries with large values 
> of numHits on indices with many segments.
> Therefore, I propose to change the CollectorManager interface in the 
> following way:
>  * change the method newCollector to accept a parameter LeafSlice that can be 
> used to determine the total count of documents in the LeafSlice
>  * Maybe, in order to remain backwards compatible, it would be possible to 
> introduce this as a new method with a default implementation that calls the 
> old method - otherwise, it probably has to wait for Lucene 8?
>  * This can then be used to cap numHits for each TopScoreDocCollector to the 
> leafslice-size.
> If this is something that would make sense for you, I can try to provide a 
> patch.
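
A rough sketch of the proposal (the slice-aware overload and the interface name 
are hypothetical, the collector factory signature varies across Lucene 
versions, and {{slice.leaves}} is assumed to expose the slice's 
LeafReaderContexts):

{code:java}
// Hypothetical sketch of the proposed API change; not an existing interface.
public interface SliceAwareCollectorManager<C extends Collector, T>
    extends CollectorManager<C, T> {

  // Backwards-compatible default so existing implementations keep working.
  default C newCollector(IndexSearcher.LeafSlice slice) throws IOException {
    return newCollector();
  }
}

// Example use inside an implementation: cap numHits to the slice size.
int sliceDocCount = 0;
for (LeafReaderContext leaf : slice.leaves) {   // assumes the slice exposes its leaves
  sliceDocCount += leaf.reader().maxDoc();
}
int cappedNumHits = Math.min(numHits, Math.max(1, sliceDocCount));
TopScoreDocCollector collector = TopScoreDocCollector.create(cappedNumHits);
{code}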



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-03-12 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790861#comment-16790861
 ] 

Andrzej Bialecki  commented on SOLR-11127:
--

Updated patch, with a lot more internal error checking and additional unit 
tests. I think this is fairly complete in functionality, more documentation to 
follow soon.


> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: 8.0
>
> Attachments: SOLR-11127.patch, SOLR-11127.patch
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-03-12 Thread Andrzej Bialecki (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated SOLR-11127:
-
Attachment: SOLR-11127.patch

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: 8.0
>
> Attachments: SOLR-11127.patch, SOLR-11127.patch
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #606: LUCENE-8671: Allow more fine-grained control over off-heap term dictionaries

2019-03-12 Thread GitBox
dweiss commented on a change in pull request #606: LUCENE-8671: Allow more 
fine-grained control over off-heap term dictionaries
URL: https://github.com/apache/lucene-solr/pull/606#discussion_r264826646
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/store/ByteBuffersIndexInput.java
 ##
 @@ -193,5 +193,10 @@ private void ensureOpen() {
 if (in == null) {
   throw new AlreadyClosedException("Already closed.");
 }
-  }  
+  }
+
+  @Override
+  public boolean isMMapped() {
+return true;
 
 Review comment:
   Err... I didn't follow this change too closely. I thought the meaning of 
this is to distinguish between memory-mapped (off-heap) vs. heap-based storage?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS-EA] Lucene-Solr-8.x-Linux (64bit/jdk-13-ea+8) - Build # 254 - Unstable!

2019-03-12 Thread Erick Erickson
There are three of these or so lately, I’ll be looking at them today

> On Mar 12, 2019, at 9:36 AM, Policeman Jenkins Server  
> wrote:
> 
> Build: https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Linux/254/
> Java: 64bit/jdk-13-ea+8 -XX:-UseCompressedOops -XX:+UseSerialGC
> 
> 1 tests failed.
> FAILED:  
> junit.framework.TestSuite.org.apache.solr.request.RegexBytesRefFilterTest
> 
> Error Message:
> The test or suite printed 260049 bytes to stdout and stderr, even though the 
> limit was set to 8192 bytes. Increase the limit with @Limit, ignore it 
> completely with @SuppressSysoutChecks or run with -Dtests.verbose=true
> 
> Stack Trace:
> java.lang.AssertionError: The test or suite printed 260049 bytes to stdout 
> and stderr, even though the limit was set to 8192 bytes. Increase the limit 
> with @Limit, ignore it completely with @SuppressSysoutChecks or run with 
> -Dtests.verbose=true
>   at __randomizedtesting.SeedInfo.seed([3BC403A6CD8A518C]:0)
>   at 
> org.apache.lucene.util.TestRuleLimitSysouts.afterIfSuccessful(TestRuleLimitSysouts.java:282)
>   at 
> com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterIfSuccessful(TestRuleAdapter.java:36)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:37)
>   at 
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
>   at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>   at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>   at 
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
>   at java.base/java.lang.Thread.run(Thread.java:835)
> 
> 
> 
> 
> Build Log:
> [...truncated 13396 lines...]
>   [junit4] Suite: org.apache.solr.request.RegexBytesRefFilterTest
>   [junit4]   2> 505091 INFO  (READER17) [] o.a.s.c.S.Request 
> [collection1]  webapp=null path=null params={q=id:81=true=json} 
> hits=1 status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER6) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={q=id:81=true=json} hits=1 
> status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER12) [] o.a.s.c.S.Request 
> [collection1]  webapp=null path=null 
> params={q=id:107=true=json} hits=0 status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER3) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={wt=json=/get=81} status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER4) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={wt=json=/get=117} status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER17) [] o.a.s.c.S.Request 
> [collection1]  webapp=null path=null params={wt=json=/get=81} status=0 
> QTime=0
>   [junit4]   2> 505091 INFO  (READER3) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={wt=json=/get=28} status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER12) [] o.a.s.c.S.Request 
> [collection1]  webapp=null path=null params={wt=json=/get=81} status=0 
> QTime=0
>   [junit4]   2> 505091 INFO  (READER4) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={q=id:93=true=json} hits=0 
> status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER6) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={wt=json=/get=81} status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER17) [] o.a.s.c.S.Request 
> [collection1]  webapp=null path=null params={q=id:72=true=json} 
> hits=0 status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER3) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={wt=json=/get=108} status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER12) [] o.a.s.c.S.Request 
> [collection1]  webapp=null path=null params={wt=json=/get=114} 
> status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER6) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={wt=json=/get=1} status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER17) [] o.a.s.c.S.Request 
> [collection1]  webapp=null path=null params={wt=json=/get=47} status=0 
> QTime=0
>   [junit4]   2> 505091 INFO  (READER4) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={q=id:128=true=json} hits=0 
> status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER3) [] o.a.s.c.S.Request [collection1] 
>  webapp=null path=null params={q=id:81=true=json} hits=1 
> status=0 QTime=0
>   [junit4]   2> 505091 INFO  (READER12) [] o.a.s.c.S.Request 
> [collection1]  webapp=null path=null params={wt=json=/get=5} status=0 
> QTime=0
>   [junit4]   2> 505091 INFO  (READER6) [] o.a.s.c.S.Request [collection1] 
>  

[JENKINS] Lucene-Solr-NightlyTests-master - Build # 1790 - Still Unstable

2019-03-12 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1790/

1 tests failed.
FAILED:  
org.apache.solr.prometheus.exporter.SolrExporterIntegrationTest.jvmMetrics

Error Message:
expected:<4> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<4> but was:<0>
at 
__randomizedtesting.SeedInfo.seed([1FF65AEB8C24D41F:CC42670499CA188B]:0)
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.solr.prometheus.exporter.SolrExporterIntegrationTest.jvmMetrics(SolrExporterIntegrationTest.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)




Build Log:
[...truncated 24283 lines...]
   [junit4] Suite: 
org.apache.solr.prometheus.exporter.SolrExporterIntegrationTest
   [junit4]   2> ERROR StatusLogger No Log4j 2 configuration file found. Using 

[jira] [Commented] (SOLR-1690) JSONKeyValueTokenizerFactory -- JSON Tokenizer

2019-03-12 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790791#comment-16790791
 ] 

Erick Erickson commented on SOLR-1690:
--

There hasn't been any work done (patches added) for over 9 years, so there's no 
evidence of much interest. Someone would have to pick it up and start over, 
provide a patch etc. Without a compelling use-case on a recent version of Solr, 
there's unlikely to _be_ any progress.

> JSONKeyValueTokenizerFactory -- JSON Tokenizer
> --
>
> Key: SOLR-1690
> URL: https://issues.apache.org/jira/browse/SOLR-1690
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Reporter: Ryan McKinley
>Priority: Minor
> Attachments: SOLR-1690-JSONKeyValueTokenizerFactory.patch, 
> noggit-1.0-A1.jar
>
>
> Sometimes it is nice to group structured data into a single field.
> This (rough) patch, takes JSON input and indexes tokens based on the key 
> values pairs in the json.
> {code:xml|title=schema.xml}
> 
>  omitNorms="true">
>   
>  hierarchicalKey="false"/>
> 
> 
>   
>   
> 
> 
> 
>   
> 
> {code}
> Given text:
> {code}
>  { "hello": "world", "rank":5 }
> {code}
> indexed as two tokens:
> || term position |1 | 2 |
> || term text |hello:world | rank:5 |
> || term type |word |  word |
> || source start,end | 12,17   | 27,28 |
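
As a rough standalone illustration of the key:value tokens described above 
(this is not the attached tokenizer; it assumes the JSON has already been 
parsed into a flat Map):

{code:java}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class KeyValueTokens {
  // Turn {"hello":"world","rank":5} into ["hello:world", "rank:5"].
  static List<String> tokens(Map<String, Object> json) {
    return json.entrySet().stream()
        .map(e -> e.getKey() + ":" + e.getValue())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Map<String, Object> doc = new LinkedHashMap<>();
    doc.put("hello", "world");
    doc.put("rank", 5);
    System.out.println(tokens(doc)); // [hello:world, rank:5]
  }
}
{code}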



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: ISSUE:solrj “org.apache.solr.common.util.SimpleOrderedMap cannot be cast to java.util.Map” exception when using “/suggest” handler

2019-03-12 Thread Walter Underwood
I’ve responded on Stack Overflow, but questions should always go to 
solr-u...@lucene.apache.org, never to this list.

Also, a quick summary of the question in the email would make it more likely 
that you would get help.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 12, 2019, at 2:39 AM, praveenraj 4ever  
> wrote:
> 
> HI Team,
> Can you please look into this issue as raised in StackOverflow.
> 
> https://stackoverflow.com/questions/55115760/solrj-org-apache-solr-common-util-simpleorderedmap-cannot-be-cast-to-java-util
>  
> 
> 
> 
> Regards,
> Praveenraj D,
> 9566067066



[jira] [Commented] (SOLR-11921) cursorMark with elevateIds throws Exception

2019-03-12 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790779#comment-16790779
 ] 

Erick Erickson commented on SOLR-11921:
---

Hmmm, what's the use case? Should the same documents appear at the top of every 
page as you're paging?

CursorMark is intended to handle the "deep paging" issue where "start" is at 
least in the hundreds. If this combination is intended to keep the elevated 
documents at the top of successive pages while a user pages, cursorMark is 
overkill and a workaround would be to just use start without cursormark.

> cursorMark with elevateIds throws Exception
> ---
>
> Key: SOLR-11921
> URL: https://issues.apache.org/jira/browse/SOLR-11921
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SearchComponents - other
>Affects Versions: 7.2
>Reporter: Greg Roodt
>Priority: Critical
>
> When cursorMark pagination is used together with elevateIds, an exception is 
> thrown.
>  
> Steps to reproduce described below.
>  
> 1. Start solr with the `demo` core
> {{docker run --name solr_cursor_elevate -d -p 8983:8983 solr:7.2 solr-demo}}
>  
> 2. Add some test documents
> {{curl http://localhost:8983/solr/demo/update?commit=true -d '}}
> {{[}}
> {{ {"id" : "book1",}}
> {{ "title_t" : "book one"}}
> {{ },}}
> {{ {"id" : "book2",}}
> {{ "title_t" : "book two"}}
> {{ },}}
> {{ {"id" : "book3",}}
> {{ "title_t" : "book three"}}
> {{ }}}
> {{]'}}
>  
> 3. Execute a query with cursorMark and elevateIds
> curl 
> '[http://localhost:8983/solr/demo/elevate?cursorMark=*=book3=true=title_t=id,title_t=true=book=2=id%20asc'|http://localhost:8983/solr/demo/elevate?cursorMark=*=book3=true=title_t=id,title_t=true=book=2=id%20asc%27]
>  
> Observe the stacktrace:
> null:java.lang.ClassCastException: java.lang.Integer cannot be cast to 
> org.apache.lucene.util.BytesRef
>   at 
> org.apache.solr.schema.FieldType.marshalStringSortValue(FieldType.java:1127)
>   at org.apache.solr.schema.StrField.marshalSortValue(StrField.java:100)
>   at 
> org.apache.solr.search.CursorMark.getSerializedTotem(CursorMark.java:250)
>   at 
> org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1445)
>   at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:375)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at 

[jira] [Updated] (SOLR-11921) cursorMark with elevateIds throws Exception

2019-03-12 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-11921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-11921:
--
Priority: Minor  (was: Critical)

> cursorMark with elevateIds throws Exception
> ---
>
> Key: SOLR-11921
> URL: https://issues.apache.org/jira/browse/SOLR-11921
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SearchComponents - other
>Affects Versions: 7.2
>Reporter: Greg Roodt
>Priority: Minor
>
> When cursorMark pagination is used together with elevateIds, an exception is 
> thrown.
>  
> Steps to reproduce described below.
>  
> 1. Start solr with the `demo` core
> {{docker run --name solr_cursor_elevate -d -p 8983:8983 solr:7.2 solr-demo}}
>  
> 2. Add some test documents
> {{curl http://localhost:8983/solr/demo/update?commit=true -d '}}
> {{[}}
> {{ {"id" : "book1",}}
> {{ "title_t" : "book one"}}
> {{ },}}
> {{ {"id" : "book2",}}
> {{ "title_t" : "book two"}}
> {{ },}}
> {{ {"id" : "book3",}}
> {{ "title_t" : "book three"}}
> {{ }}}
> {{]'}}
>  
> 3. Execute a query with cursorMark and elevateIds
> curl 
> '[http://localhost:8983/solr/demo/elevate?cursorMark=*=book3=true=title_t=id,title_t=true=book=2=id%20asc'|http://localhost:8983/solr/demo/elevate?cursorMark=*=book3=true=title_t=id,title_t=true=book=2=id%20asc%27]
>  
> Observe the stacktrace:
> null:java.lang.ClassCastException: java.lang.Integer cannot be cast to 
> org.apache.lucene.util.BytesRef
>   at 
> org.apache.solr.schema.FieldType.marshalStringSortValue(FieldType.java:1127)
>   at org.apache.solr.schema.StrField.marshalSortValue(StrField.java:100)
>   at 
> org.apache.solr.search.CursorMark.getSerializedTotem(CursorMark.java:250)
>   at 
> org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1445)
>   at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:375)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> 

[jira] [Commented] (LUCENE-8721) LatLonShapePolygon and LineQuery fail on shared dateline queries

2019-03-12 Thread Nicholas Knize (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790781#comment-16790781
 ] 

Nicholas Knize commented on LUCENE-8721:


bq. Should we check whether there is an intersection between the range of 
latitudes of the query and of the box/triangle?

Yes. I posted the wrong patch. Corrected in the new one.

bq. the coordinates are on the encoded space.

The coordinates are in the decoded space but they have been quantized so I've 
corrected maxLon to use its quantized variant.

bq. +1 to simplify the logic on EdgeTree. It is trivial to merge the methods 
relate &  internalComponentRelate and it makes sense. I will open an issue.

+1 to simplify. I went ahead and started by merging the two methods in this 
patch since it made sense. The new issue can explore further refactoring the 
logic for clarity.
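
A minimal sketch of the latitude-range rejection mentioned above (the helper 
and variable names are assumptions for illustration, not the patch itself):

{code:java}
// Illustrative only: bail out early when the query's latitude range cannot
// intersect the box/triangle latitude range.
static boolean latitudesDisjoint(double queryMinLat, double queryMaxLat,
                                 double shapeMinLat, double shapeMaxLat) {
  return queryMaxLat < shapeMinLat || queryMinLat > shapeMaxLat;
}

// usage in a relate()-style check:
// if (latitudesDisjoint(qMinLat, qMaxLat, tMinLat, tMaxLat)) {
//   return Relation.CELL_OUTSIDE_QUERY;
// }
{code}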

> LatLonShapePolygon and LineQuery fail on shared dateline queries
> 
>
> Key: LUCENE-8721
> URL: https://issues.apache.org/jira/browse/LUCENE-8721
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Nicholas Knize
>Priority: Major
> Attachments: LUCENE-8721.patch, LUCENE-8721.patch
>
>
> Indexed shapes should be returned with search geometries that share the 
> dateline on the opposite hemisphere. 
> For example:
> {code:java}
>   public void testSharedDateline() throws Exception {
>     // index
> Directory dir = newDirectory();
> RandomIndexWriter w = new RandomIndexWriter(random(), dir);
> Document doc = new Document();
> // index western hemisphere geometry
> Polygon indexPoly = new Polygon(
> new double[] {-7.5d, 15d, 15d, 0d, -7.5d},
> new double[] {-180d, -180d, -176d, -176d, -180d}
> );
> Field[] fields = LatLonShape.createIndexableFields("test", indexPoly);
> for (Field f : fields) {
>   doc.add(f);
> }
> w.addDocument(doc);
> w.forceMerge(1);
>     // search
> IndexReader reader = w.getReader();
> w.close();
> IndexSearcher searcher = newSearcher(reader);
> // search w/ eastern hemisphere geometry that shares the dateline
> Polygon searchPoly = new Polygon(new double[] {-7.5d, 15d, 15d, 0d, 
> -7.5d},
> new double[] {180d, 180d, 170d, 170d, 180d});
> Query q = LatLonShape.newPolygonQuery("test", QueryRelation.INTERSECTS, 
> searchPoly);
> assertEquals(1, searcher.count(q));
> IOUtils.close(w, reader, dir);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11921) cursorMark with elevateIds throws Exception

2019-03-12 Thread Monica Marrero (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790749#comment-16790749
 ] 

Monica Marrero commented on SOLR-11921:
---

I have the same problem: when cursorMark is used together with elevation, an 
error is thrown if we want to force the elevated documents to appear first.

I reproduced the error with Solr 7.7 and the techproducts collection. This is 
what happens:

Request1: 
[http://localhost:8983/solr/techproducts/elevate?q=ipod=text=id,[elevated],score=2=id%20asc=*]

Result1: Works fine but the document is not boosted to the first results (we 
are ordering just by id)

 

Request2: 
[http://localhost:8983/solr/techproducts/elevate?q=ipod=text=id,[elevated],score=2&*sort=id%20asc=true*=*]

Result2: Error: java.lang.Integer cannot be cast to 
org.apache.lucene.util.BytesRef"

 

Request3: 
[http://localhost:8983/solr/techproducts/elevate?q=ipod=text=id,[elevated],score=*=2&*sort=score%20desc,%20id%20asc*]

Result3: Error: java.lang.Float cannot be cast to org.apache.lucene.util.BytesRef

> cursorMark with elevateIds throws Exception
> ---
>
> Key: SOLR-11921
> URL: https://issues.apache.org/jira/browse/SOLR-11921
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SearchComponents - other
>Affects Versions: 7.2
>Reporter: Greg Roodt
>Priority: Critical
>
> When cursorMark pagination is used together with elevateIds, an exception is 
> thrown.
>  
> Steps to reproduce described below.
>  
> 1. Start solr with the `demo` core
> {{docker run --name solr_cursor_elevate -d -p 8983:8983 solr:7.2 solr-demo}}
>  
> 2. Add some test documents
> {{curl http://localhost:8983/solr/demo/update?commit=true -d '}}
> {{[}}
> {{ {"id" : "book1",}}
> {{ "title_t" : "book one"}}
> {{ },}}
> {{ {"id" : "book2",}}
> {{ "title_t" : "book two"}}
> {{ },}}
> {{ {"id" : "book3",}}
> {{ "title_t" : "book three"}}
> {{ }}}
> {{]'}}
>  
> 3. Execute a query with cursorMark and elevateIds
> curl 
> '[http://localhost:8983/solr/demo/elevate?cursorMark=*=book3=true=title_t=id,title_t=true=book=2=id%20asc'|http://localhost:8983/solr/demo/elevate?cursorMark=*=book3=true=title_t=id,title_t=true=book=2=id%20asc%27]
>  
> Observe the stacktrace:
> null:java.lang.ClassCastException: java.lang.Integer cannot be cast to 
> org.apache.lucene.util.BytesRef
>   at 
> org.apache.solr.schema.FieldType.marshalStringSortValue(FieldType.java:1127)
>   at org.apache.solr.schema.StrField.marshalSortValue(StrField.java:100)
>   at 
> org.apache.solr.search.CursorMark.getSerializedTotem(CursorMark.java:250)
>   at 
> org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1445)
>   at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:375)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at 
> 

[JENKINS-EA] Lucene-Solr-8.x-Linux (64bit/jdk-13-ea+8) - Build # 254 - Unstable!

2019-03-12 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Linux/254/
Java: 64bit/jdk-13-ea+8 -XX:-UseCompressedOops -XX:+UseSerialGC

1 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.request.RegexBytesRefFilterTest

Error Message:
The test or suite printed 260049 bytes to stdout and stderr, even though the 
limit was set to 8192 bytes. Increase the limit with @Limit, ignore it 
completely with @SuppressSysoutChecks or run with -Dtests.verbose=true

Stack Trace:
java.lang.AssertionError: The test or suite printed 260049 bytes to stdout and 
stderr, even though the limit was set to 8192 bytes. Increase the limit with 
@Limit, ignore it completely with @SuppressSysoutChecks or run with 
-Dtests.verbose=true
at __randomizedtesting.SeedInfo.seed([3BC403A6CD8A518C]:0)
at 
org.apache.lucene.util.TestRuleLimitSysouts.afterIfSuccessful(TestRuleLimitSysouts.java:282)
at 
com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterIfSuccessful(TestRuleAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:37)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.base/java.lang.Thread.run(Thread.java:835)




Build Log:
[...truncated 13396 lines...]
   [junit4] Suite: org.apache.solr.request.RegexBytesRefFilterTest
   [junit4]   2> 505091 INFO  (READER17) [] o.a.s.c.S.Request [collection1] 
 webapp=null path=null params={q=id:81=true=json} hits=1 status=0 
QTime=0
   [junit4]   2> 505091 INFO  (READER6) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={q=id:81=true=json} hits=1 status=0 
QTime=0
   [junit4]   2> 505091 INFO  (READER12) [] o.a.s.c.S.Request [collection1] 
 webapp=null path=null params={q=id:107=true=json} hits=0 
status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER3) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={wt=json=/get=81} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER4) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={wt=json=/get=117} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER17) [] o.a.s.c.S.Request [collection1] 
 webapp=null path=null params={wt=json=/get=81} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER3) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={wt=json=/get=28} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER12) [] o.a.s.c.S.Request [collection1] 
 webapp=null path=null params={wt=json=/get=81} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER4) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={q=id:93=true=json} hits=0 status=0 
QTime=0
   [junit4]   2> 505091 INFO  (READER6) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={wt=json=/get=81} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER17) [] o.a.s.c.S.Request [collection1] 
 webapp=null path=null params={q=id:72=true=json} hits=0 status=0 
QTime=0
   [junit4]   2> 505091 INFO  (READER3) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={wt=json=/get=108} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER12) [] o.a.s.c.S.Request [collection1] 
 webapp=null path=null params={wt=json=/get=114} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER6) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={wt=json=/get=1} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER17) [] o.a.s.c.S.Request [collection1] 
 webapp=null path=null params={wt=json=/get=47} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER4) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={q=id:128=true=json} hits=0 status=0 
QTime=0
   [junit4]   2> 505091 INFO  (READER3) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={q=id:81=true=json} hits=1 status=0 
QTime=0
   [junit4]   2> 505091 INFO  (READER12) [] o.a.s.c.S.Request [collection1] 
 webapp=null path=null params={wt=json=/get=5} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER6) [] o.a.s.c.S.Request [collection1]  
webapp=null path=null params={q=id:45=true=json} hits=0 status=0 
QTime=0
   [junit4]   2> 505091 INFO  (READER17) [] o.a.s.c.S.Request [collection1] 
 webapp=null path=null params={wt=json=/get=0} status=0 QTime=0
   [junit4]   2> 505091 INFO  (READER4) [] o.a.s.c.S.Request 

[jira] [Resolved] (SOLR-13317) cursorMark elevation

2019-03-12 Thread Monica Marrero (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Monica Marrero resolved SOLR-13317.
---
Resolution: Invalid

> cursorMark elevation
> 
>
> Key: SOLR-13317
> URL: https://issues.apache.org/jira/browse/SOLR-13317
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Monica Marrero
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-13317) cursorMark elevation

2019-03-12 Thread Monica Marrero (JIRA)
Monica Marrero created SOLR-13317:
-

 Summary: cursorMark elevation
 Key: SOLR-13317
 URL: https://issues.apache.org/jira/browse/SOLR-13317
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Monica Marrero






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8718) Add docValueCount support for SortedSetDocValues

2019-03-12 Thread John Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790707#comment-16790707
 ] 

John Wang commented on LUCENE-8718:
---

[~jpountz] The convention has been to return -1 when the underlying codec does not 
support a feature; maybe we can use the same convention for this, e.g. we can 
create a default impl on SortedSetDocValues that returns -1.

Let me describe the motivation for this:

It is common to pick SortedSetDocValues over SortedDocValues to avoid issues 
when a docid has multiple values, and you don't want the indexing process to 
fail. However, most docs, or even all of the docs, often have only 1 value.

To get the corresponding value, even though all docs have only 1 value, you 
would still need to set up a loop, which is an overhead to pay for each 
iteration. This api opens the door to an optimization where you can 
special-case it, e.g. if (numVal == 1) {} else \{set up a loop}; the branch 
prediction here would be mostly correct, and therefore almost 
free.
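
For illustration, a minimal consumer sketch of that fast path (this assumes the 
proposed docValueCount() is available and returns -1 when the codec cannot supply 
a count; the field name and visit callback are hypothetical):

{code:java}
import java.io.IOException;
import java.util.function.Consumer;

import org.apache.lucene.index.DocValues;
import org.apache.lucene.index.LeafReader;
import org.apache.lucene.index.SortedSetDocValues;
import org.apache.lucene.util.BytesRef;

class SingleValueFastPath {
  // Reads all values of 'field' for one document, special-casing the single-value case.
  static void readValues(LeafReader reader, String field, int docId,
                         Consumer<BytesRef> visit) throws IOException {
    SortedSetDocValues dv = DocValues.getSortedSet(reader, field);
    if (!dv.advanceExact(docId)) {
      return; // document has no values for this field
    }
    int count = dv.docValueCount(); // proposed API; assume -1 means "not supported"
    if (count == 1) {
      // fast path: exactly one value, no loop bookkeeping
      visit.accept(dv.lookupOrd(dv.nextOrd()));
    } else {
      // general path: iterate all ordinals (also covers count == -1)
      long ord;
      while ((ord = dv.nextOrd()) != SortedSetDocValues.NO_MORE_ORDS) {
        visit.accept(dv.lookupOrd(ord));
      }
    }
  }
}
{code}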

 

> Add docValueCount support for SortedSetDocValues
> 
>
> Key: LUCENE-8718
> URL: https://issues.apache.org/jira/browse/LUCENE-8718
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 7.7.1
>Reporter: John Wang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> implement docValueCount method for SortedSetDocValues, see comment:
>  
> [https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/SortedSetDocValues.java#L54]
>  
> Patch/PR: https://github.com/apache/lucene-solr/pull/603



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-8720) Integer overflow bug in NameIntCacheLRU.makeRoomLRU()

2019-03-12 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-8720.

   Resolution: Fixed
Fix Version/s: (was: 7.1.1)
   8.1
   master (9.0)

> Integer overflow bug in NameIntCacheLRU.makeRoomLRU()
> -
>
> Key: LUCENE-8720
> URL: https://issues.apache.org/jira/browse/LUCENE-8720
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.7.1
> Environment: Mac OS X 10.11.6 but this bug is not affected by the 
> environment because it is a straightforward integer overflow bug.
>Reporter: Russell A Brown
>Priority: Major
>  Labels: easyfix, patch
> Fix For: master (9.0), 8.1
>
> Attachments: LUCENE-.patch
>
>
> The NameIntCacheLRU.makeRoomLRU() method has an integer overflow bug because 
> if maxCacheSize >= Integer.MAX_VALUE/2, 2*maxCacheSize will overflow to 
> -(2^30) and the value of n will overflow to a negative integer as well, which 
> will prevent any clearing of the cache whatsoever. Hence, performance will 
> degrade once the cache becomes full because it will be impossible to remove 
> any entries in order to add new entries to the cache.
> Moreover, comments in NameIntCacheLRU.java and LruTaxonomyWriterCache.java 
> indicate that 2/3 of the cache will be cleared, whereas in fact only 1/3 of 
> the cache is cleared. So as not to change the behavior of the 
> NameIntCacheLRU.makeRoomLRU() method, I have not changed the code to clear 
> 2/3 of the cache but instead I have changed the comments to indicate that 1/3 
> of the cache is cleared.
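
A minimal, self-contained illustration of the wrap-around described above (the 
variable names are made up; the actual makeRoomLRU() code is not reproduced here):

{code:java}
public class OverflowDemo {
  public static void main(String[] args) {
    int maxCacheSize = Integer.MAX_VALUE / 2 + 1; // 1_073_741_824
    int doubled = 2 * maxCacheSize;               // int arithmetic wraps around
    System.out.println(doubled);                  // -2147483648, so the derived limit goes negative
    long safe = 2L * maxCacheSize;                // widening to long before multiplying avoids the overflow
    System.out.println(safe);                     // 2147483648
  }
}
{code}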



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8720) Integer overflow bug in NameIntCacheLRU.makeRoomLRU()

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790706#comment-16790706
 ] 

ASF subversion and git services commented on LUCENE-8720:
-

Commit c9de94c66333fa3adfd0878ca6d38e05faff1738 in lucene-solr's branch 
refs/heads/branch_8x from Michael McCandless
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c9de94c ]

LUCENE-8720: fix int overflow in NameIntCacheLRU


> Integer overflow bug in NameIntCacheLRU.makeRoomLRU()
> -
>
> Key: LUCENE-8720
> URL: https://issues.apache.org/jira/browse/LUCENE-8720
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.7.1
> Environment: Mac OS X 10.11.6 but this bug is not affected by the 
> environment because it is a straightforward integer overflow bug.
>Reporter: Russell A Brown
>Priority: Major
>  Labels: easyfix, patch
> Fix For: 7.1.1
>
> Attachments: LUCENE-.patch
>
>
> The NameIntCacheLRU.makeRoomLRU() method has an integer overflow bug because 
> if maxCacheSize >= Integer.MAX_VALUE/2, 2*maxCacheSize will overflow to 
> -(2^30) and the value of n will overflow to a negative integer as well, which 
> will prevent any clearing of the cache whatsoever. Hence, performance will 
> degrade once the cache becomes full because it will be impossible to remove 
> any entries in order to add new entries to the cache.
> Moreover, comments in NameIntCacheLRU.java and LruTaxonomyWriterCache.java 
> indicate that 2/3 of the cache will be cleared, whereas in fact only 1/3 of 
> the cache is cleared. So as not to change the behavior of the 
> NameIntCacheLRU.makeRoomLRU() method, I have not changed the code to clear 
> 2/3 of the cache but instead I have changed the comments to indicate that 1/3 
> of the cache is cleared.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8720) Integer overflow bug in NameIntCacheLRU.makeRoomLRU()

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790705#comment-16790705
 ] 

ASF subversion and git services commented on LUCENE-8720:
-

Commit c1bea96cf9c6929be717306f04b9b467e58de68d in lucene-solr's branch 
refs/heads/master from Michael McCandless
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c1bea96 ]

LUCENE-8720: fix int overflow in NameIntCacheLRU


> Integer overflow bug in NameIntCacheLRU.makeRoomLRU()
> -
>
> Key: LUCENE-8720
> URL: https://issues.apache.org/jira/browse/LUCENE-8720
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.7.1
> Environment: Mac OS X 10.11.6 but this bug is not affected by the 
> environment because it is a straightforward integer overflow bug.
>Reporter: Russell A Brown
>Priority: Major
>  Labels: easyfix, patch
> Fix For: 7.1.1
>
> Attachments: LUCENE-.patch
>
>
> The NameIntCacheLRU.makeRoomLRU() method has an integer overflow bug because 
> if maxCacheSize >= Integer.MAX_VALUE/2, 2*maxCacheSize will overflow to 
> -(2^30) and the value of n will overflow to a negative integer as well, which 
> will prevent any clearing of the cache whatsoever. Hence, performance will 
> degrade once the cache becomes full because it will be impossible to remove 
> any entries in order to add new entries to the cache.
> Moreover, comments in NameIntCacheLRU.java and LruTaxonomyWriterCache.java 
> indicate that 2/3 of the cache will be cleared, whereas in fact only 1/3 of 
> the cache is cleared. So as not to change the behavior of the 
> NameIntCacheLRU.makeRoomLRU() method, I have not changed the code to clear 
> 2/3 of the cache but instead I have changed the comments to indicate that 1/3 
> of the cache is cleared.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Amazon/Netflix/Expedia and "Open Distro for Elasticsearch"

2019-03-12 Thread Michael McCandless
Elastic's response:
https://www.elastic.co/blog/on-open-distros-open-source-and-building-a-company

Mike McCandless

http://blog.mikemccandless.com


On Mon, Mar 11, 2019 at 2:15 PM Pedram Rezaei 
wrote:

>
> https://aws.amazon.com/blogs/opensource/keeping-open-source-open-open-distro-for-elasticsearch/
>
>
>
> Very interesting!
>


[jira] [Commented] (LUCENE-8717) Handle stop words that appear at articulation points

2019-03-12 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790698#comment-16790698
 ] 

Michael McCandless commented on LUCENE-8717:


+1 for {{TermDeletedAttribute}}.

Are we also fixing {{StopFilter}} to set {{TermDeletedAttribute}}?  Would this 
mean that a {{SynonymFilter}} trying to match a synonym containing a stop word 
would now match even when {{StopFilter}} before it marked the token deleted?

> Handle stop words that appear at articulation points
> 
>
> Key: LUCENE-8717
> URL: https://issues.apache.org/jira/browse/LUCENE-8717
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Major
> Attachments: LUCENE-8717.patch, LUCENE-8717.patch
>
>
> Our set of TokenFilters currently cannot handle the case where a multi-term 
> synonym starts with a stopword.  This means that given a synonym file 
> containing the mapping "the walking dead => twd" and a standard english 
> stopword filter, QueryBuilder will produce incorrect queries.
> The tricky part here is that our standard way of dealing with stopwords, 
> which is to just remove them entirely from the token stream and use a larger 
> position increment on subsequent tokens, doesn't work when the removed token 
> also has a position length greater than 1.  There are various tricks you can 
> do to increment position length on the previous token, but this doesn't work 
> if the stopword is the first token in the token stream, or if there are 
> multiple stopwords in the side path.
> Instead, I'd like to propose adding a new TermDeletedAttribute, which we only 
> use on tokens that should be removed from the stream but which hold necessary 
> information about the structure of the token graph.  These tokens can then be 
> removed by GraphTokenStreamFiniteStrings at query time, and by 
> FlattenGraphFilter at index time.
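
For reference, a rough sketch of what such an attribute could look like using 
Lucene's attribute machinery (names and semantics follow the proposal above and 
are not committed API):

{code:java}
import org.apache.lucene.util.Attribute;
import org.apache.lucene.util.AttributeImpl;
import org.apache.lucene.util.AttributeReflector;

// Marker for tokens that are logically deleted but kept for graph structure.
interface TermDeletedAttribute extends Attribute {
  void setDeleted(boolean deleted);
  boolean isDeleted();
}

final class TermDeletedAttributeImpl extends AttributeImpl implements TermDeletedAttribute {
  private boolean deleted;

  @Override public void setDeleted(boolean deleted) { this.deleted = deleted; }
  @Override public boolean isDeleted() { return deleted; }
  @Override public void clear() { deleted = false; }

  @Override public void reflectWith(AttributeReflector reflector) {
    reflector.reflect(TermDeletedAttribute.class, "deleted", deleted);
  }

  @Override public void copyTo(AttributeImpl target) {
    ((TermDeletedAttribute) target).setDeleted(deleted);
  }
}
{code}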



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8692) IndexWriter.getTragicException() may not reflect all corrupting exceptions (notably: NoSuchFileException)

2019-03-12 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790690#comment-16790690
 ] 

Michael McCandless commented on LUCENE-8692:


{{rollback}} gives you a way to close {{IndexWriter}} without doing a commit, 
which seems useful.  If you removed that, what would users do instead?

> IndexWriter.getTragicException() may not reflect all corrupting exceptions 
> (notably: NoSuchFileException)
> -
>
> Key: LUCENE-8692
> URL: https://issues.apache.org/jira/browse/LUCENE-8692
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Major
> Attachments: LUCENE-8692.patch, LUCENE-8692.patch, LUCENE-8692.patch, 
> LUCENE-8692_test.patch
>
>
> Backstory...
> Solr has a "LeaderTragicEventTest" which uses MockDirectoryWrapper's 
> {{corruptFiles}} to introduce corruption into the "leader" node's index and 
> then assert that this solr node gives up its leadership of the shard and 
> another replica takes over.
> This can currently fail sporadically (but usually reproducibly - see 
> SOLR-13237) due to the leader not giving up its leadership even after the 
> corruption causes an update/commit to fail. Solr's leadership code makes this 
> decision after encountering an exception from the IndexWriter based on whether 
> {{IndexWriter.getTragicException()}} is (non-)null.
> 
> While investigating this, I created an isolated Lucene-Core equivalent test 
> that demonstrates the same basic situation:
>  * Gradually cause corruption on an index until (otherwise) valid execution 
> of IW.add() + IW.commit() calls throw an exception to the IW client.
>  * assert that if an exception is thrown to the IW client, 
> {{getTragicException()}} is now non-null.
> It's fairly easy to make my new test fail reproducibly – in every situation 
> I've seen the underlying exception is a {{NoSuchFileException}} (ie: the 
> randomly introduced corruption was to delete some file).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-12891) Injection Dangers in Streaming Expressions

2019-03-12 Thread Gus Heck (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gus Heck resolved SOLR-12891.
-
   Resolution: Fixed
 Assignee: (was: Gus Heck)
Fix Version/s: 8.1
   master (9.0)

Finally got back to this. Tweaked the final patch slightly so it passes a couple 
of unit tests in core, and fixed CHANGES.txt.

> Injection Dangers in Streaming Expressions
> --
>
> Key: SOLR-12891
> URL: https://issues.apache.org/jira/browse/SOLR-12891
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Affects Versions: 7.5, 8.0
>Reporter: Gus Heck
>Priority: Minor
>  Labels: security
> Fix For: master (9.0), 8.1
>
> Attachments: SOLR-12891.patch, SOLR-12891.patch, SOLR-12891.patch, 
> SOLR-12891.patch, SOLR12819example.java
>
>
> I just spent some time fiddling with streaming expressions for fun, reading 
> Erick Erickson's blog 
> ([https://lucidworks.com/2017/12/06/streaming-expressions-in-solrj/)] and the 
> example given in the ref guide 
> ([https://lucene.apache.org/solr/guide/7_5/streaming-expressions.html#streaming-requests-and-responses)]
>  and it occurred to me that we are recommending string concatenation into an 
> expression language with the power to harm the server, or other network 
> services visible from the server. I'm starting this Jira as a security issue 
> to avoid creating a public impression of insecurity, feel free to undo that 
> if I have guessed wrong. I haven't developed an exploit example, but it would 
> go something like this:
>  # Some portion of an expression is built including user supplied data using 
> the techniques we're recommending in the ref guide
>  # Malicious user constructs input data that breaks out of the expression 
> (SOLR-10894 is relevant here), probably somewhere inside a let() expression 
> where one could simply define an additional variable taking the value of a 
> malicious expression...
>  # update() expression is executed to add/overwrite data, jdbc() makes a JDBC 
> connection to a database visible to the server, or the malicious expression 
> executes some very expensive expression for DOS effect.
> Technically this is of course the fault of the end user who allowed unchecked 
> input into programmatic execution, but when I think about how to check the 
> input I realize that the only way to be sure is to construct for myself a 
> notion of exactly how the parser behaves and then determine what needs to be 
> escaped. To do this I need to dig into the expression parser code...
> How to escape input is also already unclear as shown by SOLR-10894
> There's another important wrinkle that would easily be missed by someone 
> trying to construct their own escaping/protection system relating to 
> parameter substitution as discussed here: SOLR-8458 
> I think the solution to this is that SolrJ API should be enhanced to provide 
> an escaping utility at a minimum and possibly a "prepared expression" similar 
> to SQL prepared statements and call this issue to attention in the ref guide 
> once these tools are available... 
> Additionally, templating features might be a useful addition to help folks 
> manage large expressions and facilitate re-use of patterns... such templating 
> should also have this issue in mind when/if they are added.
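
A minimal sketch of the concatenation pattern the issue warns about (the collection, 
field, and input are made up, and no Solr request is actually sent):

{code:java}
public class StreamingInjectionSketch {
  public static void main(String[] args) {
    // Attacker-controlled value standing in for untrusted user input.
    String userInput = "x\"), update(myCollection, ...";
    // Naive concatenation straight into a streaming expression string.
    String expr = "search(myCollection, q=\"author_s:" + userInput + "\", fl=\"id\", sort=\"id asc\")";
    System.out.println(expr);
    // The printed expression is no longer a single search(); the input broke out of
    // the q parameter, which is exactly the injection risk described above.
  }
}
{code}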



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12891) Injection Dangers in Streaming Expressions

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790685#comment-16790685
 ] 

ASF subversion and git services commented on SOLR-12891:


Commit 470813143d3b3a31232de2788a987b01a742c67e in lucene-solr's branch 
refs/heads/branch_8x from Gus Heck
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4708131 ]

SOLR-12891 MacroExpander will no longer expand URL parameters by
default inside of the 'expr' parameter, add InjectionDefense class
for safer handling of untrusted data in streaming expressions and add
-DStreamingExpressionMacros system property to revert to legacy behavior

(cherry picked from commit 9edc557f4526ffbbf35daea06972eb2c595e692b)


> Injection Dangers in Streaming Expressions
> --
>
> Key: SOLR-12891
> URL: https://issues.apache.org/jira/browse/SOLR-12891
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Affects Versions: 7.5, 8.0
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Minor
>  Labels: security
> Attachments: SOLR-12891.patch, SOLR-12891.patch, SOLR-12891.patch, 
> SOLR-12891.patch, SOLR12819example.java
>
>
> I just spent some time fiddling with streaming expressions for fun, reading 
> Erick Erickson's blog 
> ([https://lucidworks.com/2017/12/06/streaming-expressions-in-solrj/)] and the 
> example given in the ref guide 
> ([https://lucene.apache.org/solr/guide/7_5/streaming-expressions.html#streaming-requests-and-responses)]
>  and it occurred to me that we are recommending string concatenation into an 
> expression language with the power to harm the server, or other network 
> services visible from the server. I'm starting this Jira as a security issue 
> to avoid creating a public impression of insecurity, feel free to undo that 
> if I have guessed wrong. I haven't developed an exploit example, but it would 
> go something like this:
>  # Some portion of an expression is built including user supplied data using 
> the techniques we're recommending in the ref guide
>  # Malicious user constructs input data that breaks out of the expression 
> (SOLR-10894 is relevant here), probably somewhere inside a let() expression 
> where one could simply define an additional variable taking the value of a 
> malicious expression...
>  # update() expression is executed to add/overwrite data, jdbc() makes a JDBC 
> connection to a database visible to the server, or the malicious expression 
> executes some very expensive expression for DOS effect.
> Technically this is of course the fault of the end user who allowed unchecked 
> input into programmatic execution, but when I think about how to check the 
> input I realize that the only way to be sure is to construct for myself a 
> notion of exactly how the parser behaves and then determine what needs to be 
> escaped. To do this I need to dig into the expression parser code...
> How to escape input is also already unclear as shown by SOLR-10894
> There's another important wrinkle that would easily be missed by someone 
> trying to construct their own escaping/protection system relating to 
> parameter substitution as discussed here: SOLR-8458 
> I think the solution to this is that SolrJ API should be enhanced to provide 
> an escaping utility at a minimum and possibly a "prepared expression" similar 
> to SQL prepared statements and call this issue to attention in the ref guide 
> once these tools are available... 
> Additionally, templating features might be a useful addition to help folks 
> manage large expressions and facilitate re-use of patterns... such templating 
> should also have this issue in mind when/if they are added.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8720) Integer overflow bug in NameIntCacheLRU.makeRoomLRU()

2019-03-12 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790681#comment-16790681
 ] 

Michael McCandless commented on LUCENE-8720:


Thanks [~kirigirisu], nice catch – I'll pass tests and push soon.

> Integer overflow bug in NameIntCacheLRU.makeRoomLRU()
> -
>
> Key: LUCENE-8720
> URL: https://issues.apache.org/jira/browse/LUCENE-8720
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.7.1
> Environment: Mac OS X 10.11.6 but this bug is not affected by the 
> environment because it is a straightforward integer overflow bug.
>Reporter: Russell A Brown
>Priority: Major
>  Labels: easyfix, patch
> Fix For: 7.1.1
>
> Attachments: LUCENE-.patch
>
>
> The NameIntCacheLRU.makeRoomLRU() method has an integer overflow bug because 
> if maxCacheSize >= Integer.MAX_VALUE/2, 2*maxCacheSize will overflow to 
> -(2^30) and the value of n will overflow to a negative integer as well, which 
> will prevent any clearing of the cache whatsoever. Hence, performance will 
> degrade once the cache becomes full because it will be impossible to remove 
> any entries in order to add new entries to the cache.
> Moreover, comments in NameIntCacheLRU.java and LruTaxonomyWriterCache.java 
> indicate that 2/3 of the cache will be cleared, whereas in fact only 1/3 of 
> the cache is cleared. So as not to change the behavior of the 
> NameIntCacheLRU.makeRoomLRU() method, I have not changed the code to clear 
> 2/3 of the cache but instead I have changed the comments to indicate that 1/3 
> of the cache is cleared.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8542) Provide the LeafSlice to CollectorManager.newCollector to save memory on small index slices

2019-03-12 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790677#comment-16790677
 ] 

Michael McCandless commented on LUCENE-8542:


Maybe we should try swapping in the JDK's {{PriorityQueue}} and measure if this 
really hurts search throughput?

> Provide the LeafSlice to CollectorManager.newCollector to save memory on 
> small index slices
> ---
>
> Key: LUCENE-8542
> URL: https://issues.apache.org/jira/browse/LUCENE-8542
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Christoph Kaser
>Priority: Minor
> Attachments: LUCENE-8542.patch
>
>
> I have an index consisting of 44 million documents spread across 60 segments. 
> When I run a query against this index with a huge number of results requested 
> (e.g. 5 million), this query uses more than 5 GB of heap if the IndexSearch 
> was configured to use an ExecutorService.
> (I know this kind of query is fairly unusual and it would be better to use 
> paging and searchAfter, but our architecture does not allow this at the 
> moment.)
> The reason for the huge memory requirement is that the search [will create a 
> TopScoreDocCollector for each 
> segment|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L404],
>  each one with numHits = 5 million. This is fine for the large segments, but 
> many of those segments are fairly small and only contain several thousand 
> documents. This wastes a huge amount of memory for queries with large values 
> of numHits on indices with many segments.
> Therefore, I propose to change the CollectorManager - interface in the 
> following way:
>  * change the method newCollector to accept a parameter LeafSlice that can be 
> used to determine the total count of documents in the LeafSlice
>  * Maybe, in order to remain backwards compatible, it would be possible to 
> introduce this as a new method with a default implementation that calls the 
> old method - otherwise, it probably has to wait for Lucene 8?
>  * This can then be used to cap numHits for each TopScoreDocCollector to the 
> leafslice-size.
> If this is something that would make sense for you, I can try to provide a 
> patch.
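
For illustration, a rough sketch of the proposed slice-aware factory (the class is 
hypothetical and newCollector(LeafSlice) is the suggested API change, not something 
that exists today):

{code:java}
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.IndexSearcher.LeafSlice;
import org.apache.lucene.search.TopScoreDocCollector;

class CappedTopDocsFactory {
  private final int numHits;

  CappedTopDocsFactory(int numHits) { this.numHits = numHits; }

  // Proposed: the manager is told which slice the new collector will see.
  TopScoreDocCollector newCollector(LeafSlice slice) {
    int sliceDocs = 0;
    for (LeafReaderContext ctx : slice.leaves) {
      sliceDocs += ctx.reader().maxDoc();
    }
    // Cap the per-slice priority queue at the number of docs the slice can possibly produce.
    int cappedHits = Math.max(1, Math.min(numHits, sliceDocs));
    return TopScoreDocCollector.create(cappedHits); // 7.x signature; later versions also take a totalHitsThreshold
  }
}
{code}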



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] s1monw commented on a change in pull request #606: LUCENE-8671: Allow more fine-grained control over off-heap term dictionaries

2019-03-12 Thread GitBox
s1monw commented on a change in pull request #606: LUCENE-8671: Allow more 
fine-grained control over off-heap term dictionaries
URL: https://github.com/apache/lucene-solr/pull/606#discussion_r264726504
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/store/ByteBuffersIndexInput.java
 ##
 @@ -193,5 +193,10 @@ private void ensureOpen() {
 if (in == null) {
   throw new AlreadyClosedException("Already closed.");
 }
-  }  
+  }
+
+  @Override
+  public boolean isMMapped() {
+return true;
 
 Review comment:
   it's either mmapped or in RAM, so `true` is fine?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #606: LUCENE-8671: Allow more fine-grained control over off-heap term dictionaries

2019-03-12 Thread GitBox
dweiss commented on a change in pull request #606: LUCENE-8671: Allow more 
fine-grained control over off-heap term dictionaries
URL: https://github.com/apache/lucene-solr/pull/606#discussion_r264720893
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/store/ByteBuffersIndexInput.java
 ##
 @@ -193,5 +193,10 @@ private void ensureOpen() {
 if (in == null) {
   throw new AlreadyClosedException("Already closed.");
 }
-  }  
+  }
+
+  @Override
+  public boolean isMMapped() {
+return true;
 
 Review comment:
   Hi Simon. Whether this should return true depends on what byte buffers are 
used? The same applies to ByteBufferIndexInput, actually... I don't think you 
can generally tell whether the ByteBuffers the input operates on come from a 
mmap call or from somewhere else (even direct buffers don't have to be a result 
of mmap).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] s1monw commented on issue #601: Adding reader settings for moving fst offheap

2019-03-12 Thread GitBox
s1monw commented on issue #601: Adding reader settings for moving fst offheap
URL: https://github.com/apache/lucene-solr/pull/601#issuecomment-472032681
 
 
   I tried a different way today and I wonder what you think of #606 (note 
I am currently getting a 404 on it - seems like a GitHub issue)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12891) Injection Dangers in Streaming Expressions

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790623#comment-16790623
 ] 

ASF subversion and git services commented on SOLR-12891:


Commit 9edc557f4526ffbbf35daea06972eb2c595e692b in lucene-solr's branch 
refs/heads/master from Gus Heck
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9edc557 ]

SOLR-12891 MacroExpander will no longer expand URL parameters by
default inside of the 'expr' parameter, add InjectionDefense class
for safer handling of untrusted data in streaming expressions and add
-DStreamingExpressionMacros system property to revert to legacy behavior


> Injection Dangers in Streaming Expressions
> --
>
> Key: SOLR-12891
> URL: https://issues.apache.org/jira/browse/SOLR-12891
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Affects Versions: 7.5, 8.0
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Minor
>  Labels: security
> Attachments: SOLR-12891.patch, SOLR-12891.patch, SOLR-12891.patch, 
> SOLR-12891.patch, SOLR12819example.java
>
>
> I just spent some time fiddling with streaming expressions for fun, reading 
> Erick Erickson's blog 
> ([https://lucidworks.com/2017/12/06/streaming-expressions-in-solrj/)] and the 
> example given in the ref guide 
> ([https://lucene.apache.org/solr/guide/7_5/streaming-expressions.html#streaming-requests-and-responses)]
>  and it occurred to me that we are recommending string concatenation into an 
> expression language with the power to harm the server, or other network 
> services visible from the server. I'm starting this Jira as a security issue 
> to avoid creating a public impression of insecurity, feel free to undo that 
> if I have guessed wrong. I haven't developed an exploit example, but it would 
> go something like this:
>  # Some portion of an expression is built including user supplied data using 
> the techniques we're recommending in the ref guide
>  # Malicious user constructs input data that breaks out of the expression 
> (SOLR-10894 is relevant here), probably somewhere inside a let() expression 
> where one could simply define an additional variable taking the value of a 
> malicious expression...
>  # update() expression is executed to add/overwrite data, jdbc() makes a JDBC 
> connection to a database visible to the server, or the malicious expression 
> executes some very expensive expression for DOS effect.
> Technically this is of course the fault of the end user who allowed unchecked 
> input into programmatic execution, but when I think about how to check the 
> input I realize that the only way to be sure is to construct for myself a 
> notion of exactly how the parser behaves and then determine what needs to be 
> escaped. To do this I need to dig into the expression parser code...
> How to escape input is also already unclear as shown by SOLR-10894
> There's another important wrinkle that would easily be missed by someone 
> trying to construct their own escaping/protection system relating to 
> parameter substitution as discussed here: SOLR-8458 
> I think the solution to this is that SolrJ API should be enhanced to provide 
> an escaping utility at a minimum and possibly a "prepared expression" similar 
> to SQL prepared statements and call this issue to attention in the ref guide 
> once these tools are available... 
> Additionally, templating features might be a useful addition to help folks 
> manage large expressions and facilitate re-use of patterns... such templating 
> should also have this issue in mind when/if they are added.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] s1monw opened a new pull request #606: LUCENE-8671: Allow more fine-grained control over off-heap term dictionaries

2019-03-12 Thread GitBox
s1monw opened a new pull request #606: LUCENE-8671: Allow more fine-grained 
control over off-heap term dictionaries
URL: https://github.com/apache/lucene-solr/pull/606
 
 
   This change allows controlling whether term dictionaries are loaded off heap or on
   heap on a per-reader basis. Non-NRT readers will access all term dictionaries
   off heap, including ID fields, while readers that require fast ID access, like all
   readers used within IndexWriter, will by default only load non-ID-like fields off
   heap. Additionally, IOContext now has the ability to specify whether RAM usage
   should be minimized, and can control the off- vs. on-heap decision on a per-reader
   basis.
   
   Off-heap term dictionaries are still only used if the index input is memory 
mapped.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1690) JSONKeyValueTokenizerFactory -- JSON Tokenizer

2019-03-12 Thread Anatoly Konstantinov (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790613#comment-16790613
 ] 

Anatoly Konstantinov commented on SOLR-1690:


+ 1. Guys, is there progress on committing this ticket into the repo?

> JSONKeyValueTokenizerFactory -- JSON Tokenizer
> --
>
> Key: SOLR-1690
> URL: https://issues.apache.org/jira/browse/SOLR-1690
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Reporter: Ryan McKinley
>Priority: Minor
> Attachments: SOLR-1690-JSONKeyValueTokenizerFactory.patch, 
> noggit-1.0-A1.jar
>
>
> Sometimes it is nice to group structured data into a single field.
> This (rough) patch, takes JSON input and indexes tokens based on the key 
> values pairs in the json.
> {code:xml|title=schema.xml}
> 
>  omitNorms="true">
>   
>  hierarchicalKey="false"/>
> 
> 
>   
>   
> 
> 
> 
>   
> 
> {code}
> Given text:
> {code}
>  { "hello": "world", "rank":5 }
> {code}
> indexed as two tokens:
> || term position |1 | 2 |
> || term text |hello:world | rank:5 |
> || term type |word |  word |
> || source start,end | 12,17   | 27,28 |



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-1690) JSONKeyValueTokenizerFactory -- JSON Tokenizer

2019-03-12 Thread Anatoly Konstantinov (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790613#comment-16790613
 ] 

Anatoly Konstantinov edited comment on SOLR-1690 at 3/12/19 2:38 PM:
-

+ 1. Guys, is there any progress on committing this ticket into the repo?


was (Author: anatoliy4041):
+ 1. Guys, is there progress on committing this ticket into the repo?

> JSONKeyValueTokenizerFactory -- JSON Tokenizer
> --
>
> Key: SOLR-1690
> URL: https://issues.apache.org/jira/browse/SOLR-1690
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Reporter: Ryan McKinley
>Priority: Minor
> Attachments: SOLR-1690-JSONKeyValueTokenizerFactory.patch, 
> noggit-1.0-A1.jar
>
>
> Sometimes it is nice to group structured data into a single field.
> This (rough) patch, takes JSON input and indexes tokens based on the key 
> values pairs in the json.
> {code:xml|title=schema.xml}
> 
>  omitNorms="true">
>   
>  hierarchicalKey="false"/>
> 
> 
>   
>   
> 
> 
> 
>   
> 
> {code}
> Given text:
> {code}
>  { "hello": "world", "rank":5 }
> {code}
> indexed as two tokens:
> || term position |1 | 2 |
> || term text |hello:world | rank:5 |
> || term type |word |  word |
> || source start,end | 12,17   | 27,28 |



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8542) Provide the LeafSlice to CollectorManager.newCollector to save memory on small index slices

2019-03-12 Thread Christoph Kaser (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790611#comment-16790611
 ] 

Christoph Kaser commented on LUCENE-8542:
-

While it's true the slice size is a bad upper bound, the change does help: As 
you can see in the [table in my 
comment|https://issues.apache.org/jira/browse/LUCENE-8542?focusedCommentId=16704391=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16704391],
 it reduces the heap requirement by 90% in my use case, due to the large number 
of small slices.

Making PriorityQueue growable would certainly be a better solution; however, it 
is much harder to do that without affecting performance in the "sane" use case.

> Provide the LeafSlice to CollectorManager.newCollector to save memory on 
> small index slices
> ---
>
> Key: LUCENE-8542
> URL: https://issues.apache.org/jira/browse/LUCENE-8542
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Christoph Kaser
>Priority: Minor
> Attachments: LUCENE-8542.patch
>
>
> I have an index consisting of 44 million documents spread across 60 segments. 
> When I run a query against this index with a huge number of results requested 
> (e.g. 5 million), this query uses more than 5 GB of heap if the IndexSearch 
> was configured to use an ExecutorService.
> (I know this kind of query is fairly unusual and it would be better to use 
> paging and searchAfter, but our architecture does not allow this at the 
> moment.)
> The reason for the huge memory requirement is that the search [will create a 
> TopScoreDocCollector for each 
> segment|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L404],
>  each one with numHits = 5 million. This is fine for the large segments, but 
> many of those segments are fairly small and only contain several thousand 
> documents. This wastes a huge amount of memory for queries with large values 
> of numHits on indices with many segments.
> Therefore, I propose to change the CollectorManager - interface in the 
> following way:
>  * change the method newCollector to accept a parameter LeafSlice that can be 
> used to determine the total count of documents in the LeafSlice
>  * Maybe, in order to remain backwards compatible, it would be possible to 
> introduce this as a new method with a default implementation that calls the 
> old method - otherwise, it probably has to wait for Lucene 8?
>  * This can then be used to cap numHits for each TopScoreDocCollector to the 
> leafslice-size.
> If this is something that would make sense for you, I can try to provide a 
> patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8542) Provide the LeafSlice to CollectorManager.newCollector to save memory on small index slices

2019-03-12 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790607#comment-16790607
 ] 

Adrien Grand commented on LUCENE-8542:
--

I don't think this change really helps as the number of documents in a slice is 
a pretty bad upper bound of the total number of documents that will be 
collected. If we want to better support this use-case, I'd rather like that we 
look into making PriorityQueue growable.

> Provide the LeafSlice to CollectorManager.newCollector to save memory on 
> small index slices
> ---
>
> Key: LUCENE-8542
> URL: https://issues.apache.org/jira/browse/LUCENE-8542
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Christoph Kaser
>Priority: Minor
> Attachments: LUCENE-8542.patch
>
>
> I have an index consisting of 44 million documents spread across 60 segments. 
> When I run a query against this index with a huge number of results requested 
> (e.g. 5 million), this query uses more than 5 GB of heap if the IndexSearch 
> was configured to use an ExecutorService.
> (I know this kind of query is fairly unusual and it would be better to use 
> paging and searchAfter, but our architecture does not allow this at the 
> moment.)
> The reason for the huge memory requirement is that the search [will create a 
> TopScoreDocCollector for each 
> segment|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L404],
>  each one with numHits = 5 million. This is fine for the large segments, but 
> many of those segments are fairly small and only contain several thousand 
> documents. This wastes a huge amount of memory for queries with large values 
> of numHits on indices with many segments.
> Therefore, I propose to change the CollectorManager - interface in the 
> following way:
>  * change the method newCollector to accept a parameter LeafSlice that can be 
> used to determine the total count of documents in the LeafSlice
>  * Maybe, in order to remain backwards compatible, it would be possible to 
> introduce this as a new method with a default implementation that calls the 
> old method - otherwise, it probably has to wait for Lucene 8?
>  * This can then be used to cap numHits for each TopScoreDocCollector to the 
> leafslice-size.
> If this is something that would make sense for you, I can try to provide a 
> patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8542) Provide the LeafSlice to CollectorManager.newCollector to save memory on small index slices

2019-03-12 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790576#comment-16790576
 ] 

Michael McCandless commented on LUCENE-8542:


I think the core API change is quite minor and reasonable – letting the 
{{Collector.newCollector}} know which segments (slice) it will collect?  E.g. 
we already pass the {{LeafReaderContext}} to {{Collector.newLeafCollector}} so 
it's informed about the details of which segment it's about to collect.

 

I agree the motivating use case here is somewhat abusive, and a custom 
Collector is probably needed anyway, but I think this API change could help 
non-abusive cases too.

Alternatively we could explore fixing our default top hits collectors to not 
pre-allocate the full topN for every slice ... that is really unexpected 
behavior, and users have tripped up on this multiple times in the past causing 
us to make some partial fixes for it.

> Provide the LeafSlice to CollectorManager.newCollector to save memory on 
> small index slices
> ---
>
> Key: LUCENE-8542
> URL: https://issues.apache.org/jira/browse/LUCENE-8542
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Christoph Kaser
>Priority: Minor
> Attachments: LUCENE-8542.patch
>
>
> I have an index consisting of 44 million documents spread across 60 segments. 
> When I run a query against this index with a huge number of results requested 
> (e.g. 5 million), this query uses more than 5 GB of heap if the IndexSearch 
> was configured to use an ExecutorService.
> (I know this kind of query is fairly unusual and it would be better to use 
> paging and searchAfter, but our architecture does not allow this at the 
> moment.)
> The reason for the huge memory requirement is that the search [will create a 
> TopScoreDocCollector for each 
> segment|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L404],
>  each one with numHits = 5 million. This is fine for the large segments, but 
> many of those segments are fairly small and only contain several thousand 
> documents. This wastes a huge amount of memory for queries with large values 
> of numHits on indices with many segments.
> Therefore, I propose to change the CollectorManager - interface in the 
> following way:
>  * change the method newCollector to accept a parameter LeafSlice that can be 
> used to determine the total count of documents in the LeafSlice
>  * Maybe, in order to remain backwards compatible, it would be possible to 
> introduce this as a new method with a default implementation that calls the 
> old method - otherwise, it probably has to wait for Lucene 8?
>  * This can then be used to cap numHits for each TopScoreDocCollector to the 
> leafslice-size.
> If this is something that would make sense for you, I can try to provide a 
> patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13040) Harden TestSQLHandler.

2019-03-12 Thread Mikhail Khludnev (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790563#comment-16790563
 ] 

Mikhail Khludnev commented on SOLR-13040:
-

again 
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/7771/
Java: 32bit/jdk1.8.0_172 -client -XX:+UseParallelGC

6 tests failed.
FAILED:  org.apache.solr.handler.TestSQLHandler.doTest

Error Message:
--> http://127.0.0.1:56507/ilynu/collection1_shard1_replica_n1:Failed to 
execute sqlQuery 'select id, field_i, str_s, field_i_p, field_f_p, field_d_p, 
field_l_p from collection1 where (text='()' OR text='') AND text='' 
order by field_i desc' against JDBC connection 'jdbc:calcitesolr:'. Error while 
executing SQL "select id, field_i, str_s, field_i_p, field_f_p, field_d_p, 
field_l_p from collection1 where (text='()' OR text='') AND text='' 
order by field_i desc": java.io.IOException: 
java.util.concurrent.ExecutionException: java.io.IOException: --> 
http://127.0.0.1:56489/ilynu/collection1_shard2_replica_n2/:id{type=string,properties=indexed,stored,sortMissingLast,uninvertible}
 must have DocValues to use this feature.

Stack Trace:
java.io.IOException: --> 
http://127.0.0.1:56507/ilynu/collection1_shard1_replica_n1:Failed to execute 
sqlQuery 'select id, field_i, str_s, field_i_p, field_f_p, field_d_p, field_l_p 
from collection1 where (text='()' OR text='') AND text='' order by 
field_i desc' against JDBC connection 'jdbc:calcitesolr:'.
Error while executing SQL "select id, field_i, str_s, field_i_p, field_f_p, 
field_d_p, field_l_p from collection1 where (text='()' OR text='') AND 
text='' order by field_i desc": java.io.IOException: 
java.util.concurrent.ExecutionException: java.io.IOException: --> 
http://127.0.0.1:56489/ilynu/collection1_shard2_replica_n2/:id{type=string,properties=indexed,stored,sortMissingLast,uninvertible}
 must have DocValues to use this feature.
at 
__randomizedtesting.SeedInfo.seed([BD7F25E0646FE183:1A3B9D4409D4F23A]:0)
at 
org.apache.solr.client.solrj.io.stream.SolrStream.read(SolrStream.java:215)
at 
org.apache.solr.handler.TestSQLHandler.getTuples(TestSQLHandler.java:2517)
at 
org.apache.solr.handler.TestSQLHandler.testBasicSelect(TestSQLHandler.java:148)

> Harden TestSQLHandler.
> --
>
> Key: SOLR-13040
> URL: https://issues.apache.org/jira/browse/SOLR-13040
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Joel Bernstein
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13040) Harden TestSQLHandler.

2019-03-12 Thread Mikhail Khludnev (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790562#comment-16790562
 ] 

Mikhail Khludnev commented on SOLR-13040:
-

again 
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/7771/
Java: 32bit/jdk1.8.0_172 -client -XX:+UseParallelGC

6 tests failed.
FAILED:  org.apache.solr.handler.TestSQLHandler.doTest

Error Message:
--> http://127.0.0.1:56507/ilynu/collection1_shard1_replica_n1:Failed to 
execute sqlQuery 'select id, field_i, str_s, field_i_p, field_f_p, field_d_p, 
field_l_p from collection1 where (text='()' OR text='') AND text='' 
order by field_i desc' against JDBC connection 'jdbc:calcitesolr:'. Error while 
executing SQL "select id, field_i, str_s, field_i_p, field_f_p, field_d_p, 
field_l_p from collection1 where (text='()' OR text='') AND text='' 
order by field_i desc": java.io.IOException: 
java.util.concurrent.ExecutionException: java.io.IOException: --> 
http://127.0.0.1:56489/ilynu/collection1_shard2_replica_n2/:id{type=string,properties=indexed,stored,sortMissingLast,uninvertible}
 must have DocValues to use this feature.

Stack Trace:
java.io.IOException: --> 
http://127.0.0.1:56507/ilynu/collection1_shard1_replica_n1:Failed to execute 
sqlQuery 'select id, field_i, str_s, field_i_p, field_f_p, field_d_p, field_l_p 
from collection1 where (text='()' OR text='') AND text='' order by 
field_i desc' against JDBC connection 'jdbc:calcitesolr:'.
Error while executing SQL "select id, field_i, str_s, field_i_p, field_f_p, 
field_d_p, field_l_p from collection1 where (text='()' OR text='') AND 
text='' order by field_i desc": java.io.IOException: 
java.util.concurrent.ExecutionException: java.io.IOException: --> 
http://127.0.0.1:56489/ilynu/collection1_shard2_replica_n2/:id{type=string,properties=indexed,stored,sortMissingLast,uninvertible}
 must have DocValues to use this feature.
at 
__randomizedtesting.SeedInfo.seed([BD7F25E0646FE183:1A3B9D4409D4F23A]:0)
at 
org.apache.solr.client.solrj.io.stream.SolrStream.read(SolrStream.java:215)
at 
org.apache.solr.handler.TestSQLHandler.getTuples(TestSQLHandler.java:2517)
at 
org.apache.solr.handler.TestSQLHandler.testBasicSelect(TestSQLHandler.java:148)

> Harden TestSQLHandler.
> --
>
> Key: SOLR-13040
> URL: https://issues.apache.org/jira/browse/SOLR-13040
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Joel Bernstein
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8542) Provide the LeafSlice to CollectorManager.newCollector to save memory on small index slices

2019-03-12 Thread Christoph Kaser (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790560#comment-16790560
 ] 

Christoph Kaser commented on LUCENE-8542:
-

That's too bad, given that this is only a minor change to an experimental API 
(and does not cause extra work in the reasonable use case). But I understand 
your reasons.

I may try to build such a collector when I find the time (though I suspect this 
may involve quite a lot of code duplication if no changes to the core should be 
made) - for now we simply limit the number of concurrent queries with huge 
values of numHits so they fit into the heap.

> Provide the LeafSlice to CollectorManager.newCollector to save memory on 
> small index slices
> ---
>
> Key: LUCENE-8542
> URL: https://issues.apache.org/jira/browse/LUCENE-8542
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Christoph Kaser
>Priority: Minor
> Attachments: LUCENE-8542.patch
>
>
> I have an index consisting of 44 million documents spread across 60 segments. 
> When I run a query against this index with a huge number of results requested 
> (e.g. 5 million), this query uses more than 5 GB of heap if the IndexSearch 
> was configured to use an ExecutorService.
> (I know this kind of query is fairly unusual and it would be better to use 
> paging and searchAfter, but our architecture does not allow this at the 
> moment.)
> The reason for the huge memory requirement is that the search [will create a 
> TopScoreDocCollector for each 
> segment|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L404],
>  each one with numHits = 5 million. This is fine for the large segments, but 
> many of those segments are fairly small and only contain several thousand 
> documents. This wastes a huge amount of memory for queries with large values 
> of numHits on indices with many segments.
> Therefore, I propose to change the CollectorManager - interface in the 
> following way:
>  * change the method newCollector to accept a parameter LeafSlice that can be 
> used to determine the total count of documents in the LeafSlice
>  * Maybe, in order to remain backwards compatible, it would be possible to 
> introduce this as a new method with a default implementation that calls the 
> old method - otherwise, it probably has to wait for Lucene 8?
>  * This can then be used to cap numHits for each TopScoreDocCollector to the 
> leafslice-size.
> If this is something that would make sense for you, I can try to provide a 
> patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-Windows (32bit/jdk1.8.0_172) - Build # 7771 - Unstable!

2019-03-12 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/7771/
Java: 32bit/jdk1.8.0_172 -client -XX:+UseParallelGC

6 tests failed.
FAILED:  org.apache.solr.handler.TestSQLHandler.doTest

Error Message:
--> http://127.0.0.1:56507/ilynu/collection1_shard1_replica_n1:Failed to 
execute sqlQuery 'select id, field_i, str_s, field_i_p, field_f_p, field_d_p, 
field_l_p from collection1 where (text='()' OR text='') AND text='' 
order by field_i desc' against JDBC connection 'jdbc:calcitesolr:'. Error while 
executing SQL "select id, field_i, str_s, field_i_p, field_f_p, field_d_p, 
field_l_p from collection1 where (text='()' OR text='') AND text='' 
order by field_i desc": java.io.IOException: 
java.util.concurrent.ExecutionException: java.io.IOException: --> 
http://127.0.0.1:56489/ilynu/collection1_shard2_replica_n2/:id{type=string,properties=indexed,stored,sortMissingLast,uninvertible}
 must have DocValues to use this feature.

Stack Trace:
java.io.IOException: --> 
http://127.0.0.1:56507/ilynu/collection1_shard1_replica_n1:Failed to execute 
sqlQuery 'select id, field_i, str_s, field_i_p, field_f_p, field_d_p, field_l_p 
from collection1 where (text='()' OR text='') AND text='' order by 
field_i desc' against JDBC connection 'jdbc:calcitesolr:'.
Error while executing SQL "select id, field_i, str_s, field_i_p, field_f_p, 
field_d_p, field_l_p from collection1 where (text='()' OR text='') AND 
text='' order by field_i desc": java.io.IOException: 
java.util.concurrent.ExecutionException: java.io.IOException: --> 
http://127.0.0.1:56489/ilynu/collection1_shard2_replica_n2/:id{type=string,properties=indexed,stored,sortMissingLast,uninvertible}
 must have DocValues to use this feature.
at 
__randomizedtesting.SeedInfo.seed([BD7F25E0646FE183:1A3B9D4409D4F23A]:0)
at 
org.apache.solr.client.solrj.io.stream.SolrStream.read(SolrStream.java:215)
at 
org.apache.solr.handler.TestSQLHandler.getTuples(TestSQLHandler.java:2517)
at 
org.apache.solr.handler.TestSQLHandler.testBasicSelect(TestSQLHandler.java:148)
at org.apache.solr.handler.TestSQLHandler.doTest(TestSQLHandler.java:99)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:1082)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:1054)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 

[jira] [Commented] (LUCENE-8542) Provide the LeafSlice to CollectorManager.newCollector to save memory on small index slices

2019-03-12 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790497#comment-16790497
 ] 

Adrien Grand commented on LUCENE-8542:
--

Indeed, I think this goes against the use case that Lucene is designed to solve, 
which is to compute the top-k matches of a query for some reasonable value of 
k. I'd be OK with e.g. having a collector in the sandbox that works better when 
collecting thousands of hits, but I'd like to avoid making changes to the core 
only to support this use case.

> Provide the LeafSlice to CollectorManager.newCollector to save memory on 
> small index slices
> ---
>
> Key: LUCENE-8542
> URL: https://issues.apache.org/jira/browse/LUCENE-8542
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Christoph Kaser
>Priority: Minor
> Attachments: LUCENE-8542.patch
>
>
> I have an index consisting of 44 million documents spread across 60 segments. 
> When I run a query against this index with a huge number of results requested 
> (e.g. 5 million), this query uses more than 5 GB of heap if the IndexSearcher 
> was configured to use an ExecutorService.
> (I know this kind of query is fairly unusual and it would be better to use 
> paging and searchAfter, but our architecture does not allow this at the 
> moment.)
> The reason for the huge memory requirement is that the search [will create a 
> TopScoreDocCollector for each 
> segment|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L404],
>  each one with numHits = 5 million. This is fine for the large segments, but 
> many of those segments are fairly small and only contain several thousand 
> documents. This wastes a huge amount of memory for queries with large values 
> of numHits on indices with many segments.
> Therefore, I propose to change the CollectorManager interface in the 
> following way:
>  * change the method newCollector to accept a parameter LeafSlice that can be 
> used to determine the total count of documents in the LeafSlice
>  * Maybe, in order to remain backwards compatible, it would be possible to 
> introduce this as a new method with a default implementation that calls the 
> old method - otherwise, it probably has to wait for Lucene 8?
>  * This can then be used to cap numHits for each TopScoreDocCollector to the 
> leafslice-size.
> If this is something that would make sense for you, I can try to provide a 
> patch.
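
For illustration, here is a minimal sketch of the proposal (hedged: the 
newCollector(LeafSlice) overload is the hypothetical API described above and does 
not exist in Lucene today; the rest uses the standard CollectorManager and 
TopScoreDocCollector APIs):

{code:java}
// Sketch only: caps the per-slice numHits at the number of documents in the slice.
import java.io.IOException;
import java.util.Collection;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.CollectorManager;
import org.apache.lucene.search.IndexSearcher.LeafSlice;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.TopScoreDocCollector;

public class SliceAwareTopDocsManager implements CollectorManager<TopScoreDocCollector, TopDocs> {

  private final int numHits;

  public SliceAwareTopDocsManager(int numHits) {
    this.numHits = numHits;
  }

  @Override
  public TopScoreDocCollector newCollector() throws IOException {
    // Default path, as today: every collector is sized to the full numHits.
    return TopScoreDocCollector.create(numHits);
  }

  // Hypothetical overload proposed in this issue: the slice size bounds how many
  // hits the per-slice collector can ever produce, so its queue can be smaller.
  public TopScoreDocCollector newCollector(LeafSlice slice) throws IOException {
    int sliceDocs = 0;
    for (LeafReaderContext leaf : slice.leaves) {
      sliceDocs += leaf.reader().maxDoc();
    }
    return TopScoreDocCollector.create(Math.max(1, Math.min(numHits, sliceDocs)));
  }

  @Override
  public TopDocs reduce(Collection<TopScoreDocCollector> collectors) throws IOException {
    TopDocs[] perSlice = new TopDocs[collectors.size()];
    int i = 0;
    for (TopScoreDocCollector collector : collectors) {
      perSlice[i++] = collector.topDocs();
    }
    return TopDocs.merge(numHits, perSlice);
  }
}
{code}

With such a manager, a 5-million-hit request over a slice that only holds a few 
thousand documents would allocate a priority queue sized to the slice rather than 
to the full numHits.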



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8542) Provide the LeafSlice to CollectorManager.newCollector to save memory on small index slices

2019-03-12 Thread Christoph Kaser (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790485#comment-16790485
 ] 

Christoph Kaser commented on LUCENE-8542:
-

Is there anything I can change or add to get this committed? Or do you think it 
makes no sense for the general use case of Lucene?

> Provide the LeafSlice to CollectorManager.newCollector to save memory on 
> small index slices
> ---
>
> Key: LUCENE-8542
> URL: https://issues.apache.org/jira/browse/LUCENE-8542
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Christoph Kaser
>Priority: Minor
> Attachments: LUCENE-8542.patch
>
>
> I have an index consisting of 44 million documents spread across 60 segments. 
> When I run a query against this index with a huge number of results requested 
> (e.g. 5 million), this query uses more than 5 GB of heap if the IndexSearcher 
> was configured to use an ExecutorService.
> (I know this kind of query is fairly unusual and it would be better to use 
> paging and searchAfter, but our architecture does not allow this at the 
> moment.)
> The reason for the huge memory requirement is that the search [will create a 
> TopScoreDocCollector for each 
> segment|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L404],
>  each one with numHits = 5 million. This is fine for the large segments, but 
> many of those segments are fairly small and only contain several thousand 
> documents. This wastes a huge amount of memory for queries with large values 
> of numHits on indices with many segments.
> Therefore, I propose to change the CollectorManager interface in the 
> following way:
>  * change the method newCollector to accept a parameter LeafSlice that can be 
> used to determine the total count of documents in the LeafSlice
>  * Maybe, in order to remain backwards compatible, it would be possible to 
> introduce this as a new method with a default implementation that calls the 
> old method - otherwise, it probably has to wait for Lucene 8?
>  * This can then be used to cap numHits for each TopScoreDocCollector to the 
> leafslice-size.
> If this is something that would make sense for you, I can try to provide a 
> patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-NightlyTests-8.x - Build # 42 - Unstable

2019-03-12 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-8.x/42/

1 tests failed.
FAILED:  org.apache.lucene.search.TestSynonymQuery.testBoosts

Error Message:
expected:<1.1163563> but was:<1.0560191>

Stack Trace:
java.lang.AssertionError: expected:<1.1163563> but was:<1.0560191>
at 
__randomizedtesting.SeedInfo.seed([7398C84F47B19A9B:DA6116D9359D1A4D]:0)
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:575)
at org.junit.Assert.assertEquals(Assert.java:700)
at 
org.apache.lucene.search.TestSynonymQuery.doTestBoosts(TestSynonymQuery.java:205)
at 
org.apache.lucene.search.TestSynonymQuery.testBoosts(TestSynonymQuery.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)




Build Log:
[...truncated 1027 lines...]
   [junit4] Suite: org.apache.lucene.search.TestSynonymQuery
   [junit4]   2> NOTE: download the large Jenkins line-docs file by running 
'ant get-jenkins-line-docs' in the lucene directory.
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestSynonymQuery 
-Dtests.method=testBoosts -Dtests.seed=7398C84F47B19A9B -Dtests.multiplier=2 
-Dtests.nightly=true -Dtests.slow=true 

[jira] [Assigned] (SOLR-13315) Possible SolrIndexSearcher leak through LogWatcher and FunctionScoreQuery

2019-03-12 Thread Alan Woodward (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward reassigned SOLR-13315:


Assignee: Alan Woodward

> Possible SolrIndexSearcher leak through LogWatcher and FunctionScoreQuery
> -
>
> Key: SOLR-13315
> URL: https://issues.apache.org/jira/browse/SOLR-13315
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.5
>Reporter: Yury Pakhomov
>Assignee: Alan Woodward
>Priority: Major
> Attachments: SOLR-13315.patch, path_to_gc_root_from_heap_dump.png
>
>
> Here is a possible leak of SolrIndexSearcher, which prevents unused searchers 
> from being reclaimed by GC.
> This problem was found after analyzing a heap dump which was created before a 
> Full GC.
> 1) Where the unused ref to SolrIndexSearcher is stored.
> Log4j2Watcher implements LogWatcher
>  and has a CircularList history inherited from LogWatcher.
> The history can store a Log4jLogEvent which can hold a ref to 
> ParameterizedMessage,
>  and ParameterizedMessage stores refs to all arguments of the log event. (Here we 
> can store objects which are no longer in use, directly or indirectly.)
> 2) How SolrIndexSearcher can be indirectly reached through this log buffer.
> If during FunctionScoreQuery execution an ExitingReaderException("The request 
> took too long to iterate over terms. Timeout: " ..) is thrown, the query 
> will be logged at warn level and its ref will be stored in Log4j2Watcher.
>  (It can be any exception which causes this query to be logged to Log4j2Watcher.)
> In general this should be ok, but in this case FunctionScoreQuery indirectly 
> stores a ref to SolrIndexSearcher.
> As a result we have refs to already closed searchers which are no longer in 
> use.
>  The searcher has refs to caches (docs, filters, results ...) and they cannot be 
> reclaimed by GC.
> 3) How SolrIndexSearcher can be accessed through FunctionScoreQuery.
> There is a FunctionScoreQuery which can hold a ref to 
> MultiplicativeBoostValuesSource,
>  which can hold a ref to WrappedDoubleValuesSource,
>  and the last one can hold a ref to SolrIndexSearcher.
> public final class FunctionScoreQuery extends Query
> { ... private final DoubleValuesSource source; ... }
> private static class MultiplicativeBoostValuesSource extends 
> DoubleValuesSource
> { private final DoubleValuesSource boost; ... }
> private static class WrappedDoubleValuesSource extends DoubleValuesSource
> { private final ValueSource in; private IndexSearcher searcher; ... }
> Actually, any DoubleValuesSource implementation which stores a ref to the 
> IndexSearcher passed to
> public abstract DoubleValuesSource rewrite(IndexSearcher reader) throws 
> IOException;
> can cause a problem if it is logged via Log4j2Watcher.
> 4) How to temporarily work around this problem:
>  It is possible to disable Log4j2Watcher in solr.xml.
> 5) How to fix this issue in a more reliable way?
>  I think that it is very dangerous to buffer refs to log message arguments,
>  and maybe Log4j2Watcher should be reworked to avoid buffering refs, but 
> LoggingHandler depends on Log4j2Watcher.
> But maybe there are better ways to solve this issue.
> Path to gc root is attached.
>  !path_to_gc_root_from_heap_dump.png|width=811,height=387!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8722) Simplify relate logic on EdgeTree

2019-03-12 Thread Ignacio Vera (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera updated LUCENE-8722:
-
Summary: Simplify relate logic on EdgeTree  (was: Simplify relate login on 
EdgeTree)

> Simplify relate logic on EdgeTree
> -
>
> Key: LUCENE-8722
> URL: https://issues.apache.org/jira/browse/LUCENE-8722
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ignacio Vera
>Priority: Trivial
>
> Currently Edge tree contains three methods for {{relate}}: relate, 
> internalComponentRelate and componentRelate.
> {{internalComponentRelate}} does not bring any benefit and it is trivial to 
> merge the logic it contains into the {{relate}} method.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13315) Possible SolrIndexSearcher leak through LogWatcher and FunctionScoreQuery

2019-03-12 Thread Alan Woodward (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated SOLR-13315:
-
Attachment: SOLR-13315.patch

> Possible SolrIndexSearcher leak through LogWatcher and FunctionScoreQuery
> -
>
> Key: SOLR-13315
> URL: https://issues.apache.org/jira/browse/SOLR-13315
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.5
>Reporter: Yury Pakhomov
>Priority: Major
> Attachments: SOLR-13315.patch, path_to_gc_root_from_heap_dump.png
>
>
> Here is a possible leak of SolrIndexSearcher, which prevents unused searchers 
> from being reclaimed by GC.
> This problem was found after analyzing a heap dump which was created before a 
> Full GC.
> 1) Where the unused ref to SolrIndexSearcher is stored.
> Log4j2Watcher implements LogWatcher
>  and has a CircularList history inherited from LogWatcher.
> The history can store a Log4jLogEvent which can hold a ref to 
> ParameterizedMessage,
>  and ParameterizedMessage stores refs to all arguments of the log event. (Here we 
> can store objects which are no longer in use, directly or indirectly.)
> 2) How SolrIndexSearcher can be indirectly reached through this log buffer.
> If during FunctionScoreQuery execution an ExitingReaderException("The request 
> took too long to iterate over terms. Timeout: " ..) is thrown, the query 
> will be logged at warn level and its ref will be stored in Log4j2Watcher.
>  (It can be any exception which causes this query to be logged to Log4j2Watcher.)
> In general this should be ok, but in this case FunctionScoreQuery indirectly 
> stores a ref to SolrIndexSearcher.
> As a result we have refs to already closed searchers which are no longer in 
> use.
>  The searcher has refs to caches (docs, filters, results ...) and they cannot be 
> reclaimed by GC.
> 3) How SolrIndexSearcher can be accessed through FunctionScoreQuery.
> There is a FunctionScoreQuery which can hold a ref to 
> MultiplicativeBoostValuesSource,
>  which can hold a ref to WrappedDoubleValuesSource,
>  and the last one can hold a ref to SolrIndexSearcher.
> public final class FunctionScoreQuery extends Query
> { ... private final DoubleValuesSource source; ... }
> private static class MultiplicativeBoostValuesSource extends 
> DoubleValuesSource
> { private final DoubleValuesSource boost; ... }
> private static class WrappedDoubleValuesSource extends DoubleValuesSource
> { private final ValueSource in; private IndexSearcher searcher; ... }
> Actually, any DoubleValuesSource implementation which stores a ref to the 
> IndexSearcher passed to
> public abstract DoubleValuesSource rewrite(IndexSearcher reader) throws 
> IOException;
> can cause a problem if it is logged via Log4j2Watcher.
> 4) How to temporarily work around this problem:
>  It is possible to disable Log4j2Watcher in solr.xml.
> 5) How to fix this issue in a more reliable way?
>  I think that it is very dangerous to buffer refs to log message arguments,
>  and maybe Log4j2Watcher should be reworked to avoid buffering refs, but 
> LoggingHandler depends on Log4j2Watcher.
> But maybe there are better ways to solve this issue.
> Path to gc root is attached.
>  !path_to_gc_root_from_heap_dump.png|width=811,height=387!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-8722) Simplify relate login on EdgeTree

2019-03-12 Thread Ignacio Vera (JIRA)
Ignacio Vera created LUCENE-8722:


 Summary: Simplify relate login on EdgeTree
 Key: LUCENE-8722
 URL: https://issues.apache.org/jira/browse/LUCENE-8722
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Ignacio Vera


Currently Edge tree contains three methods for {{relate}}: relate, 
internalComponentRelate and componentRelate.

{{internalComponentRelate}} does not bring any benefit and it is trivial to 
merge the logic it contains into the {{relate}} method.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-8631) How Nori Tokenizer can deal with Longest-Matching

2019-03-12 Thread Jim Ferenczi (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Ferenczi resolved LUCENE-8631.
--
   Resolution: Fixed
Fix Version/s: 8.1
   master (9.0)

Thanks [~gritmind]!

> How Nori Tokenizer can deal with Longest-Matching
> -
>
> Key: LUCENE-8631
> URL: https://issues.apache.org/jira/browse/LUCENE-8631
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Yeongsu Kim
>Priority: Major
> Fix For: master (9.0), 8.1
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I think... Nori tokenizer has one issue. 
> I don’t understand why “Longest-Matching” is NOT working with the Nori tokenizer 
> via config mode (config mode: 
> [https://www.elastic.co/guide/en/elasticsearch/plugins/6.x/analysis-nori-tokenizer.html).]
>  
> Here is an example for explaining what is longest-matching.
> Let's assume we have `userdict_ko.txt` including only three Korean single words 
> such as ‘골드’, ‘브라운’, ‘골드브라운’, and save it to the Nori analyzer. After updating, we 
> can see that it outputs two tokens such as ‘골드’ and ‘브라운’, when the input is 
> ‘골드브라운’. (In English: ‘골드’ means ‘gold’, ‘브라운’ means ‘brown’, and ‘골드브라운’ 
> means ‘goldbrown’)
>  
> With this result, we recognize that “Longest-Matching” is NOT working. If 
> “Longest-Matching” is working, the output must be ‘골드브라운’, which is the 
> longest matching word in the user dictionary.
>  
> Curiously enough, when we add user dictionary via custom mode (custom mode: 
> [https://github.com/jimczi/nori/blob/master/how-to-custom-dict.asciidoc|https://apac01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjimczi%2Fnori%2Fblob%2Fmaster%2Fhow-to-custom-dict.asciidoc=02%7C01%7Chigh_yeongsu%40wemakeprice.com%7C6953d739414e4da5ad1408d67473a6fe%7C6322d5f522044e9d9ca6d18828a04daf%7C0%7C0%7C636824437418170758=5iuNvKr8WJCXlCkJQrf5r3BgDVnF5hpG7l%2BQL0Ok7Aw%3D=0]),
>  we found the result is ‘골드브라운’, where ‘Longest-Matching’ is applied. We 
> think the reason is that the trained MeCab engine automatically generates word 
> costs by its own criteria. We hope this mechanism is also applied to config 
> mode.
>  
> Would you tell me how to get “Longest-Matching” via config mode (not custom), 
> or give me some hints (e.g. where to modify the source code) to solve this 
> problem?
>  
> P.S
> Recently, I've mailed to [~jim.ferenczi], who is a developer of Nori, and 
> received his suggestions:
>    - Add a way to set a score to each new rule (this way you could set up a 
> negative cost for the compound word that is less than the sum of the two 
> single words).
>    - Same as above but the cost is computed from the statistics of the 
> training (like the custom dictionary does when you recompile entirely).
>    - Implement longest-match first in the dictionary.
>  
> Thanks for your support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi closed pull request #576: LUCENE-8631: Longest-Matching for User words in Nori Tokenizer

2019-03-12 Thread GitBox
jimczi closed pull request #576: LUCENE-8631: Longest-Matching for User words 
in Nori Tokenizer
URL: https://github.com/apache/lucene-solr/pull/576
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on issue #576: LUCENE-8631: Longest-Matching for User words in Nori Tokenizer

2019-03-12 Thread GitBox
jimczi commented on issue #576: LUCENE-8631: Longest-Matching for User words in 
Nori Tokenizer
URL: https://github.com/apache/lucene-solr/pull/576#issuecomment-471928338
 
 
   I merged this change in master and 8x: 
https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=commit;h=b1f870a4164769df62b24af63048aa2f9b21af47.
 Thanks @gritmind!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8631) How Nori Tokenizer can deal with Longest-Matching

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790391#comment-16790391
 ] 

ASF subversion and git services commented on LUCENE-8631:
-

Commit 8d0652451ea4ed9d0285fb5f8c7568c058c6730b in lucene-solr's branch 
refs/heads/branch_8x from Yeongsu Kim
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8d06524 ]

LUCENE-8631: The Korean user dictionary now picks the longest-matching word and 
discards the other matches.


> How Nori Tokenizer can deal with Longest-Matching
> -
>
> Key: LUCENE-8631
> URL: https://issues.apache.org/jira/browse/LUCENE-8631
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Yeongsu Kim
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I think... Nori tokenizer has one issue. 
> I don’t understand why “Longest-Matching” is NOT working with the Nori tokenizer 
> via config mode (config mode: 
> [https://www.elastic.co/guide/en/elasticsearch/plugins/6.x/analysis-nori-tokenizer.html).]
>  
> Here is an example for explaining what is longest-matching.
> Let's assume we have `userdict_ko.txt` including only three Korean single words 
> such as ‘골드’, ‘브라운’, ‘골드브라운’, and save it to the Nori analyzer. After updating, we 
> can see that it outputs two tokens such as ‘골드’ and ‘브라운’, when the input is 
> ‘골드브라운’. (In English: ‘골드’ means ‘gold’, ‘브라운’ means ‘brown’, and ‘골드브라운’ 
> means ‘goldbrown’)
>  
> With this result, we recognize that “Longest-Matching” is NOT working. If 
> “Longest-Matching” is working, the output must be ‘골드브라운’, which is the 
> longest matching word in the user dictionary.
>  
> Curiously enough, when we add user dictionary via custom mode (custom mode: 
> [https://github.com/jimczi/nori/blob/master/how-to-custom-dict.asciidoc|https://apac01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjimczi%2Fnori%2Fblob%2Fmaster%2Fhow-to-custom-dict.asciidoc=02%7C01%7Chigh_yeongsu%40wemakeprice.com%7C6953d739414e4da5ad1408d67473a6fe%7C6322d5f522044e9d9ca6d18828a04daf%7C0%7C0%7C636824437418170758=5iuNvKr8WJCXlCkJQrf5r3BgDVnF5hpG7l%2BQL0Ok7Aw%3D=0]),
>  we found the result is ‘골드브라운’, where ‘Longest-Matching’ is applied. We 
> think the reason is that the trained MeCab engine automatically generates word 
> costs by its own criteria. We hope this mechanism is also applied to config 
> mode.
>  
> Would you tell me how to get “Longest-Matching” via config mode (not custom), 
> or give me some hints (e.g. where to modify the source code) to solve this 
> problem?
>  
> P.S
> Recently, I've mailed to [~jim.ferenczi], who is a developer of Nori, and 
> received his suggestions:
>    - Add a way to set a score to each new rule (this way you could set up a 
> negative cost for the compound word that is less than the sum of the two 
> single words).
>    - Same as above but the cost is computed from the statistics of the 
> training (like the custom dictionary does when you recompile entirely).
>    - Implement longest-match first in the dictionary.
>  
> Thanks for your support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



ISSUE:solrj “org.apache.solr.common.util.SimpleOrderedMap cannot be cast to java.util.Map” exception when using “/suggest” handler

2019-03-12 Thread praveenraj 4ever
Hi Team,
Can you please look into this issue as raised on Stack Overflow?

https://stackoverflow.com/questions/55115760/solrj-org-apache-solr-common-util-simpleorderedmap-cannot-be-cast-to-java-util


Regards,
Praveenraj D,
9566067066


[jira] [Commented] (LUCENE-8721) LatLonShapePolygon and LineQuery fail on shared dateline queries

2019-03-12 Thread Ignacio Vera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790389#comment-16790389
 ] 

Ignacio Vera commented on LUCENE-8721:
--

In addition, it won't work in all cases for triangles, as the coordinates are 
in the encoded space: 180 becomes 179.9991618097, so one side of the 
equality will never be true.
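
For reference, the quantization can be observed directly (a tiny standalone check, 
assuming org.apache.lucene.geo.GeoEncodingUtils; the exact decoded value depends on 
the encoding factor):

{code:java}
import org.apache.lucene.geo.GeoEncodingUtils;

public class DatelineEncodingCheck {
  public static void main(String[] args) {
    // +180 cannot be represented exactly in the encoded integer space, so the
    // round trip yields a value just below 180.0, and an equality test against
    // 180d on the encoded coordinates never matches.
    int encoded = GeoEncodingUtils.encodeLongitude(180d);
    double decoded = GeoEncodingUtils.decodeLongitude(encoded);
    System.out.println("decode(encode(180d)) = " + decoded); // slightly less than 180.0
  }
}
{code}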

+1 to simplify the logic on EdgeTree. It is trivial to merge the methods 
{{relate}} &  {{internalComponentRelate}} and it makes sense. I will open an 
issue.

> LatLonShapePolygon and LineQuery fail on shared dateline queries
> 
>
> Key: LUCENE-8721
> URL: https://issues.apache.org/jira/browse/LUCENE-8721
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Nicholas Knize
>Priority: Major
> Attachments: LUCENE-8721.patch
>
>
> Indexed shapes should be returned with search geometries that share the 
> dateline on the opposite hemisphere. 
> For example:
> {code:java}
>   public void testSharedDateline() throws Exception {
>     /// index ///
>     Directory dir = newDirectory();
>     RandomIndexWriter w = new RandomIndexWriter(random(), dir);
>     Document doc = new Document();
>     // index western hemisphere geometry
>     Polygon indexPoly = new Polygon(
>         new double[] {-7.5d, 15d, 15d, 0d, -7.5d},
>         new double[] {-180d, -180d, -176d, -176d, -180d}
>     );
>     Field[] fields = LatLonShape.createIndexableFields("test", indexPoly);
>     for (Field f : fields) {
>       doc.add(f);
>     }
>     w.addDocument(doc);
>     w.forceMerge(1);
>     /// search ///
>     IndexReader reader = w.getReader();
>     w.close();
>     IndexSearcher searcher = newSearcher(reader);
>     // search w/ eastern hemisphere geometry that shares the dateline
>     Polygon searchPoly = new Polygon(
>         new double[] {-7.5d, 15d, 15d, 0d, -7.5d},
>         new double[] {180d, 180d, 170d, 170d, 180d});
>     Query q = LatLonShape.newPolygonQuery("test", QueryRelation.INTERSECTS,
>         searchPoly);
>     assertEquals(1, searcher.count(q));
>     IOUtils.close(w, reader, dir);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13315) Possible SolrIndexSearcher leak through LogWatcher and FunctionScoreQuery

2019-03-12 Thread Alan Woodward (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790388#comment-16790388
 ] 

Alan Woodward commented on SOLR-13315:
--

Here's a patch containing a fix and test.  Thanks for raising the issue 
[~ypakhomov], would you be able to try this patch out and check that it solves 
your problem?

> Possible SolrIndexSearcher leak through LogWatcher and FunctionScoreQuery
> -
>
> Key: SOLR-13315
> URL: https://issues.apache.org/jira/browse/SOLR-13315
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.5
>Reporter: Yury Pakhomov
>Priority: Major
> Attachments: SOLR-13315.patch, path_to_gc_root_from_heap_dump.png
>
>
> Here is a possible leak of SolrIndexSearcher, which prevents unused searchers 
> from being reclaimed by GC.
> This problem was found after analyzing a heap dump which was created before a 
> Full GC.
> 1) Where the unused ref to SolrIndexSearcher is stored.
> Log4j2Watcher implements LogWatcher
>  and has a CircularList history inherited from LogWatcher.
> The history can store a Log4jLogEvent which can hold a ref to 
> ParameterizedMessage,
>  and ParameterizedMessage stores refs to all arguments of the log event. (Here we 
> can store objects which are no longer in use, directly or indirectly.)
> 2) How SolrIndexSearcher can be indirectly reached through this log buffer.
> If during FunctionScoreQuery execution an ExitingReaderException("The request 
> took too long to iterate over terms. Timeout: " ..) is thrown, the query 
> will be logged at warn level and its ref will be stored in Log4j2Watcher.
>  (It can be any exception which causes this query to be logged to Log4j2Watcher.)
> In general this should be ok, but in this case FunctionScoreQuery indirectly 
> stores a ref to SolrIndexSearcher.
> As a result we have refs to already closed searchers which are no longer in 
> use.
>  The searcher has refs to caches (docs, filters, results ...) and they cannot be 
> reclaimed by GC.
> 3) How SolrIndexSearcher can be accessed through FunctionScoreQuery.
> There is a FunctionScoreQuery which can hold a ref to 
> MultiplicativeBoostValuesSource,
>  which can hold a ref to WrappedDoubleValuesSource,
>  and the last one can hold a ref to SolrIndexSearcher.
> public final class FunctionScoreQuery extends Query
> { ... private final DoubleValuesSource source; ... }
> private static class MultiplicativeBoostValuesSource extends 
> DoubleValuesSource
> { private final DoubleValuesSource boost; ... }
> private static class WrappedDoubleValuesSource extends DoubleValuesSource
> { private final ValueSource in; private IndexSearcher searcher; ... }
> Actually, any DoubleValuesSource implementation which stores a ref to the 
> IndexSearcher passed to
> public abstract DoubleValuesSource rewrite(IndexSearcher reader) throws 
> IOException;
> can cause a problem if it is logged via Log4j2Watcher.
> 4) How to temporarily work around this problem:
>  It is possible to disable Log4j2Watcher in solr.xml.
> 5) How to fix this issue in a more reliable way?
>  I think that it is very dangerous to buffer refs to log message arguments,
>  and maybe Log4j2Watcher should be reworked to avoid buffering refs, but 
> LoggingHandler depends on Log4j2Watcher.
> But maybe there are better ways to solve this issue.
> Path to gc root is attached.
>  !path_to_gc_root_from_heap_dump.png|width=811,height=387!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8652) Add boosting support in the SynonymQuery

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790371#comment-16790371
 ] 

ASF subversion and git services commented on LUCENE-8652:
-

Commit 4d351cf06605d512040dbc0c48f113f2bdc29e67 in lucene-solr's branch 
refs/heads/branch_8x from Jim Ferenczi
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4d351cf ]

LUCENE-8652: remove unused import


> Add boosting support in the SynonymQuery
> 
>
> Key: LUCENE-8652
> URL: https://issues.apache.org/jira/browse/LUCENE-8652
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Jim Ferenczi
>Priority: Minor
> Fix For: master (9.0), 8.1
>
> Attachments: LUCENE-8652.patch, LUCENE-8652.patch
>
>
> The SynonymQuery tries to score multiple terms as if you had indexed them as 
> one term.
> This is good for "true" synonyms where each term should have the same 
> contribution to the final score but this doesn't handle the case where terms 
> have different weights. For scoring purposes it would be nice to be able to 
> assign a boost per term that we could multiply with the term's document 
> frequency in order to take into account the importance of the term within the 
> synonym list.
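
As a usage sketch (hedged: this assumes the SynonymQuery.Builder API that this 
issue introduces, with an addTerm(Term, float) overload; check the committed code 
for the exact signatures before relying on them):

{code:java}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.SynonymQuery;

public class BoostedSynonymExample {
  public static Query tvQuery() {
    // "television" keeps full weight, while "tv" and "telly" contribute with a
    // reduced per-term boost instead of being scored as perfect synonyms.
    return new SynonymQuery.Builder("body")
        .addTerm(new Term("body", "television"))      // implicit boost of 1.0
        .addTerm(new Term("body", "tv"), 0.8f)
        .addTerm(new Term("body", "telly"), 0.5f)
        .build();
  }
}
{code}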



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8652) Add boosting support in the SynonymQuery

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790370#comment-16790370
 ] 

ASF subversion and git services commented on LUCENE-8652:
-

Commit b2c83de361403c646bac837541788cbe8add73ca in lucene-solr's branch 
refs/heads/master from Jim Ferenczi
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b2c83de ]

LUCENE-8652: remove unused import


> Add boosting support in the SynonymQuery
> 
>
> Key: LUCENE-8652
> URL: https://issues.apache.org/jira/browse/LUCENE-8652
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Jim Ferenczi
>Priority: Minor
> Fix For: master (9.0), 8.1
>
> Attachments: LUCENE-8652.patch, LUCENE-8652.patch
>
>
> The SynonymQuery tries to score multiple terms as if you had indexed them as 
> one term.
> This is good for "true" synonyms where each term should have the same 
> contribution to the final score but this doesn't handle the case where terms 
> have different weights. For scoring purposes it would be nice to be able to 
> assign a boost per term that we could multiply with the term's document 
> frequency in order to take into account the importance of the term within the 
> synonym list.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8631) How Nori Tokenizer can deal with Longest-Matching

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790369#comment-16790369
 ] 

ASF subversion and git services commented on LUCENE-8631:
-

Commit b1f870a4164769df62b24af63048aa2f9b21af47 in lucene-solr's branch 
refs/heads/master from Yeongsu Kim
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b1f870a ]

LUCENE-8631: The Korean user dictionary now picks the longest-matching word and 
discards the other matches.


> How Nori Tokenizer can deal with Longest-Matching
> -
>
> Key: LUCENE-8631
> URL: https://issues.apache.org/jira/browse/LUCENE-8631
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Yeongsu Kim
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I think... Nori tokenizer has one issue. 
> I don’t understand why “Longest-Matching” is NOT working with the Nori tokenizer 
> via config mode (config mode: 
> [https://www.elastic.co/guide/en/elasticsearch/plugins/6.x/analysis-nori-tokenizer.html).]
>  
> Here is an example for explaining what is longest-matching.
> Let's assume we have `userdict_ko.txt` including only three Korean single words 
> such as ‘골드’, ‘브라운’, ‘골드브라운’, and save it to the Nori analyzer. After updating, we 
> can see that it outputs two tokens such as ‘골드’ and ‘브라운’, when the input is 
> ‘골드브라운’. (In English: ‘골드’ means ‘gold’, ‘브라운’ means ‘brown’, and ‘골드브라운’ 
> means ‘goldbrown’)
>  
> With this result, we recognize that “Longest-Matching” is NOT working. If 
> “Longest-Matching” is working, the output must be ‘골드브라운’, which is the 
> longest matching word in the user dictionary.
>  
> Curiously enough, when we add user dictionary via custom mode (custom mode: 
> [https://github.com/jimczi/nori/blob/master/how-to-custom-dict.asciidoc|https://apac01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjimczi%2Fnori%2Fblob%2Fmaster%2Fhow-to-custom-dict.asciidoc=02%7C01%7Chigh_yeongsu%40wemakeprice.com%7C6953d739414e4da5ad1408d67473a6fe%7C6322d5f522044e9d9ca6d18828a04daf%7C0%7C0%7C636824437418170758=5iuNvKr8WJCXlCkJQrf5r3BgDVnF5hpG7l%2BQL0Ok7Aw%3D=0]),
>  we found the result is ‘골드브라운’, where ‘Longest-Matching’ is applied. We 
> think the reason is that the trained MeCab engine automatically generates word 
> costs by its own criteria. We hope this mechanism is also applied to config 
> mode.
>  
> Would you tell me how to get “Longest-Matching” via config mode (not custom), 
> or give me some hints (e.g. where to modify the source code) to solve this 
> problem?
>  
> P.S
> Recently, I've mailed to [~jim.ferenczi], who is a developer of Nori, and 
> received his suggestions:
>    - Add a way to set a score to each new rule (this way you could set up a 
> negative cost for the compound word that is less than the sum of the two 
> single words).
>    - Same as above but the cost is computed from the statistics of the 
> training (like the custom dictionary does when you recompile entirely).
>    - Implement longest-match first in the dictionary.
>  
> Thanks for your support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13283) when facet with certain terms via {!terms}, facet.limit, facet.offset does not work

2019-03-12 Thread superman369 (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790348#comment-16790348
 ] 

superman369 edited comment on SOLR-13283 at 3/12/19 8:52 AM:
-

@[~munendrasn], thanks for your reply! I ran into the following need. For 
example: I facet on a multivalued field with \{!terms}, so the facet result includes all 
the multivalued field values, 

but not all of them are required. So I need to pass field values in via 
\{!terms} in order to filter the facet values. For example, for the multivalued field 
"location.authorIds":["1","2","3","4"]

I need to facet on the field (location.authorIds), but I want to filter to 
location.authorIds = "1" and location.authorIds = "2"; the others are not required.

So I do it like this: 

 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=**:**=count
 .

if \{!terms=""} 

includes many values, I need to paginate with facet.limit and facet.offset. 
Could you understand me? Thank you very much.

 

 


was (Author: superman369):
@[~munendrasn], thanks for your reply!    I  encountered such a demand. For 
example : I use  \{!terms} multivalued field facet,so facet result include all 
the multivalued field values, 

among them are not required。 So I need to Incoming field values with   
\{!terms} , in order to filter facet value.  For example:   mutivalued field:   
"location.authorIds":["1","2","3","4"]

I need to facet with the field(location.authorIds) , but I want to  filter  
location.authorIds = "1" and location.authorIds = "2" , others not required.

So I do it like this: 

 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

if \{!terms=""} 

include many values, I need to  pagination with facet.limit and facet.offset.   
 Could you understand me? Thank you very much.

 

 

> when facet with certain terms via {!terms}, facet.limit, facet.offset does 
> not work
> ---
>
> Key: SOLR-13283
> URL: https://issues.apache.org/jira/browse/SOLR-13283
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.7
>Reporter: superman369
>Priority: Major
>
> h4.  I did a test in Solr 7.7. Limiting a field facet to certain terms via 
> \{!terms} works when using facet.sort=count, but facet.limit and facet.offset do 
> not work. What's wrong?
> h4. for example: 
> facet.field=\{!terms='1453,1452,1248'}location.authorIds=true=1=1,
>   facet result  : "location.authorIds" has 3 items.  I think it should  have 
> one item. 
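
For reference, a hedged SolrJ sketch of the request shape being described (the 
Solr base URL and collection name are placeholders; the field and term values 
follow the example above):

{code:java}
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TermsFacetPagingExample {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
      SolrQuery q = new SolrQuery("*:*");
      q.setFacet(true);
      // Restrict the facet to specific terms with the {!terms} local parameter.
      q.addFacetField("{!terms='1453,1452,1248'}location.authorIds");
      q.set("facet.sort", "count");  // reported to be honored
      q.set("facet.limit", 1);       // reported to have no effect together with {!terms}
      q.set("facet.offset", 1);      // reported to have no effect together with {!terms}
      QueryResponse rsp = client.query(q);
      List<FacetField> facets = rsp.getFacetFields();
      facets.forEach(f -> System.out.println(f.getName() + " -> " + f.getValues()));
    }
  }
}
{code}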



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13283) when facet with certain terms via {!terms}, facet.limit, facet.offset does not work

2019-03-12 Thread superman369 (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790348#comment-16790348
 ] 

superman369 edited comment on SOLR-13283 at 3/12/19 8:51 AM:
-

@[~munendrasn], thanks for your reply! I ran into the following need. For 
example: I facet on a multivalued field with \{!terms}, so the facet result includes all 
the multivalued field values, 

but not all of them are required. So I need to pass field values in via 
\{!terms} in order to filter the facet values. For example, for the multivalued field 
"location.authorIds":["1","2","3","4"]

I need to facet on the field (location.authorIds), but I want to filter to 
location.authorIds = "1" and location.authorIds = "2"; the others are not required.

So I do it like this: 

 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

if \{!terms=""} 

includes many values, I need to paginate with facet.limit and facet.offset. 
Could you understand me? Thank you very much.

 

 


was (Author: superman369):
@[~munendrasn], thanks for your reply!    I  encountered such a demand. For 
example : I use  \{!terms} multivalued field facet,so facet result include all 
the multivalued field values, 

among them are not required。 So I need to Incoming field values with   
\{!terms} , in order to filter facet value.  For example:   mutivalued field:   
"location.authorIds":["1","2","3","4"]

I need to facet with the field(location.authorIds) , but I want to  filter  
location.authorIds = "1" and location.authorIds = "2" , others not required.

So I do it like this: 

 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

if {!terms=""}include many values, I need to  pagination with facet.limit and 
facet.offset.    Could you understand me? Thank you very much.

 

 

> when facet with certain terms via {!terms}, facet.limit, facet.offset does 
> not work
> ---
>
> Key: SOLR-13283
> URL: https://issues.apache.org/jira/browse/SOLR-13283
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.7
>Reporter: superman369
>Priority: Major
>
> h4.  I did a test in Solr 7.7. Limiting a field facet to certain terms via 
> \{!terms} works when using facet.sort=count, but facet.limit and facet.offset do 
> not work. What's wrong?
> h4. for example: 
> facet.field=\{!terms='1453,1452,1248'}location.authorIds=true=1=1,
>   facet result  : "location.authorIds" has 3 items.  I think it should  have 
> one item. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13315) Possible SolrIndexSearcher leak through LogWatcher and FunctionScoreQuery

2019-03-12 Thread Alan Woodward (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790358#comment-16790358
 ] 

Alan Woodward commented on SOLR-13315:
--

I think the error here is in WrappedDoubleValuesSource (again - this has been a 
pain to get right!).  DVS.rewrite() is called by 
FunctionScoreQuery#createWeight() and the returned IndexSearcher-specific 
reference is not stored on the query.  Most searcher-specific implementations 
return new objects, but WrappedDoubleValuesSource.rewrite() returns itself 
(after caching the Searcher reference), which is where the leak occurs.  I'll 
prepare a patch.
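
In outline, the pattern being described looks like the following (a minimal 
sketch under the explanation above, not the actual SOLR-13315 patch): rewrite() 
hands the searcher to a fresh instance instead of caching it on the object that 
the long-lived query, and therefore the log buffer, keeps referencing.

{code:java}
import java.io.IOException;
import org.apache.lucene.search.DoubleValuesSource;
import org.apache.lucene.search.IndexSearcher;

// Minimal sketch of the fix pattern, not the actual patch.
abstract class SearcherBoundValuesSource extends DoubleValuesSource {

  @Override
  public DoubleValuesSource rewrite(IndexSearcher searcher) throws IOException {
    // Leak-prone variant (what the report describes):
    //   this.searcher = searcher; return this;
    // Safer variant: return a new object that carries the per-search state, so
    // the query instance itself never holds on to the IndexSearcher.
    return boundTo(searcher);
  }

  /** Creates a fresh instance bound to the given searcher. */
  protected abstract DoubleValuesSource boundTo(IndexSearcher searcher) throws IOException;
}
{code}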

> Possible SolrIndexSearcher leak through LogWatcher and FunctionScoreQuery
> -
>
> Key: SOLR-13315
> URL: https://issues.apache.org/jira/browse/SOLR-13315
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.5
>Reporter: Yury Pakhomov
>Priority: Major
> Attachments: path_to_gc_root_from_heap_dump.png
>
>
> Here is a possible leak of SolrIndexSearcher, which prevents unused searchers 
> from being reclaimed by GC.
> This problem was found after analyzing a heap dump which was created before a 
> Full GC.
> 1) Where the unused ref to SolrIndexSearcher is stored.
> Log4j2Watcher implements LogWatcher
>  and has a CircularList history inherited from LogWatcher.
> The history can store a Log4jLogEvent which can hold a ref to 
> ParameterizedMessage,
>  and ParameterizedMessage stores refs to all arguments of the log event. (Here we 
> can store objects which are no longer in use, directly or indirectly.)
> 2) How SolrIndexSearcher can be indirectly reached through this log buffer.
> If during FunctionScoreQuery execution an ExitingReaderException("The request 
> took too long to iterate over terms. Timeout: " ..) is thrown, the query 
> will be logged at warn level and its ref will be stored in Log4j2Watcher.
>  (It can be any exception which causes this query to be logged to Log4j2Watcher.)
> In general this should be ok, but in this case FunctionScoreQuery indirectly 
> stores a ref to SolrIndexSearcher.
> As a result we have refs to already closed searchers which are no longer in 
> use.
>  The searcher has refs to caches (docs, filters, results ...) and they cannot be 
> reclaimed by GC.
> 3) How SolrIndexSearcher can be accessed through FunctionScoreQuery.
> There is a FunctionScoreQuery which can hold a ref to 
> MultiplicativeBoostValuesSource,
>  which can hold a ref to WrappedDoubleValuesSource,
>  and the last one can hold a ref to SolrIndexSearcher.
> public final class FunctionScoreQuery extends Query
> { ... private final DoubleValuesSource source; ... }
> private static class MultiplicativeBoostValuesSource extends 
> DoubleValuesSource
> { private final DoubleValuesSource boost; ... }
> private static class WrappedDoubleValuesSource extends DoubleValuesSource
> { private final ValueSource in; private IndexSearcher searcher; ... }
> Actually, any DoubleValuesSource implementation which stores a ref to the 
> IndexSearcher passed to
> public abstract DoubleValuesSource rewrite(IndexSearcher reader) throws 
> IOException;
> can cause a problem if it is logged via Log4j2Watcher.
> 4) How to temporarily work around this problem:
>  It is possible to disable Log4j2Watcher in solr.xml.
> 5) How to fix this issue in a more reliable way?
>  I think that it is very dangerous to buffer refs to log message arguments,
>  and maybe Log4j2Watcher should be reworked to avoid buffering refs, but 
> LoggingHandler depends on Log4j2Watcher.
> But maybe there are better ways to solve this issue.
> Path to gc root is attached.
>  !path_to_gc_root_from_heap_dump.png|width=811,height=387!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13283) when facet with certain terms via {!terms}, facet.limit, facet.offset does not work

2019-03-12 Thread superman369 (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790348#comment-16790348
 ] 

superman369 edited comment on SOLR-13283 at 3/12/19 8:22 AM:
-

@[~munendrasn], thanks for your reply! I ran into the following need. For 
example: I facet on a multivalued field with \{!terms}, so the facet result includes all 
the multivalued field values, 

but not all of them are required. So I need to pass field values in via 
\{!terms} in order to filter the facet values. For example, for the multivalued field 
"location.authorIds":["1","2","3","4"]

I need to facet on the field (location.authorIds), but I want to filter to 
location.authorIds = "1" and location.authorIds = "2"; the others are not required.

So I do it like this: 

 
[facet.field=|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

if {!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Could you understand me? Thank you very much!

 

 


was (Author: superman369):
@[~munendrasn], thanks for your reply!    I  encountered such a demand. For 
example : I use  \{!terms} multivalued field facet,so facet result include all 
the multivalued field values, 

among them are not required。 So I need to Incoming field values with   
\{!terms} , in order to filter facet value.  For example:   mutivalued field:   
"location.authorIds":["1","2","3","4"]

I need to facet with the field(location.authorIds) , but I want to  filter  
location.authorIds = "1" and location.authorIds = "2" , others not required.

So I do it like this:   
[facet.field=|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

if \{!terms=""} include many values, I need to  pagination with facet.limit and 
facet.offset.    Could you understand me? Thank you very much!

 

 

> when facet with certain terms via {!terms}, facet.limit, facet.offset does 
> not work
> ---
>
> Key: SOLR-13283
> URL: https://issues.apache.org/jira/browse/SOLR-13283
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.7
>Reporter: superman369
>Priority: Major
>
> h4.  I did a test in Solr 7.7. Limiting a field facet to certain terms via 
> \{!terms} works when using facet.sort=count, but facet.limit and facet.offset do 
> not work. What's wrong?
> h4. for example: 
> facet.field=\{!terms='1453,1452,1248'}location.authorIds=true=1=1,
>   facet result  : "location.authorIds" has 3 items.  I think it should  have 
> one item. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8652) Add boosting support in the SynonymQuery

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790356#comment-16790356
 ] 

ASF subversion and git services commented on LUCENE-8652:
-

Commit d01dc484e93ada6fbe26dbb9f8ad2616c56bdf76 in lucene-solr's branch 
refs/heads/branch_8x from Jim Ferenczi
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d01dc48 ]

LUCENE-8652: ensure that the norm doesn't influence the score in 
TestSynonymQuery#testBoosts


> Add boosting support in the SynonymQuery
> 
>
> Key: LUCENE-8652
> URL: https://issues.apache.org/jira/browse/LUCENE-8652
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Jim Ferenczi
>Priority: Minor
> Fix For: master (9.0), 8.1
>
> Attachments: LUCENE-8652.patch, LUCENE-8652.patch
>
>
> The SynonymQuery tries to score multiple terms as if you had indexed them as 
> one term.
> This is good for "true" synonyms where each term should have the same 
> contribution to the final score but this doesn't handle the case where terms 
> have different weights. For scoring purposes it would be nice to be able to 
> assign a boost per term that we could multiply with the term's document 
> frequency in order to take into account the importance of the term within the 
> synonym list.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8652) Add boosting support in the SynonymQuery

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790355#comment-16790355
 ] 

ASF subversion and git services commented on LUCENE-8652:
-

Commit c87e7614f1634d721e8328ef9c2f49ab7be911e6 in lucene-solr's branch 
refs/heads/master from Jim Ferenczi
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c87e761 ]

LUCENE-8652: ensure that the norm doesn't influence the score in 
TestSynonymQuery#testBoosts


> Add boosting support in the SynonymQuery
> 
>
> Key: LUCENE-8652
> URL: https://issues.apache.org/jira/browse/LUCENE-8652
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Jim Ferenczi
>Priority: Minor
> Fix For: master (9.0), 8.1
>
> Attachments: LUCENE-8652.patch, LUCENE-8652.patch
>
>
> The SynonymQuery tries to score multiple terms as if you had indexed them as 
> one term.
> This is good for "true" synonyms, where each term should have the same 
> contribution to the final score, but it doesn't handle the case where terms 
> have different weights. For scoring purposes it would be nice to be able to 
> assign a boost per term that we could multiply with the term's document 
> frequency in order to take into account the importance of the term within the 
> synonym list.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (LUCENE-8713) Add Line2D tests

2019-03-12 Thread Ignacio Vera (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera reassigned LUCENE-8713:


   Resolution: Fixed
 Assignee: Ignacio Vera
Fix Version/s: master (9.0)
   8.x

> Add Line2D tests
> 
>
> Key: LUCENE-8713
> URL: https://issues.apache.org/jira/browse/LUCENE-8713
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Minor
> Fix For: 8.x, master (9.0)
>
> Attachments: LUCENE-8713.patch, LUCENE-8713.patch
>
>
> Line2D does not have specific tests. This issue will add them.
> Actually, when developing the tests, I realised that during the refactoring of 
> this class in LUCENE-8680 a bug was introduced. The patch fixes that as well.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13283) when facet with certain terms via {!terms}, facet.limit, facet.offset does not work

2019-03-12 Thread superman369 (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790348#comment-16790348
 ] 

superman369 edited comment on SOLR-13283 at 3/12/19 8:26 AM:
-

@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If {!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much.

 

 


was (Author: superman369):
@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If \{!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!

 

 

> when facet with certain terms via {!terms}, facet.limit, facet.offset does 
> not work
> ---
>
> Key: SOLR-13283
> URL: https://issues.apache.org/jira/browse/SOLR-13283
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.7
>Reporter: superman369
>Priority: Major
>
> h4. I did a test in Solr 7.7. Limiting a field facet to certain terms via 
> \{!terms} works with facet.sort=count, but facet.limit and facet.offset do 
> not work. What's wrong?
> h4. for example: 
> facet.field=\{!terms='1453,1452,1248'}location.authorIds=true=1=1,
> the facet result: "location.authorIds" has 3 items. I think it should have 
> one item. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13283) when facet with certain terms via {!terms}, facet.limit, facet.offset does not work

2019-03-12 Thread superman369 (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790348#comment-16790348
 ] 

superman369 edited comment on SOLR-13283 at 3/12/19 8:25 AM:
-

@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If \{!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!

 

 


was (Author: superman369):
@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If \{!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!

 

 

> when facet with certain terms via {!terms}, facet.limit, facet.offset does 
> not work
> ---
>
> Key: SOLR-13283
> URL: https://issues.apache.org/jira/browse/SOLR-13283
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.7
>Reporter: superman369
>Priority: Major
>
> h4. I did a test in Solr 7.7. Limiting a field facet to certain terms via 
> \{!terms} works with facet.sort=count, but facet.limit and facet.offset do 
> not work. What's wrong?
> h4. for example: 
> facet.field=\{!terms='1453,1452,1248'}location.authorIds=true=1=1,
> the facet result: "location.authorIds" has 3 items. I think it should have 
> one item. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8721) LatLonShapePolygon and LineQuery fail on shared dateline queries

2019-03-12 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790349#comment-16790349
 ] 

Adrien Grand commented on LUCENE-8721:
--

Should we check whether there is an intersection between the range of latitudes 
of the query and of the box/triangle? The patch seems to only look at 
longitudes, which means it might return CROSSES when it could return DISJOINT 
if latitudes don't overlap?

Unrelated to your change, but I find the relate logic a bit hard to follow due 
to how it's split between EdgeTree and its subclasses via 3 methods: relate, 
internalComponentRelate and componentRelate.
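
For readers following along, a minimal sketch of the kind of latitude check 
being suggested above (purely illustrative names and signature, not the actual 
EdgeTree code):

{code:java}
// Quick reject on latitude ranges: if the query's latitude interval and the
// box/triangle's latitude interval do not overlap at all, the relation can be
// reported as DISJOINT without consulting longitudes.
static boolean latitudesDisjoint(double queryMinLat, double queryMaxLat,
                                 double minLat, double maxLat) {
  return queryMaxLat < minLat || queryMinLat > maxLat;
}
{code}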

> LatLonShapePolygon and LineQuery fail on shared dateline queries
> 
>
> Key: LUCENE-8721
> URL: https://issues.apache.org/jira/browse/LUCENE-8721
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Nicholas Knize
>Priority: Major
> Attachments: LUCENE-8721.patch
>
>
> Indexed shapes should be returned with search geometries that share the 
> dateline on the opposite hemisphere. 
> For example:
> {code:java}
>   public void testSharedDateline() throws Exception {
> // index
> Directory dir = newDirectory();
> RandomIndexWriter w = new RandomIndexWriter(random(), dir);
> Document doc = new Document();
> // index western hemisphere geometry
> Polygon indexPoly = new Polygon(
> new double[] {-7.5d, 15d, 15d, 0d, -7.5d},
> new double[] {-180d, -180d, -176d, -176d, -180d}
> );
> Field[] fields = LatLonShape.createIndexableFields("test", indexPoly);
> for (Field f : fields) {
>   doc.add(f);
> }
> w.addDocument(doc);
> w.forceMerge(1);
> // search
> IndexReader reader = w.getReader();
> w.close();
> IndexSearcher searcher = newSearcher(reader);
> // search w/ eastern hemisphere geometry that shares the dateline
> Polygon searchPoly = new Polygon(new double[] {-7.5d, 15d, 15d, 0d, 
> -7.5d},
> new double[] {180d, 180d, 170d, 170d, 180d});
> Query q = LatLonShape.newPolygonQuery("test", QueryRelation.INTERSECTS, 
> searchPoly);
> assertEquals(1, searcher.count(q));
> IOUtils.close(w, reader, dir);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13283) when facet with certain terms via {!terms}, facet.limit, facet.offset does not work

2019-03-12 Thread superman369 (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790348#comment-16790348
 ] 

superman369 edited comment on SOLR-13283 at 3/12/19 8:24 AM:
-

@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If {!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!

 

 


was (Author: superman369):
@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
[facet.field=|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If \{!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!

 

 

> when facet with certain terms via {!terms}, facet.limit, facet.offset does 
> not work
> ---
>
> Key: SOLR-13283
> URL: https://issues.apache.org/jira/browse/SOLR-13283
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.7
>Reporter: superman369
>Priority: Major
>
> h4. I did a test in Solr 7.7. Limiting a field facet to certain terms via 
> \{!terms} works with facet.sort=count, but facet.limit and facet.offset do 
> not work. What's wrong?
> h4. for example: 
> facet.field=\{!terms='1453,1452,1248'}location.authorIds=true=1=1,
> the facet result: "location.authorIds" has 3 items. I think it should have 
> one item. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13283) when facet with certain terms via {!terms}, facet.limit, facet.offset does not work

2019-03-12 Thread superman369 (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790348#comment-16790348
 ] 

superman369 edited comment on SOLR-13283 at 3/12/19 8:24 AM:
-

@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If \{!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!

 

 


was (Author: superman369):
@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
 [facet.field 
|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]= 
\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If {!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!

 

 

> when facet with certain terms via {!terms}, facet.limit, facet.offset does 
> not work
> ---
>
> Key: SOLR-13283
> URL: https://issues.apache.org/jira/browse/SOLR-13283
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.7
>Reporter: superman369
>Priority: Major
>
> h4. I did a test in Solr 7.7. Limiting a field facet to certain terms via 
> \{!terms} works with facet.sort=count, but facet.limit and facet.offset do 
> not work. What's wrong?
> h4. for example: 
> facet.field=\{!terms='1453,1452,1248'}location.authorIds=true=1=1,
> the facet result: "location.authorIds" has 3 items. I think it should have 
> one item. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13283) when facet with certain terms via {!terms}, facet.limit, facet.offset does not work

2019-03-12 Thread superman369 (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790348#comment-16790348
 ] 

superman369 edited comment on SOLR-13283 at 3/12/19 8:23 AM:
-

@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
[facet.field=|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If \{!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!

 

 


was (Author: superman369):
@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
[facet.field=|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]\{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If {!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!

 

 

> when facet with certain terms via {!terms}, facet.limit, facet.offset does 
> not work
> ---
>
> Key: SOLR-13283
> URL: https://issues.apache.org/jira/browse/SOLR-13283
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.7
>Reporter: superman369
>Priority: Major
>
> h4. I did a test in Solr 7.7. Limiting a field facet to certain terms via 
> \{!terms} works with facet.sort=count, but facet.limit and facet.offset do 
> not work. What's wrong?
> h4. for example: 
> facet.field=\{!terms='1453,1452,1248'}location.authorIds=true=1=1,
> the facet result: "location.authorIds" has 3 items. I think it should have 
> one item. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13283) when facet with certain terms via {!terms}, facet.limit, facet.offset does not work

2019-03-12 Thread superman369 (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790348#comment-16790348
 ] 

superman369 commented on SOLR-13283:


@[~munendrasn], thanks for your reply! I ran into the following requirement: when 
I facet on a multivalued field with \{!terms}, the facet result includes all of 
the field's values, but not all of them are needed. So I want to pass specific 
field values to \{!terms} in order to filter the facet values. For example, take 
the multivalued field "location.authorIds":["1","2","3","4"].

I need to facet on the field (location.authorIds), but I only want 
location.authorIds = "1" and location.authorIds = "2"; the others are not 
required.

So I do it like this:
[facet.field=|http://192.168.106.62:8219/hljzyydxyf-solr/search/select?facet.field=]{!terms="1,2"}location.authorIds=on=search.resourcetype:2=*:*=0=count
 .

If \{!terms=""} includes many values, I need to paginate with facet.limit and 
facet.offset. Does that make sense? Thank you very much!
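
For the record, a hedged SolrJ sketch of the request described above (the base 
URL and core name are placeholders rather than the reporter's internal 
endpoint; the facet parameters mirror the comment):

{code:java}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TermsFacetSketch {
  public static void main(String[] args) throws Exception {
    try (SolrClient client =
             new HttpSolrClient.Builder("http://localhost:8983/solr/search").build()) {
      SolrQuery query = new SolrQuery("*:*");
      query.addFilterQuery("search.resourcetype:2");
      query.setRows(0);
      query.setFacet(true);
      // Restrict the facet to the two author ids, as described above.
      query.add("facet.field", "{!terms=\"1,2\"}location.authorIds");
      query.setFacetSort("count");
      // The point of this issue: these two reportedly do not take effect
      // when {!terms} is present.
      query.setFacetLimit(1);
      query.set("facet.offset", 1);
      QueryResponse rsp = client.query(query);
      System.out.println(rsp.getFacetField("location.authorIds").getValues());
    }
  }
}
{code}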

 

 

> when facet with certain terms via {!terms}, facet.limit, facet.offset does 
> not work
> ---
>
> Key: SOLR-13283
> URL: https://issues.apache.org/jira/browse/SOLR-13283
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.7
>Reporter: superman369
>Priority: Major
>
> h4. I did a test in Solr 7.7. Limiting a field facet to certain terms via 
> \{!terms} works with facet.sort=count, but facet.limit and facet.offset do 
> not work. What's wrong?
> h4. for example: 
> facet.field=\{!terms='1453,1452,1248'}location.authorIds=true=1=1,
> the facet result: "location.authorIds" has 3 items. I think it should have 
> one item. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8713) Add Line2D tests

2019-03-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790346#comment-16790346
 ] 

ASF subversion and git services commented on LUCENE-8713:
-

Commit 31809b3e27eddf353d59f8b9820f85deb552ef96 in lucene-solr's branch 
refs/heads/branch_8x from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=31809b3 ]

LUCENE-8713: Add Line2D tests


> Add Line2D tests
> 
>
> Key: LUCENE-8713
> URL: https://issues.apache.org/jira/browse/LUCENE-8713
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Ignacio Vera
>Priority: Minor
> Attachments: LUCENE-8713.patch, LUCENE-8713.patch
>
>
> Line2D does not have specific tests. This issue will add them.
> Actually, when developing the tests, I realised that during the refactoring of 
> this class in LUCENE-8680 a bug was introduced. The patch fixes that as well.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


