[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose

2020-08-18 Thread Roman (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180202#comment-17180202
 ] 

Roman commented on LUCENE-8776:
---

[~mgibney] now it's my turn to apologize, all the time I have missed that 
WordDelimiter factory was involved. Well, the problems shown in my two examples 
above are real (the positionLength spans 4 tokens that don't belong together), 
but I now have to question my understanding of everything else. See my email to 
[~mikemccand] from few moments ago (the word-delimiter factory is involved; to 
my slight relief - that's a stock solr version)

 

from the email:

 

Hi Mike,

I'm sorry, the problem all the time is inside related to a word-delimiter 
filter factory. This is embarrassing but I have to admit publicly and 
self-flagellate. 

A word-delimiter filter is used to split tokens, these then are used to find 
multi-token synonyms (hence the connection). In my desire to simplify, I have 
omitted that detail while writing my first email. 

I went to generate the stack trace:

```
{code:java}
assertU(adoc("id", "603", "bibcode", "xx603",
        "title", "THE HUBBLE constant: a summary of the HUBBLE SPACE TELESCOPE 
program"));```{code}
 
{code:java}
stage:indexer term=xx603 pos=1 type=word offsetStart=0 offsetEnd=13
stage:indexer term=acr::the pos=1 type=ACRONYM offsetStart=0 offsetEnd=3
stage:indexer term=hubble pos=1 type=word offsetStart=4 offsetEnd=10
stage:indexer term=acr::hubble pos=0 type=ACRONYM offsetStart=4 offsetEnd=10
stage:indexer term=constant pos=1 type=word offsetStart=11 offsetEnd=20
stage:indexer term=summary pos=1 type=word offsetStart=23 offsetEnd=30
stage:indexer term=hubble pos=1 type=word offsetStart=38 offsetEnd=44
stage:indexer term=syn::hubble space telescope pos=0 type=SYNONYM 
offsetStart=38 offsetEnd=60
stage:indexer term=syn::hst pos=0 type=SYNONYM offsetStart=38 offsetEnd=60
stage:indexer term=space pos=1 type=word offsetStart=45 offsetEnd=50
stage:indexer term=telescope pos=1 type=word offsetStart=51 offsetEnd=60
stage:indexer term=program pos=1 type=word offsetStart=61 offsetEnd=68{code}


that worked, only the next one failed:



{code:java}
assertU(adoc("id", "605", "bibcode", "xx604",
        "title", "MIT and anti de sitter space-time"));{code}




{code:java}
stage:indexer term=xx604 pos=1 type=word offsetStart=0 offsetEnd=13
stage:indexer term=mit pos=1 type=word offsetStart=0 offsetEnd=3
stage:indexer term=acr::mit pos=0 type=ACRONYM offsetStart=0 offsetEnd=3
stage:indexer term=syn::massachusetts institute of technology pos=0 
type=SYNONYM offsetStart=0 offsetEnd=3
stage:indexer term=syn::mit pos=0 type=SYNONYM offsetStart=0 offsetEnd=3
stage:indexer term=anti pos=1 type=word offsetStart=8 offsetEnd=12
stage:indexer term=syn::ads pos=0 type=SYNONYM offsetStart=8 offsetEnd=28
stage:indexer term=syn::anti de sitter space pos=0 type=SYNONYM offsetStart=8 
offsetEnd=28
stage:indexer term=syn::antidesitter spacetime pos=0 type=SYNONYM offsetStart=8 
offsetEnd=28
stage:indexer term=de pos=1 type=word offsetStart=13 offsetEnd=15
stage:indexer term=sitter pos=1 type=word offsetStart=16 offsetEnd=22
stage:indexer term=space pos=1 type=word offsetStart=23 offsetEnd=28
stage:indexer term=time pos=1 type=word offsetStart=29 offsetEnd=33
stage:indexer term=spacetime pos=0 type=word offsetStart=23 offsetEnd=33{code}
 

stacktrace:

 
{code:java}
325677 ERROR 
(TEST-TestAdsabsTypeFulltextParsing.testNoSynChain-seed#[ADFAB495DA8F6F40]) [   
 ] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Exception 
writing document id 605 to the index; possible analysis error: startOffset must 
be non-negative, and endOffset must be >= startOffset, and offsets must not go 
backwards startOffset=23,endOffset=33,lastStartOffset=29 for field 'title'
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:242)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:1002)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:1233)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$2(DistributedUpdateProcessor.java:1082)
at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:50)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1082)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:694)
at 
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
at 

[jira] [Updated] (SOLR-14660) Migrating HDFS into a package

2020-08-18 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14660:
--
Labels: package packagemanager  (was: packagemanager)

> Migrating HDFS into a package
> -
>
> Key: SOLR-14660
> URL: https://issues.apache.org/jira/browse/SOLR-14660
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: package, packagemanager
>
> Following up on the deprecation of HDFS (SOLR-14021), we need to work on 
> isolating it away from Solr core and making a package for this. This issue is 
> to track the efforts for that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14660) Migrating HDFS into a package

2020-08-18 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14660:
--
Labels: packagemanager  (was: )

> Migrating HDFS into a package
> -
>
> Key: SOLR-14660
> URL: https://issues.apache.org/jira/browse/SOLR-14660
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: packagemanager
>
> Following up on the deprecation of HDFS (SOLR-14021), we need to work on 
> isolating it away from Solr core and making a package for this. This issue is 
> to track the efforts for that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14750) Harden TestBulkSchemaConcurrent

2020-08-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180191#comment-17180191
 ] 

ASF subversion and git services commented on SOLR-14750:


Commit 0f88cce8418f8a65b1643951a43fff66722f9547 in lucene-solr's branch 
refs/heads/branch_8x from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0f88cce ]

SOLR-14750: TestBulkSchemaConcurrent fails often(#1760)



> Harden TestBulkSchemaConcurrent
> ---
>
> Key: SOLR-14750
> URL: https://issues.apache.org/jira/browse/SOLR-14750
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Erick Erickson
>Assignee: Noble Paul
>Priority: Major
> Attachments: SOLR-14750.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This test has been failing quite often lately. I poked around a bit and see 
> what I _think_ is evidence of a race condition in CoreContainer.reload where 
> a reload on the same core is happening from two places in close succession. 
> I'll attach a preliminary patch soon.
> Without this patch I had 25 failures out of 1,000 runs, with it 0.
> I consider this patch a WIP, putting up for comment. Well, it has nocommits 
> so... But In particular, I have to review some changes I made about which 
> name we're using for PendingCoreOps. I also want to back out my changes and 
> beast it again with some more logging to see if I can nail down that multiple 
> reloads are happening before declaring victory.
> What this does is put the name of the core we're reloading in pendingCoreOps 
> earlier in the reload process. Then the second call to reload will wait until 
> the first is completed. I also restructured it a bit because I don't like if 
> clauses that go on forever and a small else clause way down the code. I 
> inverted the test and bailed out of the method rather than fall off the end 
> after the else clause.
> One thing I don't like about this is two reloads in such rapid succession 
> seems wasteful. Even so, I can imagine that one reload gets through far 
> enough to load the schema, then a schema update changes the schema _then_ 
> calls reload. So I don't think just returning if there's a reload happening 
> on that core already is valid.
> More to come.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14750) Harden TestBulkSchemaConcurrent

2020-08-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180190#comment-17180190
 ] 

ASF subversion and git services commented on SOLR-14750:


Commit 8caf57d50b4ee24e1e59b888fe42ffa2968d562a in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8caf57d ]

SOLR-14750: TestBulkSchemaConcurrent fails often(#1760)



> Harden TestBulkSchemaConcurrent
> ---
>
> Key: SOLR-14750
> URL: https://issues.apache.org/jira/browse/SOLR-14750
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Erick Erickson
>Assignee: Noble Paul
>Priority: Major
> Attachments: SOLR-14750.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This test has been failing quite often lately. I poked around a bit and see 
> what I _think_ is evidence of a race condition in CoreContainer.reload where 
> a reload on the same core is happening from two places in close succession. 
> I'll attach a preliminary patch soon.
> Without this patch I had 25 failures out of 1,000 runs, with it 0.
> I consider this patch a WIP, putting up for comment. Well, it has nocommits 
> so... But In particular, I have to review some changes I made about which 
> name we're using for PendingCoreOps. I also want to back out my changes and 
> beast it again with some more logging to see if I can nail down that multiple 
> reloads are happening before declaring victory.
> What this does is put the name of the core we're reloading in pendingCoreOps 
> earlier in the reload process. Then the second call to reload will wait until 
> the first is completed. I also restructured it a bit because I don't like if 
> clauses that go on forever and a small else clause way down the code. I 
> inverted the test and bailed out of the method rather than fall off the end 
> after the else clause.
> One thing I don't like about this is two reloads in such rapid succession 
> seems wasteful. Even so, I can imagine that one reload gets through far 
> enough to load the schema, then a schema update changes the schema _then_ 
> calls reload. So I don't think just returning if there's a reload happening 
> on that core already is valid.
> More to come.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] noblepaul merged pull request #1760: SOLR-14750: TestBulkSchemaConcurrent passes but schema plugins fail

2020-08-18 Thread GitBox


noblepaul merged pull request #1760:
URL: https://github.com/apache/lucene-solr/pull/1760


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread John Gallagher (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180102#comment-17180102
 ] 

John Gallagher commented on SOLR-14413:
---

Hi folks, sorry I stepped away from this for a bit, and thanks [~mdrob] , and 
[~epugh] for the comments, and [~bvd] for contributing!

 

To Mike's earlier point: when partialResults is true, you cannot be sure 
whether there are more results or not.  You could summarize it this way:
|| ||partialResults not present|| partialResults: true||
|more results?|nextCursorMark != cursorMark|unknown|

After figuring out how to properly generate documentation, I took a stab at 
adding two notes to the documentation:

1.  In timeAllowed documentation,  
common-query-parameters.html#timeallowed-parameter

!image-2020-08-18-16-56-59-178.png|width=508,height=116!

2. In the Constraints when using Cursors documentation, 
pagination-of-results.html#constraints-when-using-cursors
!image-2020-08-18-16-56-41-736.png|width=410,height=250!

Please let me know what you think of these additions.

 

I was also able to reproduce [~bvd]'s case where nextCursorMark was null (the 
field actually wasn't present in the response at all).  I tracked it down to 
[this code in 
SearchHandler.java|[https://github.com/slackhappy/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java#L349]].
  When the Search times out very early, an exception is thrown before responses 
can be handled, which skips the calculation of the next cursor mark.  In that 
case, the response is null, and the SearchHandler creates an empty response to 
return (numFound 0, docs:[]). 

 

I added code to also return the original cursorMark in that case, since we 
haven't progressed (it is the value you would want to pass to a subsequent 
search).

 

I added a basic test to CursorPagingTest that checks that the parameters can be 
used in conjunction, and that the nextCursorMark returned is a valid one that 
can be used in subsequent requests.  Perhaps that test can be combined with 
[~bvd]'s.  I struggled a bit to come up with a test that would produce reliable 
results when using timeAllowed.  I borrowed a bit from 
ExitableDirectoryReaderTest's use of a DelayingSearchComponent, which helped 
somewhat.

 

The updated patch file is SOLR-14413-jg-update1.patch, and I have updated the 
[corresponding PR|[https://github.com/apache/lucene-solr/pull/1436]].

 

John

> allow timeAllowed and cursorMark parameters
> ---
>
> Key: SOLR-14413
> URL: https://issues.apache.org/jira/browse/SOLR-14413
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: John Gallagher
>Priority: Minor
> Attachments: SOLR-14413-bram.patch, SOLR-14413-jg-update1.patch, 
> SOLR-14413.patch, image-2020-08-18-16-56-41-736.png, 
> image-2020-08-18-16-56-59-178.png, timeallowed_cursormarks_results.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ever since cursorMarks were introduced in SOLR-5463 in 2014, cursorMark and 
> timeAllowed parameters were not allowed in combination ("Can not search using 
> both cursorMark and timeAllowed")
> , from [QueryComponent.java|#L359]]:
>  
> {code:java}
>  
>  if (null != rb.getCursorMark() && 0 < timeAllowed) {
>   // fundamentally incompatible
>   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not 
> search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + 
> CommonParams.TIME_ALLOWED);
> } {code}
> While theoretically impure to use them in combination, it is often desirable 
> to support cursormarks-style deep paging and attempt to protect Solr nodes 
> from runaway queries using timeAllowed, in the hopes that most of the time, 
> the query completes in the allotted time, and there is no conflict.
>  
> However if the query takes too long, it may be preferable to end the query 
> and protect the Solr node and provide the user with a somewhat inaccurate 
> sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is 
> frequently used to prevent runaway load.  In fact, cursorMark and 
> shards.tolerant are allowed in combination, so any argument in favor of 
> purity would be a bit muddied in my opinion.
>  
> This was discussed once in the mailing list that I can find: 
> [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E]
>  It did not look like there was strong support for preventing the combination.
>  
> I have tested cursorMark and timeAllowed combination together, and even when 
> partial results are returned because the timeAllowed is exceeded, the 
> cursorMark response value is still valid and reasonable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose

2020-08-18 Thread Roman (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180098#comment-17180098
 ] 

Roman commented on LUCENE-8776:
---

Michael, I appreciate your thinking about the issue - it's always pleasure to 
meet somebody who changes his/her mind based on evidence. As far as it seems to 
me, the backward going offsets are indeed the 'only' problem – but 
unfortunately of some 'fundamental' category. It doesn't square nicely with the 
desire to efficiently store offsets using deltas.

I can reassure you that the problem doesn't appear on (only) first or lasts 
positions. The unittests linked fro my post have specific examples for that 
(even searching for phrases embedded/surrounded by other tokens)

The swapping - unless i'm missing something important - will not work (for 
multi-token situation). Because the next token is going to be checked against 
offsets of the last emitted token. So wherever the multi-token gets emitted (as 
first, last, or even in the middle) - it will trip the wire for the surrounding 
tokens (or these already emitted tokens will trip the wire for the multi-word 
token)

 

funnily enough, swapping 0th position could trigger another issue (but only for 
the very first token in the stream): 
[https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/DefaultIndexingChain.java#L897]

 

that the 

> Start offset going backwards has a legitimate purpose
> -
>
> Key: LUCENE-8776
> URL: https://issues.apache.org/jira/browse/LUCENE-8776
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.6
>Reporter: Ram Venkat
>Priority: Major
> Attachments: LUCENE-8776-proof-of-concept.patch
>
>
> Here is the use case where startOffset can go backwards:
> Say there is a line "Organic light-emitting-diode glows", and I want to run 
> span queries and highlight them properly. 
> During index time, light-emitting-diode is split into three words, which 
> allows me to search for 'light', 'emitting' and 'diode' individually. The 
> three words occupy adjacent positions in the index, as 'light' adjacent to 
> 'emitting' and 'light' at a distance of two words from 'diode' need to match 
> this word. So, the order of words after splitting are: Organic, light, 
> emitting, diode, glows. 
> But, I also want to search for 'organic' being adjacent to 
> 'light-emitting-diode' or 'light-emitting-diode' being adjacent to 'glows'. 
> The way I solved this was to also generate 'light-emitting-diode' at two 
> positions: (a) In the same position as 'light' and (b) in the same position 
> as 'glows', like below:
> ||organic||light||emitting||diode||glows||
> | |light-emitting-diode| |light-emitting-diode| |
> |0|1|2|3|4|
> The positions of the two 'light-emitting-diode' are 1 and 3, but the offsets 
> are obviously the same. This works beautifully in Lucene 5.x in both 
> searching and highlighting with span queries. 
> But when I try this in Lucene 7.6, it hits the condition "Offsets must not go 
> backwards" at DefaultIndexingChain:818. This IllegalArgumentException is 
> being thrown without any comments on why this check is needed. As I explained 
> above, startOffset going backwards is perfectly valid, to deal with word 
> splitting and span operations on these specialized use cases. On the other 
> hand, it is not clear what value is added by this check and which highlighter 
> code is affected by offsets going backwards. This same check is done at 
> BaseTokenStreamTestCase:245. 
> I see others talk about how this check found bugs in WordDelimiter etc. but 
> it also prevents legitimate use cases. Can this check be removed?  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread John Gallagher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gallagher updated SOLR-14413:
--
Attachment: SOLR-14413-jg-update1.patch

> allow timeAllowed and cursorMark parameters
> ---
>
> Key: SOLR-14413
> URL: https://issues.apache.org/jira/browse/SOLR-14413
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: John Gallagher
>Priority: Minor
> Attachments: SOLR-14413-bram.patch, SOLR-14413-jg-update1.patch, 
> SOLR-14413.patch, image-2020-08-18-16-56-41-736.png, 
> image-2020-08-18-16-56-59-178.png, timeallowed_cursormarks_results.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ever since cursorMarks were introduced in SOLR-5463 in 2014, cursorMark and 
> timeAllowed parameters were not allowed in combination ("Can not search using 
> both cursorMark and timeAllowed")
> , from [QueryComponent.java|#L359]]:
>  
> {code:java}
>  
>  if (null != rb.getCursorMark() && 0 < timeAllowed) {
>   // fundamentally incompatible
>   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not 
> search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + 
> CommonParams.TIME_ALLOWED);
> } {code}
> While theoretically impure to use them in combination, it is often desirable 
> to support cursormarks-style deep paging and attempt to protect Solr nodes 
> from runaway queries using timeAllowed, in the hopes that most of the time, 
> the query completes in the allotted time, and there is no conflict.
>  
> However if the query takes too long, it may be preferable to end the query 
> and protect the Solr node and provide the user with a somewhat inaccurate 
> sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is 
> frequently used to prevent runaway load.  In fact, cursorMark and 
> shards.tolerant are allowed in combination, so any argument in favor of 
> purity would be a bit muddied in my opinion.
>  
> This was discussed once in the mailing list that I can find: 
> [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E]
>  It did not look like there was strong support for preventing the combination.
>  
> I have tested cursorMark and timeAllowed combination together, and even when 
> partial results are returned because the timeAllowed is exceeded, the 
> cursorMark response value is still valid and reasonable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread John Gallagher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gallagher updated SOLR-14413:
--
Attachment: image-2020-08-18-16-56-59-178.png

> allow timeAllowed and cursorMark parameters
> ---
>
> Key: SOLR-14413
> URL: https://issues.apache.org/jira/browse/SOLR-14413
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: John Gallagher
>Priority: Minor
> Attachments: SOLR-14413-bram.patch, SOLR-14413.patch, 
> image-2020-08-18-16-56-41-736.png, image-2020-08-18-16-56-59-178.png, 
> timeallowed_cursormarks_results.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ever since cursorMarks were introduced in SOLR-5463 in 2014, cursorMark and 
> timeAllowed parameters were not allowed in combination ("Can not search using 
> both cursorMark and timeAllowed")
> , from [QueryComponent.java|#L359]]:
>  
> {code:java}
>  
>  if (null != rb.getCursorMark() && 0 < timeAllowed) {
>   // fundamentally incompatible
>   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not 
> search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + 
> CommonParams.TIME_ALLOWED);
> } {code}
> While theoretically impure to use them in combination, it is often desirable 
> to support cursormarks-style deep paging and attempt to protect Solr nodes 
> from runaway queries using timeAllowed, in the hopes that most of the time, 
> the query completes in the allotted time, and there is no conflict.
>  
> However if the query takes too long, it may be preferable to end the query 
> and protect the Solr node and provide the user with a somewhat inaccurate 
> sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is 
> frequently used to prevent runaway load.  In fact, cursorMark and 
> shards.tolerant are allowed in combination, so any argument in favor of 
> purity would be a bit muddied in my opinion.
>  
> This was discussed once in the mailing list that I can find: 
> [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E]
>  It did not look like there was strong support for preventing the combination.
>  
> I have tested cursorMark and timeAllowed combination together, and even when 
> partial results are returned because the timeAllowed is exceeded, the 
> cursorMark response value is still valid and reasonable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread John Gallagher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gallagher updated SOLR-14413:
--
Attachment: (was: image-2020-08-18-16-56-03-425.png)

> allow timeAllowed and cursorMark parameters
> ---
>
> Key: SOLR-14413
> URL: https://issues.apache.org/jira/browse/SOLR-14413
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: John Gallagher
>Priority: Minor
> Attachments: SOLR-14413-bram.patch, SOLR-14413.patch, 
> image-2020-08-18-16-56-41-736.png, image-2020-08-18-16-56-59-178.png, 
> timeallowed_cursormarks_results.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ever since cursorMarks were introduced in SOLR-5463 in 2014, cursorMark and 
> timeAllowed parameters were not allowed in combination ("Can not search using 
> both cursorMark and timeAllowed")
> , from [QueryComponent.java|#L359]]:
>  
> {code:java}
>  
>  if (null != rb.getCursorMark() && 0 < timeAllowed) {
>   // fundamentally incompatible
>   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not 
> search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + 
> CommonParams.TIME_ALLOWED);
> } {code}
> While theoretically impure to use them in combination, it is often desirable 
> to support cursormarks-style deep paging and attempt to protect Solr nodes 
> from runaway queries using timeAllowed, in the hopes that most of the time, 
> the query completes in the allotted time, and there is no conflict.
>  
> However if the query takes too long, it may be preferable to end the query 
> and protect the Solr node and provide the user with a somewhat inaccurate 
> sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is 
> frequently used to prevent runaway load.  In fact, cursorMark and 
> shards.tolerant are allowed in combination, so any argument in favor of 
> purity would be a bit muddied in my opinion.
>  
> This was discussed once in the mailing list that I can find: 
> [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E]
>  It did not look like there was strong support for preventing the combination.
>  
> I have tested cursorMark and timeAllowed combination together, and even when 
> partial results are returned because the timeAllowed is exceeded, the 
> cursorMark response value is still valid and reasonable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread John Gallagher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gallagher updated SOLR-14413:
--
Attachment: (was: image-2020-08-18-16-56-03-355.png)

> allow timeAllowed and cursorMark parameters
> ---
>
> Key: SOLR-14413
> URL: https://issues.apache.org/jira/browse/SOLR-14413
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: John Gallagher
>Priority: Minor
> Attachments: SOLR-14413-bram.patch, SOLR-14413.patch, 
> image-2020-08-18-16-56-41-736.png, image-2020-08-18-16-56-59-178.png, 
> timeallowed_cursormarks_results.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ever since cursorMarks were introduced in SOLR-5463 in 2014, cursorMark and 
> timeAllowed parameters were not allowed in combination ("Can not search using 
> both cursorMark and timeAllowed")
> , from [QueryComponent.java|#L359]]:
>  
> {code:java}
>  
>  if (null != rb.getCursorMark() && 0 < timeAllowed) {
>   // fundamentally incompatible
>   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not 
> search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + 
> CommonParams.TIME_ALLOWED);
> } {code}
> While theoretically impure to use them in combination, it is often desirable 
> to support cursormarks-style deep paging and attempt to protect Solr nodes 
> from runaway queries using timeAllowed, in the hopes that most of the time, 
> the query completes in the allotted time, and there is no conflict.
>  
> However if the query takes too long, it may be preferable to end the query 
> and protect the Solr node and provide the user with a somewhat inaccurate 
> sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is 
> frequently used to prevent runaway load.  In fact, cursorMark and 
> shards.tolerant are allowed in combination, so any argument in favor of 
> purity would be a bit muddied in my opinion.
>  
> This was discussed once in the mailing list that I can find: 
> [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E]
>  It did not look like there was strong support for preventing the combination.
>  
> I have tested cursorMark and timeAllowed combination together, and even when 
> partial results are returned because the timeAllowed is exceeded, the 
> cursorMark response value is still valid and reasonable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread John Gallagher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gallagher updated SOLR-14413:
--
Attachment: image-2020-08-18-16-56-03-425.png

> allow timeAllowed and cursorMark parameters
> ---
>
> Key: SOLR-14413
> URL: https://issues.apache.org/jira/browse/SOLR-14413
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: John Gallagher
>Priority: Minor
> Attachments: SOLR-14413-bram.patch, SOLR-14413.patch, 
> image-2020-08-18-16-56-03-355.png, image-2020-08-18-16-56-03-425.png, 
> image-2020-08-18-16-56-41-736.png, timeallowed_cursormarks_results.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ever since cursorMarks were introduced in SOLR-5463 in 2014, cursorMark and 
> timeAllowed parameters were not allowed in combination ("Can not search using 
> both cursorMark and timeAllowed")
> , from [QueryComponent.java|#L359]]:
>  
> {code:java}
>  
>  if (null != rb.getCursorMark() && 0 < timeAllowed) {
>   // fundamentally incompatible
>   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not 
> search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + 
> CommonParams.TIME_ALLOWED);
> } {code}
> While theoretically impure to use them in combination, it is often desirable 
> to support cursormarks-style deep paging and attempt to protect Solr nodes 
> from runaway queries using timeAllowed, in the hopes that most of the time, 
> the query completes in the allotted time, and there is no conflict.
>  
> However if the query takes too long, it may be preferable to end the query 
> and protect the Solr node and provide the user with a somewhat inaccurate 
> sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is 
> frequently used to prevent runaway load.  In fact, cursorMark and 
> shards.tolerant are allowed in combination, so any argument in favor of 
> purity would be a bit muddied in my opinion.
>  
> This was discussed once in the mailing list that I can find: 
> [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E]
>  It did not look like there was strong support for preventing the combination.
>  
> I have tested cursorMark and timeAllowed combination together, and even when 
> partial results are returned because the timeAllowed is exceeded, the 
> cursorMark response value is still valid and reasonable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread John Gallagher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gallagher updated SOLR-14413:
--
Attachment: image-2020-08-18-16-56-03-355.png

> allow timeAllowed and cursorMark parameters
> ---
>
> Key: SOLR-14413
> URL: https://issues.apache.org/jira/browse/SOLR-14413
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: John Gallagher
>Priority: Minor
> Attachments: SOLR-14413-bram.patch, SOLR-14413.patch, 
> image-2020-08-18-16-56-03-355.png, image-2020-08-18-16-56-03-425.png, 
> image-2020-08-18-16-56-41-736.png, timeallowed_cursormarks_results.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ever since cursorMarks were introduced in SOLR-5463 in 2014, cursorMark and 
> timeAllowed parameters were not allowed in combination ("Can not search using 
> both cursorMark and timeAllowed")
> , from [QueryComponent.java|#L359]]:
>  
> {code:java}
>  
>  if (null != rb.getCursorMark() && 0 < timeAllowed) {
>   // fundamentally incompatible
>   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not 
> search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + 
> CommonParams.TIME_ALLOWED);
> } {code}
> While theoretically impure to use them in combination, it is often desirable 
> to support cursormarks-style deep paging and attempt to protect Solr nodes 
> from runaway queries using timeAllowed, in the hopes that most of the time, 
> the query completes in the allotted time, and there is no conflict.
>  
> However if the query takes too long, it may be preferable to end the query 
> and protect the Solr node and provide the user with a somewhat inaccurate 
> sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is 
> frequently used to prevent runaway load.  In fact, cursorMark and 
> shards.tolerant are allowed in combination, so any argument in favor of 
> purity would be a bit muddied in my opinion.
>  
> This was discussed once in the mailing list that I can find: 
> [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E]
>  It did not look like there was strong support for preventing the combination.
>  
> I have tested cursorMark and timeAllowed combination together, and even when 
> partial results are returned because the timeAllowed is exceeded, the 
> cursorMark response value is still valid and reasonable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread John Gallagher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gallagher updated SOLR-14413:
--
Attachment: image-2020-08-18-16-56-41-736.png

> allow timeAllowed and cursorMark parameters
> ---
>
> Key: SOLR-14413
> URL: https://issues.apache.org/jira/browse/SOLR-14413
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: John Gallagher
>Priority: Minor
> Attachments: SOLR-14413-bram.patch, SOLR-14413.patch, 
> image-2020-08-18-16-56-03-355.png, image-2020-08-18-16-56-03-425.png, 
> image-2020-08-18-16-56-41-736.png, timeallowed_cursormarks_results.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ever since cursorMarks were introduced in SOLR-5463 in 2014, cursorMark and 
> timeAllowed parameters were not allowed in combination ("Can not search using 
> both cursorMark and timeAllowed")
> , from [QueryComponent.java|#L359]]:
>  
> {code:java}
>  
>  if (null != rb.getCursorMark() && 0 < timeAllowed) {
>   // fundamentally incompatible
>   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not 
> search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + 
> CommonParams.TIME_ALLOWED);
> } {code}
> While theoretically impure to use them in combination, it is often desirable 
> to support cursormarks-style deep paging and attempt to protect Solr nodes 
> from runaway queries using timeAllowed, in the hopes that most of the time, 
> the query completes in the allotted time, and there is no conflict.
>  
> However if the query takes too long, it may be preferable to end the query 
> and protect the Solr node and provide the user with a somewhat inaccurate 
> sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is 
> frequently used to prevent runaway load.  In fact, cursorMark and 
> shards.tolerant are allowed in combination, so any argument in favor of 
> purity would be a bit muddied in my opinion.
>  
> This was discussed once in the mailing list that I can find: 
> [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E]
>  It did not look like there was strong support for preventing the combination.
>  
> I have tested cursorMark and timeAllowed combination together, and even when 
> partial results are returned because the timeAllowed is exceeded, the 
> cursorMark response value is still valid and reasonable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


sigram commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472436918



##
File path: solr/core/src/java/org/apache/solr/cluster/events/ScheduledEvent.java
##
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cluster.events;
+
+/**
+ *
+ */
+public interface ScheduledEvent extends ClusterEvent {

Review comment:
   Right, generally speaking any maintenance or optimization task that 
needs to be periodically invoked. Again, we could define it as a separate API 
but I think it fits the event model nicely.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


sigram commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472435099



##
File path: solr/core/src/java/org/apache/solr/cloud/events/ScheduledEvent.java
##
@@ -0,0 +1,25 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cloud.events;
+
+/**
+ *
+ */
+public interface ScheduledEvent extends ClusterEvent {
+  String getScheduleName();
+  Object getScheduleParam(String key);

Review comment:
   @chatman even in 8x the scheduled events were used for things other than 
autoscaling, namely for inactive shard maintenance. Solr still needs this 
functionality in order to avoid implementing it over and over again in 
individual plugins.
   
   I don't have a strong opinion whether it must be a part of the event API 
here, but it seems to me that it's convenient to model scheduled events in this 
uniform way.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4312) Index format to store position length per position

2020-08-18 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179959#comment-17179959
 ] 

Michael McCandless commented on LUCENE-4312:


[~mgibney] I think we should try to find a way forward here?

I think what [~rcmuir] briefly suggested above would be a good approach to 
break the chicken/egg?  I do not think we can work out up front what the "bar" 
would be to promote this approach to Lucene's core, but that should not stop us 
from getting an initial version working in {{sandbox}}.

Store the position in length as a payload (simple {{TokenFilter}} can do that), 
then create custom span queries that load that payload, decode it back to 
position length, and 100% correctly match positional queries that contain 
multi-token index-time synonyms.  That correctness achievement alone will be 
incredible and help many users suffering with this longstanding issue.  I don't 
think you would need any changes to {{DefaultIndexingChain}}, {{PostingsEnum}}, 
etc. for this implementation?

We encourage usage of that approach, we run benchmarks, we iterate to improve 
performance etc. and that may eventually give us the currency to make API 
changes in Lucene's core to more directly support position length in the index. 
 Or, maybe the payload implementation is perfectly fine forever.

I think we should open a new issue for this effort (not reuse this one, or 
LUCENE-7398, or LUCENE-8776).  Yes, this might seem like Jira cancer 
metastasis, but I think the specifics of this implementation plan warrant a 
dedicated issue.  The issues purpose is to get your working payload solution 
available in Lucene's sandbox.

> Index format to store position length per position
> --
>
> Key: LUCENE-4312
> URL: https://issues.apache.org/jira/browse/LUCENE-4312
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 6.0
>Reporter: Gang Luo
>Priority: Minor
>  Labels: Suggestion
> Attachments: positionLength-postings.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Mike Mccandless said:TokenStreams are actually graphs.
> Indexer ignores PositionLengthAttribute.Need change the index format (and 
> Codec APIs) to store an additional int position length per position.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


murblanc commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472343024



##
File path: 
solr/core/src/java/org/apache/solr/cluster/events/ClusterSingleton.java
##
@@ -0,0 +1,14 @@
+package org.apache.solr.cluster.events;
+
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+
+/**
+ * Intended for {@link org.apache.solr.core.CoreContainer} plugins that should 
be
+ * enabled only one instance per cluster.
+ * Implementation detail: currently these plugins are instantiated on the
+ * Overseer leader, and closed when the current node loses its leadership.
+ */
+@Retention(RetentionPolicy.RUNTIME)
+public @interface ClusterSingleton {

Review comment:
   If you don't provide this level of service to plugins (single instance 
running) you force them to somehow do it on their own. How would they even 
start to do that? I'm strongly against exposing internal implementation 
"details" such as ZK. Do we let plugins open TCP ports? Have their own ZK? 
Tomorrow we might decide to run plug-in on other containers/VM's that are not 
nodes or replace ZK by a DB. Will plugins have to reimplement another leader 
election or similar? 
   It is much cleaner and simpler to do it once in Solr. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


murblanc commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472346084



##
File path: solr/core/src/java/org/apache/solr/cluster/events/ScheduledEvent.java
##
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cluster.events;
+
+/**
+ *
+ */
+public interface ScheduledEvent extends ClusterEvent {

Review comment:
   Leader rebalancing job, cluster rebalancing, removing replicas from a 
node to be removed, unnecessary replica removal, core corruption checking etc 
etc 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


murblanc commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472343024



##
File path: 
solr/core/src/java/org/apache/solr/cluster/events/ClusterSingleton.java
##
@@ -0,0 +1,14 @@
+package org.apache.solr.cluster.events;
+
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+
+/**
+ * Intended for {@link org.apache.solr.core.CoreContainer} plugins that should 
be
+ * enabled only one instance per cluster.
+ * Implementation detail: currently these plugins are instantiated on the
+ * Overseer leader, and closed when the current node loses its leadership.
+ */
+@Retention(RetentionPolicy.RUNTIME)
+public @interface ClusterSingleton {

Review comment:
   If you don't provide thus level of service to plugins (single instance 
running) you force them to somehow do it on their own. How would they even 
start to do that? I'm strongly against exposing internal implementation 
"details" such as ZK. Do we let plugins open TCP ports? Have their own ZK? 
Tomorrow we might decide to run plug-in on other containers/VM's that are not 
nodes. Will plugins have to reimplement another leader election or similar? 
   It is much cleaner and simpler to do it once in Solr. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


murblanc commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472343024



##
File path: 
solr/core/src/java/org/apache/solr/cluster/events/ClusterSingleton.java
##
@@ -0,0 +1,14 @@
+package org.apache.solr.cluster.events;
+
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+
+/**
+ * Intended for {@link org.apache.solr.core.CoreContainer} plugins that should 
be
+ * enabled only one instance per cluster.
+ * Implementation detail: currently these plugins are instantiated on the
+ * Overseer leader, and closed when the current node loses its leadership.
+ */
+@Retention(RetentionPolicy.RUNTIME)
+public @interface ClusterSingleton {

Review comment:
   If you don't provude thus level of servuce to plugins (single instance 
running) you force them to somehow do it on their own. How would they do that? 
I'm strongly against exposing internal implementation "details" such as ZK. Do 
we let plugins open 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] s1monw commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


s1monw commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r472337839



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -545,18 +546,54 @@ DirectoryReader getReader(boolean applyAllDeletes, 
boolean writeAllDeletes) thro
 // obtained during this flush are pooled, the first time
 // this method is called:
 readerPool.enableReaderPooling();
-DirectoryReader r = null;
+StandardDirectoryReader r = null;
 doBeforeFlush();
-boolean anyChanges = false;
+boolean anyChanges;
 /*
  * for releasing a NRT reader we must ensure that 
  * DW doesn't add any segments or deletes until we are
  * done with creating the NRT DirectoryReader. 
  * We release the two stage full flush after we are done opening the
  * directory reader!
  */
+MergePolicy.MergeSpecification onGetReaderMerges = null;
+AtomicBoolean stopCollectingMergedReaders = new AtomicBoolean(false);
+Map mergedReaders = new HashMap<>();
+Map openedReadOnlyClones = new HashMap<>();
+// this function is used to control which SR are opened in order to keep 
track of them
+// and to reuse them in the case we wait for merges in this getReader call.
+IOUtils.IOFunction readerFactory = sci 
-> {
+  final ReadersAndUpdates rld = getPooledInstance(sci, true);
+  try {
+assert Thread.holdsLock(IndexWriter.this);
+SegmentReader segmentReader = rld.getReadOnlyClone(IOContext.READ);
+openedReadOnlyClones.put(sci.info.name, segmentReader);

Review comment:
   ++








[GitHub] [lucene-solr] s1monw commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


s1monw commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r472336456



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -3179,9 +3317,9 @@ private long prepareCommitInternal() throws IOException {
   SegmentInfos toCommit = null;
   boolean anyChanges = false;
   long seqNo;
-  MergePolicy.MergeSpecification onCommitMerges = null;
-  AtomicBoolean includeInCommit = new AtomicBoolean(true);
-  final long maxCommitMergeWaitMillis = 
config.getMaxCommitMergeWaitMillis();
+  MergePolicy.MergeSpecification pointInTimeMerges = null;
+  AtomicBoolean hasTimedOut = new AtomicBoolean(false);

Review comment:
   ++ I'll change it to a better name








[GitHub] [lucene-solr] dnhatn commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


dnhatn commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r472332671



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -3179,9 +3317,9 @@ private long prepareCommitInternal() throws IOException {
   SegmentInfos toCommit = null;
   boolean anyChanges = false;
   long seqNo;
-  MergePolicy.MergeSpecification onCommitMerges = null;
-  AtomicBoolean includeInCommit = new AtomicBoolean(true);
-  final long maxCommitMergeWaitMillis = 
config.getMaxCommitMergeWaitMillis();
+  MergePolicy.MergeSpecification pointInTimeMerges = null;
+  AtomicBoolean hasTimedOut = new AtomicBoolean(false);

Review comment:
   Should we rename this to stopCollectingMergedReaders? I find that name 
better.

##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -545,18 +546,54 @@ DirectoryReader getReader(boolean applyAllDeletes, 
boolean writeAllDeletes) thro
 // obtained during this flush are pooled, the first time
 // this method is called:
 readerPool.enableReaderPooling();
-DirectoryReader r = null;
+StandardDirectoryReader r = null;
 doBeforeFlush();
-boolean anyChanges = false;
+boolean anyChanges;
 /*
  * for releasing a NRT reader we must ensure that 
  * DW doesn't add any segments or deletes until we are
  * done with creating the NRT DirectoryReader. 
  * We release the two stage full flush after we are done opening the
  * directory reader!
  */
+MergePolicy.MergeSpecification onGetReaderMerges = null;
+AtomicBoolean stopCollectingMergedReaders = new AtomicBoolean(false);
+Map<String, SegmentReader> mergedReaders = new HashMap<>();
+Map<String, SegmentReader> openedReadOnlyClones = new HashMap<>();
+// this function is used to control which SR are opened in order to keep track of them
+// and to reuse them in the case we wait for merges in this getReader call.
+IOUtils.IOFunction<SegmentCommitInfo, SegmentReader> readerFactory = sci -> {
+  final ReadersAndUpdates rld = getPooledInstance(sci, true);
+  try {
+assert Thread.holdsLock(IndexWriter.this);
+SegmentReader segmentReader = rld.getReadOnlyClone(IOContext.READ);
+openedReadOnlyClones.put(sci.info.name, segmentReader);

Review comment:
   Should we keep track of the clones iff `maxFullFlushMergeWaitMillis` is 
positive? I know this is quite trivial.
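
   Something like this, I guess (hypothetical, simplified from the diff above; 
drops the try/release bookkeeping):

   ```java
   // Only record opened readers when we might actually wait for merges;
   // otherwise the map is never read.
   IOUtils.IOFunction<SegmentCommitInfo, SegmentReader> readerFactory = sci -> {
     final ReadersAndUpdates rld = getPooledInstance(sci, true);
     assert Thread.holdsLock(IndexWriter.this);
     SegmentReader segmentReader = rld.getReadOnlyClone(IOContext.READ);
     if (config.getMaxFullFlushMergeWaitMillis() > 0) {
       openedReadOnlyClones.put(sci.info.name, segmentReader);
     }
     return segmentReader;
   };
   ```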








[jira] [Commented] (LUCENE-9027) SIMD-based decoding of postings lists

2020-08-18 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179726#comment-17179726
 ] 

Michael McCandless commented on LUCENE-9027:


{quote}_Shouldn't the other postings formats that switched from use of this 
postings write/reader also bump their version suffix?_
{quote}
Hmm, can you give an example?  What other postings formats (besides 
{{Lucene84PostingsFormat}}) are using the new {{Lucene84PostingsWriter/Reader}}?

> SIMD-based decoding of postings lists
> -
>
> Key: LUCENE-9027
> URL: https://issues.apache.org/jira/browse/LUCENE-9027
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: 8.4
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> [~rcmuir] has been mentioning the idea for quite some time that we might be 
> able to write the decoding logic in such a way that Java would use SIMD 
> instructions. More recently [~paul.masurel] wrote a [blog 
> post|https://fulmicoton.com/posts/bitpacking/] that raises the point that 
> Lucene could still do decode multiple ints at once in a single instruction by 
> packing two ints in a long and we've had some discussions about what we could 
> try in Lucene to speed up the decoding of postings. This made me want to look 
> a bit deeper at what we could do.
> Our current decoding logic reads data in a byte[] and decodes packed integers 
> from it. Unfortunately it doesn't make use of SIMD instructions and looks 
> like 
> [this|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/NaiveByteDecoder.java].
> I confirmed by looking at the generated assembly that if I take an array of 
> integers and shift them all by the same number of bits then Java will use 
> SIMD instructions to shift multiple integers at once. This led me to writing 
> this 
> [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SimpleSIMDDecoder.java]
>  that tries as much as possible to shift long sequences of ints by the same 
> number of bits to speed up decoding. It is indeed faster than the current 
> logic we have, up to about 2x faster for some numbers of bits per value.
> Currently the best 
> [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SIMDDecoder.java]
>  I've been able to come up with combines the above idea with the idea that 
> Paul mentioned in his blog that consists of emulating SIMD from Java by 
> packing multiple integers into a long: 2 ints, 4 shorts or 8 bytes. It is a 
> bit harder to read but gives another speedup on top of the above 
> implementation.
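> A quick illustration of that packing trick (a sketch for illustration, not 
> the benchmark code; it assumes bitsPerValue evenly divides 32 and a matching 
> encode order):
> {code:java}
> // Emulate 2-lane SIMD in plain Java: one int stream lives in the high 32
> // bits of each long, another in the low 32 bits. A single shift + mask then
> // extracts a bitsPerValue-wide field from BOTH lanes at once.
> static void decodeTwoLanes(long[] packed, int bitsPerValue, int[] out) {
>   final long lane = (1L << bitsPerValue) - 1;
>   final long mask = lane | (lane << 32);
>   int o = 0;
>   for (long word : packed) {
>     for (int shift = 32 - bitsPerValue; shift >= 0; shift -= bitsPerValue) {
>       final long two = (word >>> shift) & mask; // both lanes in one op
>       out[o++] = (int) (two >>> 32);            // value from the high lane
>       out[o++] = (int) two;                     // value from the low lane
>     }
>   }
> }
> {code}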
> I have a [JMH 
> benchmark|https://github.com/jpountz/decode-128-ints-benchmark/] available in 
> case someone would like to play with this and maybe even come up with an even 
> faster implementation. It is 2-2.5x faster than our current implementation 
> for most numbers of bits per value. I'm copying results here:
> {noformat}
>  * `readLongs` just reads 2*bitsPerValue longs from the ByteBuffer, it serves 
> as
>a baseline.
>  * `decodeNaiveFromBytes` reads a byte[] and decodes from it. This is what the
>current Lucene codec does.
>  * `decodeNaiveFromLongs` decodes from longs on the fly.
>  * `decodeSimpleSIMD` is a simple implementation that relies on how Java
>recognizes some patterns and uses SIMD instructions.
>  * `decodeSIMD` is a more complex implementation that both relies on the C2
>compiler to generate SIMD instructions and encodes 8 bytes, 4 shorts or
>2 ints in a long in order to decompress multiple values at once.
> Benchmark                                       (bitsPerValue)  (byteOrder)  Mode  Cnt   Score   Error  Units
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               1           LE  thrpt   5  12.912 ± 0.393  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               1           BE  thrpt   5  12.862 ± 0.395  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               2           LE  thrpt   5  13.040 ± 1.162  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               2           BE  thrpt   5  13.027 ± 0.270  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               3           LE  thrpt   5  12.409 ± 0.637  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               3           BE  thrpt   5  12.268 ± 0.947  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               4           LE  thrpt   5  14.177 ± 2.263  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               4           BE  thrpt   5  11.457 ± 0.150  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               5           LE  thrpt   5  

[jira] [Commented] (LUCENE-9468) "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_XXXX files

2020-08-18 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179723#comment-17179723
 ] 

Michael McCandless commented on LUCENE-9468:


{quote}I don't work (or advocate) for gradle... 
{quote}
OK, good point :)  I added a comment on [that gradle 
issue|https://github.com/gradle/gradle/issues/12020] pointing back to this 
issue.

Unfortunately the snippet you linked to there has now become whitespace ... I 
think you need to link to a stable (githash'd) version of that source?
{quote}LUCENE-9465 is already implemented, Mike.
{quote}
Awesome, thanks!  I will try to switch my stupid beaster to this real beaster!

> "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_ 
> files
> --
>
> Key: LUCENE-9468
> URL: https://issues.apache.org/jira/browse/LUCENE-9468
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> I have been stupid-beasting ({{while(true) "./gradlew -p lucene test"}}) 
> Lucene core + modules tests, and noticed that I have accumulated many 
> (~10.3K) files like these:
> {noformat}
> -rw-r--r-- 1 mike mike  4954 Aug 17 12:49 
> lucene/core/build/tmp/test/jar_extract_9943749690454255636_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 13:24 
> lucene/core/build/tmp/test/jar_extract_9949237584070142535_tmp
> -rw-r--r-- 1 mike mike 14627 Aug 17 14:01 
> lucene/core/build/tmp/test/jar_extract_9950285037002822552_tmp
> -rw-r--r-- 1 mike mike 15935 Aug 17 15:10 
> lucene/core/build/tmp/test/jar_extract_995066590821695944_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 11:23 
> lucene/core/build/tmp/test/jar_extract_9952865172929838404_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 14:41 
> lucene/core/build/tmp/test/jar_extract_9960016969100835830_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 10:07 
> lucene/core/build/tmp/test/jar_extract_9960479662672452908_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 13:07 
> lucene/core/build/tmp/test/jar_extract_996420631017213954_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 10:30 
> lucene/core/build/tmp/test/jar_extract_9964495910786482810_tmp
> -rw-r--r-- 1 mike mike  8034 Aug 17 17:10 
> lucene/core/build/tmp/test/jar_extract_9965528236930220207_tmp
> -rw-r--r-- 1 mike mike 43565 Aug 17 11:52 
> lucene/core/build/tmp/test/jar_extract_9967892842722777228_tmp
> -rw-r--r-- 1 mike mike 36278 Aug 17 10:36 
> lucene/core/build/tmp/test/jar_extract_996836107828729763_tmp
> -rw-r--r-- 1 mike mike 29997 Aug 17 15:48 
> lucene/core/build/tmp/test/jar_extract_9968527122717193835_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 11:11 
> lucene/core/build/tmp/test/jar_extract_9968609693107939092_tmp
> -rw-r--r-- 1 mike mike  9920 Aug 17 14:31 
> lucene/core/build/tmp/test/jar_extract_9968809316564216653_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 13:22 
> lucene/core/build/tmp/test/jar_extract_9969318805542859308_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 11:44 
> lucene/core/build/tmp/test/jar_extract_9974798403956637924_tmp {noformat}
> It seems to grow by ~64 files (how many cores I configured gradle to use in 
> my {{./gradle.properties}}) on each iteration of beasting.
> I think {{gradle}} should be cleaning this up maybe?






[GitHub] [lucene-solr] s1monw commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


s1monw commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r472314458



##
File path: lucene/core/src/java/org/apache/lucene/index/ReaderPool.java
##
@@ -404,7 +404,7 @@ private PendingDeletes newPendingDeletes(SegmentReader 
reader, SegmentCommitInfo
   private boolean noDups() {
 Set<String> seen = new HashSet<>();
 for(SegmentCommitInfo info : readerMap.keySet()) {
-  assert !seen.contains(info.info.name);
+  assert !seen.contains(info.info.name) : "seen twice: " + info.info.name ;

Review comment:
   Many fun issues in this PR, to be honest. IW is tricky as hell in some 
places; that we are incRef'ing files in StandardDirectoryReader but not in IW 
for NRT readers is mindblowing :D








[GitHub] [lucene-solr] s1monw commented on pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


s1monw commented on pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#issuecomment-675571884


   @mikemccand I pushed a fix for the failures. I will add a dedicated test to 
make sure it's covered too






[jira] [Updated] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread Bram Van Dam (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bram Van Dam updated SOLR-14413:

Attachment: SOLR-14413-bram.patch

> allow timeAllowed and cursorMark parameters
> ---
>
> Key: SOLR-14413
> URL: https://issues.apache.org/jira/browse/SOLR-14413
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: John Gallagher
>Priority: Minor
> Attachments: SOLR-14413-bram.patch, SOLR-14413.patch, 
> timeallowed_cursormarks_results.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ever since cursorMarks were introduced in SOLR-5463 in 2014, the cursorMark and 
> timeAllowed parameters have not been allowed in combination ("Can not search 
> using both cursorMark and timeAllowed"), from [QueryComponent.java|#L359]:
>  
> {code:java}
>  
>  if (null != rb.getCursorMark() && 0 < timeAllowed) {
>   // fundamentally incompatible
>   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not 
> search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + 
> CommonParams.TIME_ALLOWED);
> } {code}
> While theoretically impure to use them in combination, it is often desirable 
> to support cursormarks-style deep paging and attempt to protect Solr nodes 
> from runaway queries using timeAllowed, in the hopes that most of the time, 
> the query completes in the allotted time, and there is no conflict.
>  
> However if the query takes too long, it may be preferable to end the query 
> and protect the Solr node and provide the user with a somewhat inaccurate 
> sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is 
> frequently used to prevent runaway load.  In fact, cursorMark and 
> shards.tolerant are allowed in combination, so any argument in favor of 
> purity would be a bit muddied in my opinion.
>  
> This was discussed once in the mailing list that I can find: 
> [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E]
>  It did not look like there was strong support for preventing the combination.
>  
> I have tested cursorMark and timeAllowed combination together, and even when 
> partial results are returned because the timeAllowed is exceeded, the 
> cursorMark response value is still valid and reasonable.






[jira] [Comment Edited] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread Bram Van Dam (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179702#comment-17179702
 ] 

Bram Van Dam edited comment on SOLR-14413 at 8/18/20, 3:31 PM:
---

[~epugh] I created a little test case in CursorPagingTest.java to demonstrate 
the two parameters working together, but in doing so I appear to have done the 
opposite. The test fails roughly 2/3 of the time. There appear to be three 
failure modes:

# nextCursorMark is null (when it should be the same as the previous cursor 
mark, or possibly a new mark)
# partialResults header is missing (this only seems to happen when the response 
does not contain any results)
# not all documents are hit even after the cursor has reached its end (this one 
occurs much less frequently)

When the assertions checking failure modes 1 and 2 are disabled, the test only 
fails 1 time out of 4, when failure mode 3 is hit. IMO only the third failure 
mode is unacceptable, because it produces incorrect results. 

The test never fails when minimum timeAllowed is set to 50ms or above. Values 
that low are obviously ridiculous. Perhaps the documentation (or the 
implementation?) should be updated to discourage unreasonably low values.
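
For reference, the client-side pattern this exercises looks roughly like the 
following in SolrJ (a sketch under my assumptions, given some {{SolrClient}} 
named {{solrClient}}; not part of the patch):

{code:java}
SolrQuery q = new SolrQuery("-str:b");
q.setRows(40);
q.setSort("id", SolrQuery.ORDER.desc); // total order: id is the uniqueKey
q.setTimeAllowed(50);                  // values below ~50ms made the test flaky
String cursor = CursorMarkParams.CURSOR_MARK_START;
while (true) {
  q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
  QueryResponse rsp = solrClient.query(q);
  // ... consume rsp.getResults() ...
  String next = rsp.getNextCursorMark();
  if (cursor.equals(next)) {
    // With timeAllowed this can also mean "ran out of time before advancing"
    // (failure modes 1 and 2 above), so a real client should check the
    // partialResults header before concluding it is done.
    break;
  }
  cursor = next;
}
{code}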

I'll attach an updated version of [~slackhappy]'s patch which includes my test 
case. And I'll paste the test case here for good measure. Apologies for the 
mess; I'm not familiar with Solr's test harness, so I'm sure there are cleaner 
ways of doing this.


{code:java}
  /**
   *  - check whether the cursor can be advanced when using timeAllowed
   *  - ensure partial results are advertised as such
   *  - check the correctness of those results
   */
  @SuppressWarnings({"unchecked", "rawtypes"})
  public void testTimeAllowedAdvancesCursor() throws Exception {
String cursorMark;
ModifiableSolrParams params = null;

// Add 1000 docs, anything less and the requests complete too quickly to be 
interesting
for(int i=1;i<=100;i++) {
  // don't add in order of any field to ensure we aren't inadvertently
  // counting on internal docid ordering
  assertU(adoc("id", i+ "9", "str", "c", "float", "-3.2", "int", "42" + i));
  assertU(adoc("id", i+ "7", "str", "c", "float", "-3.2", "int", "-1976" + 
i));
  assertU(adoc("id", i+ "2", "str", "c", "float", "-3.2", "int", "666" + 
i));
  assertU(adoc("id", i+ "0", "str", "b", "float", "64.5", "int", "-42" + 
i));
  assertU(adoc("id", i+ "5", "str", "b", "float", "64.5", "int", "2001" + 
i));
  assertU(adoc("id", i+ "8", "str", "b", "float", "64.5", "int", "4055" + 
i));
  assertU(adoc("id", i+ "6", "str", "a", "float", "64.5", "int", "7" + i));
  assertU(adoc("id", i+ "1", "str", "a", "float", "64.5", "int", "7" + i));
  assertU(adoc("id", i+ "4", "str", "a", "float", "11.1", "int", "6" + i));
  assertU(adoc("id", i+ "3", "str", "a", "float", "11.1")); // int is 
missing
}
assertU(commit());

// Prepare a list (sorted) of docIds we expect to find, populated without 
using a cursor or timeAllowed
List expectedDocIds = new ArrayList<>();
Map docResponse = (Map) fromJSONString(h.query(req(params("q", "-str:b", 
"rows", "700", "fl", "id", "sort", "id desc";
expectedDocIds.addAll((Collection) 
(((Map)docResponse.get("response")).get("docs")));

cursorMark = CURSOR_MARK_START;
params = params("q", "-str:b",
"rows","40",
"fl", "id",
"sort", "id desc");

List foundDocIds = new ArrayList<>();

boolean cursorAdvanced = false;
do {
  cursorAdvanced = false;

  // If the cursor does not advance, we increase timeAllowed until it's 
long enough to return a result
  for(int timeAllowed=1; timeAllowed<=100; timeAllowed++) { // Keep 
timeAllowed between 1 and 100ms
String json = assertJQ(req(params, CURSOR_MARK_PARAM, cursorMark, 
CommonParams.TIME_ALLOWED, "" + timeAllowed));
Map response = (Map) fromJSONString(json);
String next = (String)response.get(CURSOR_MARK_NEXT);

assertNotNull(CURSOR_MARK_NEXT + " is null", next);

if(null != next && !cursorMark.equals(next)) {
  // Cursor advanced, record foundDocs and move on
  foundDocIds.addAll((Collection) 
(((Map)response.get("response")).get("docs")));
  cursorMark = next;
  cursorAdvanced = true;
  break;
} else if(foundDocIds.size() != 700) {
  // Unless we've found all documents, the result must be partial
  assertNull("Response is missing partialResults header for: "
  + "old cursor " + cursorMark + " new cursor: " + next + " 
timeAllowed " + timeAllowed + " foundDocIds: " + foundDocIds.size(),
  JSONTestUtil.match(json, "/responseHeader/partialResults==true", 
JSONTestUtil.DEFAULT_DELTA));
}
  }
} while(cursorAdvanced);


[jira] [Commented] (SOLR-14413) allow timeAllowed and cursorMark parameters

2020-08-18 Thread Bram Van Dam (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179702#comment-17179702
 ] 

Bram Van Dam commented on SOLR-14413:
-

[~epugh] I created a little test case to demonstrate the two parameters working 
together, but in doing so I appear to have done the opposite. The test fails 
roughly 2/3 of the time. There appear to be three failure modes:

# nextCursorMark is null (when it should be the same as the previous cursor 
mark, or possibly a new mark)
# partialResults header is missing (this only seems to happen when the response 
does not contain any results)
# not all documents are hit even after the cursor has reached its end (this one 
occurs much less frequently)

When the assertions checking failure modes 1 and 2 are disabled, the test only 
fails 1 time out of 4, when failure mode 3 is hit. IMO only the third failure 
mode is unacceptable, because it produces incorrect results. 

The test never fails when minimum timeAllowed is set to 50ms or above. Values 
that low are obviously ridiculous. Perhaps the documentation (or the 
implementation?) should be updated to discourage unreasonably low values.

I'll attach an updated version of [~slackhappy]'s patch which includes my test 
case. And I'll paste the test case here for good measure. Apologies for the 
mess; I'm not familiar with Solr's test harness, so I'm sure there are cleaner 
ways of doing this.


{code:java}
  /**
   *  - check whether the cursor can be advanced when using timeAllowed
   *  - ensure partial results are advertised as such
   *  - check the correctness of those results
   */
  @SuppressWarnings({"unchecked", "rawtypes"})
  public void testTimeAllowedAdvancesCursor() throws Exception {
String cursorMark;
ModifiableSolrParams params = null;

// Add 1000 docs, anything less and the requests complete too quickly to be 
interesting
for(int i=1;i<=100;i++) {
  // don't add in order of any field to ensure we aren't inadvertently
  // counting on internal docid ordering
  assertU(adoc("id", i+ "9", "str", "c", "float", "-3.2", "int", "42" + i));
  assertU(adoc("id", i+ "7", "str", "c", "float", "-3.2", "int", "-1976" + 
i));
  assertU(adoc("id", i+ "2", "str", "c", "float", "-3.2", "int", "666" + 
i));
  assertU(adoc("id", i+ "0", "str", "b", "float", "64.5", "int", "-42" + 
i));
  assertU(adoc("id", i+ "5", "str", "b", "float", "64.5", "int", "2001" + 
i));
  assertU(adoc("id", i+ "8", "str", "b", "float", "64.5", "int", "4055" + 
i));
  assertU(adoc("id", i+ "6", "str", "a", "float", "64.5", "int", "7" + i));
  assertU(adoc("id", i+ "1", "str", "a", "float", "64.5", "int", "7" + i));
  assertU(adoc("id", i+ "4", "str", "a", "float", "11.1", "int", "6" + i));
  assertU(adoc("id", i+ "3", "str", "a", "float", "11.1")); // int is 
missing
}
assertU(commit());

// Prepare a list (sorted) of docIds we expect to find, populated without 
using a cursor or timeAllowed
List expectedDocIds = new ArrayList<>();
Map docResponse = (Map) fromJSONString(h.query(req(params("q", "-str:b", 
"rows", "700", "fl", "id", "sort", "id desc";
expectedDocIds.addAll((Collection) 
(((Map)docResponse.get("response")).get("docs")));

cursorMark = CURSOR_MARK_START;
params = params("q", "-str:b",
"rows","40",
"fl", "id",
"sort", "id desc");

List foundDocIds = new ArrayList<>();

boolean cursorAdvanced = false;
do {
  cursorAdvanced = false;

  // If the cursor does not advance, we increase timeAllowed until it's 
long enough to return a result
  for(int timeAllowed=1; timeAllowed<=100; timeAllowed++) { // Keep 
timeAllowed between 1 and 100ms
String json = assertJQ(req(params, CURSOR_MARK_PARAM, cursorMark, 
CommonParams.TIME_ALLOWED, "" + timeAllowed));
Map response = (Map) fromJSONString(json);
String next = (String)response.get(CURSOR_MARK_NEXT);

assertNotNull(CURSOR_MARK_NEXT + " is null", next);

if(null != next && !cursorMark.equals(next)) {
  // Cursor advanced, record foundDocs and move on
  foundDocIds.addAll((Collection) 
(((Map)response.get("response")).get("docs")));
  cursorMark = next;
  cursorAdvanced = true;
  break;
} else if(foundDocIds.size() != 700) {
  // Unless we've found all documents, the result must be partial
  assertNull("Response is missing partialResults header for: "
  + "old cursor " + cursorMark + " new cursor: " + next + " 
timeAllowed " + timeAllowed + " foundDocIds: " + foundDocIds.size(),
  JSONTestUtil.match(json, "/responseHeader/partialResults==true", 
JSONTestUtil.DEFAULT_DELTA));
}
  }
} while(cursorAdvanced);

assertEquals("Should have found 700 documents eventually", expectedDocIds, 

[jira] [Commented] (SOLR-14660) Migrating HDFS into a package

2020-08-18 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179694#comment-17179694
 ] 

Ishan Chattopadhyaya commented on SOLR-14660:
-

[~warper], let us know once you have a PR, even if it is work in progress. As a 
first step, we can isolate all the HDFS support into a contrib package 
(SOLR-13989) such that solr-core doesn't refer to it in any way. As a next step, 
Noble and I can help in the actual package manager support (writing a manifest, 
making sure wiring is possible).

> Migrating HDFS into a package
> -
>
> Key: SOLR-14660
> URL: https://issues.apache.org/jira/browse/SOLR-14660
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>
> Following up on the deprecation of HDFS (SOLR-14021), we need to work on 
> isolating it away from Solr core and making a package for this. This issue is 
> to track the efforts for that.






[jira] [Commented] (SOLR-14021) Deprecate HDFS support from 8x

2020-08-18 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179693#comment-17179693
 ] 

Ishan Chattopadhyaya commented on SOLR-14021:
-

The migration of HDFS into a package is tracked here: 
https://issues.apache.org/jira/browse/SOLR-14660

> Deprecate HDFS support from 8x
> --
>
> Key: SOLR-14021
> URL: https://issues.apache.org/jira/browse/SOLR-14021
> Project: Solr
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Joel Bernstein
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: 8.6
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This ticket is to deprecate HDFS support from 8x.
> There appears to be growing consensus among committers that it's time to 
> start removing features so committers can have a manageable system to 
> maintain. HDFS has come up a number of times as needing to be removed. The 
> HDFS tests have not been maintained over the years and fail frequently. We 
> need to start removing features that no one cares about enough to even 
> maintain the tests.






[GitHub] [lucene-solr] chatman commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


chatman commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472270151



##
File path: solr/core/src/java/org/apache/solr/cluster/events/ScheduledEvent.java
##
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cluster.events;
+
+/**
+ *
+ */
+public interface ScheduledEvent extends ClusterEvent {

Review comment:
   Sorry, this was asked here already: 
https://github.com/apache/lucene-solr/pull/1758#discussion_r471838373








[GitHub] [lucene-solr] chatman commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


chatman commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472269946



##
File path: solr/core/src/java/org/apache/solr/cloud/events/ScheduledEvent.java
##
@@ -0,0 +1,25 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cloud.events;
+
+/**
+ *
+ */
+public interface ScheduledEvent extends ClusterEvent {
+  String getScheduleName();
+  Object getScheduleParam(String key);

Review comment:
   > There's a need for some periodic maintenance tasks, and currently it 
was implemented as a part of the autoscaling triggers
   
   I'm -1 on having the concept of triggers or scheduled events leaking into 
Solr core. It should be a concept internal to the autoscaling-framework. 








[GitHub] [lucene-solr] chatman commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


chatman commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472263584



##
File path: 
solr/core/src/java/org/apache/solr/cluster/events/ClusterSingleton.java
##
@@ -0,0 +1,14 @@
+package org.apache.solr.cluster.events;
+
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+
+/**
+ * Intended for {@link org.apache.solr.core.CoreContainer} plugins that should
+ * be enabled with only one instance per cluster.
+ * Implementation detail: currently these plugins are instantiated on the
+ * Overseer leader, and closed when the current node loses its leadership.
+ */
+@Retention(RetentionPolicy.RUNTIME)
+public @interface ClusterSingleton {

Review comment:
   I'm not sure Solr core code should be responsible for coordinating plugins 
like this. I feel we should leave this coordination to the plugins. I am NOT 
saying that the autoscaling plugin should do this on its own, but I am just 
saying that "some plugin" should do it for the autoscaling plugin. IOW, this 
coordination code shouldn't be in solr-core.
   As an example, we can have two plugins:
   "autoscaling-framework-plugin" depends on "lifecycle-coordination-plugin". 
The latter can have these concepts defined in them, including the actual 
implementation.

##
File path: solr/core/src/java/org/apache/solr/cluster/events/ScheduledEvent.java
##
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cluster.events;
+
+/**
+ *
+ */
+public interface ScheduledEvent extends ClusterEvent {

Review comment:
   Can you please provide an example of a scheduled event? I am unable to 
see why a scheduled event is needed in Solr core.








[jira] [Commented] (LUCENE-9468) "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_XXXX files

2020-08-18 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179676#comment-17179676
 ] 

Dawid Weiss commented on LUCENE-9468:
-

LUCENE-9465 is already implemented, Mike. 

I don't work (or advocate) for gradle... I use it and I think it's fine most of 
the time. This said, it's like any other software: bugs happen. I am a bit 
disappointed they didn't even respond to that issue where I explicitly pointed 
out where the problem was/is...

> "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_ 
> files
> --
>
> Key: LUCENE-9468
> URL: https://issues.apache.org/jira/browse/LUCENE-9468
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> I have been stupid-beasting ({{while(true) "./gradlew -p lucene test"}}) 
> Lucene core + modules tests, and noticed that I have accumulated many 
> (~10.3K) files like these:
> {noformat}
> -rw-r--r-- 1 mike mike  4954 Aug 17 12:49 
> lucene/core/build/tmp/test/jar_extract_9943749690454255636_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 13:24 
> lucene/core/build/tmp/test/jar_extract_9949237584070142535_tmp
> -rw-r--r-- 1 mike mike 14627 Aug 17 14:01 
> lucene/core/build/tmp/test/jar_extract_9950285037002822552_tmp
> -rw-r--r-- 1 mike mike 15935 Aug 17 15:10 
> lucene/core/build/tmp/test/jar_extract_995066590821695944_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 11:23 
> lucene/core/build/tmp/test/jar_extract_9952865172929838404_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 14:41 
> lucene/core/build/tmp/test/jar_extract_9960016969100835830_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 10:07 
> lucene/core/build/tmp/test/jar_extract_9960479662672452908_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 13:07 
> lucene/core/build/tmp/test/jar_extract_996420631017213954_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 10:30 
> lucene/core/build/tmp/test/jar_extract_9964495910786482810_tmp
> -rw-r--r-- 1 mike mike  8034 Aug 17 17:10 
> lucene/core/build/tmp/test/jar_extract_9965528236930220207_tmp
> -rw-r--r-- 1 mike mike 43565 Aug 17 11:52 
> lucene/core/build/tmp/test/jar_extract_9967892842722777228_tmp
> -rw-r--r-- 1 mike mike 36278 Aug 17 10:36 
> lucene/core/build/tmp/test/jar_extract_996836107828729763_tmp
> -rw-r--r-- 1 mike mike 29997 Aug 17 15:48 
> lucene/core/build/tmp/test/jar_extract_9968527122717193835_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 11:11 
> lucene/core/build/tmp/test/jar_extract_9968609693107939092_tmp
> -rw-r--r-- 1 mike mike  9920 Aug 17 14:31 
> lucene/core/build/tmp/test/jar_extract_9968809316564216653_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 13:22 
> lucene/core/build/tmp/test/jar_extract_9969318805542859308_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 11:44 
> lucene/core/build/tmp/test/jar_extract_9974798403956637924_tmp {noformat}
> It seems to grow by ~64 files (how many cores I configured gradle to use in 
> my {{./gradle.properties}}) on each iteration of beasting.
> I think {{gradle}} should be cleaning this up maybe?






[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-08-18 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179665#comment-17179665
 ] 

Mark Robert Miller commented on SOLR-14636:
---

A smart guy with similar experiences said to me “I think maybe that’s just the 
way these projects go. There is a shelf life, it burns out and then you need a 
new project.”

On average I’m sure he was right, but if I have anything to say about it I’m 
sure he was wrong. Lucene won’t go that way. Nothing says Solr has to. 

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
> Attachments: IMG_5575 (1).jpg, jenkins.png, solr-ref-branch.gif
>
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *speed*: ludicrous
> *tests***:
>  * *core*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *solrj*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *test-framework*: *extremely stable* with {color:#de350b}*ignores*{color}
>  * *contrib/analysis-extras*: *extremely stable* with 
> {color:#de350b}*ignores*{color}
>  * *contrib/analytics*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/clustering*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *contrib/dataimporthandler*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/dataimporthandler-extras*: {color:#00875a}*extremely 
> stable*{color} with *{color:#de350b}ignores{color}*
>  * *contrib/extraction*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/jaegertracer-configurator*: {color:#00875a}*extremely 
> stable*{color} with {color:#de350b}*ignores*{color}
>  * *contrib/langid*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/prometheus-exporter*: {color:#00875a}*extremely stable*{color} 
> with {color:#de350b}*ignores*{color}
>  * *contrib/velocity*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[jira] [Commented] (LUCENE-9468) "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_XXXX files

2020-08-18 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179659#comment-17179659
 ] 

Michael McCandless commented on LUCENE-9468:


Thanks [~dweiss] ... I had ~10K of these buggy gradle tmp files by the time I 
noticed it, but it is easy enough for me to work around.

I do not want to do a {{clean}} on every beasting run, though.  That'd just 
decrease the already poor test efficiency!

So, I just inserted an explicit {{rm -f 
lucene/core/build/tmp/test/jar_extract*}} between beast iterations.

Hmm, OK, I see, it is far more than just these {{jar_extract_*}} files – even 
after removing those, I have ~15K other files, created by our tests I think, 
totaling ~9.6 GB.  OK, I will change my script to {{rm -rf 
lucene/core/build/tmp}} each time.  They truly are temp files right?  I will 
not slow down subsequent iterations by removing them?

Admittedly this is an exotic use case!  And maybe once we have true beasting 
support with gradle (LUCENE-9465) I can switch to that.  Most users would 
probably remember to insert a {{gradle clean}} now and then.  But it is 
disturbing that if you fail to do that, you can lose GBs to these supposedly 
temporary files!
{quote}There is a related bigger issue with gradle workers leaving their stuff 
in user's temp - this is what I really don't like...

This is the issue:
[https://github.com/gradle/gradle/issues/12020]
{quote}
Thanks for finding/sharing that issue.  That is indeed annoying, even moreso 
than this issue!  In fact, I went and ran {{ls /tmp}} on my 128 core beast box. 
 It took ~8 seconds to come back with (get this!) ~675K supposedly temporary 
gradle files!  Sheesh.  I will also fix my beasting script to remove THOSE ones 
too.

I guess proper garbage collection is truly hard.

 

> "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_ 
> files
> --
>
> Key: LUCENE-9468
> URL: https://issues.apache.org/jira/browse/LUCENE-9468
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> I have been stupid-beasting ({{while(true) "./gradlew -p lucene test"}}) 
> Lucene core + modules tests, and noticed that I have accumulated many 
> (~10.3K) files like these:
> {noformat}
> -rw-r--r-- 1 mike mike  4954 Aug 17 12:49 
> lucene/core/build/tmp/test/jar_extract_9943749690454255636_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 13:24 
> lucene/core/build/tmp/test/jar_extract_9949237584070142535_tmp
> -rw-r--r-- 1 mike mike 14627 Aug 17 14:01 
> lucene/core/build/tmp/test/jar_extract_9950285037002822552_tmp
> -rw-r--r-- 1 mike mike 15935 Aug 17 15:10 
> lucene/core/build/tmp/test/jar_extract_995066590821695944_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 11:23 
> lucene/core/build/tmp/test/jar_extract_9952865172929838404_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 14:41 
> lucene/core/build/tmp/test/jar_extract_9960016969100835830_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 10:07 
> lucene/core/build/tmp/test/jar_extract_9960479662672452908_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 13:07 
> lucene/core/build/tmp/test/jar_extract_996420631017213954_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 10:30 
> lucene/core/build/tmp/test/jar_extract_9964495910786482810_tmp
> -rw-r--r-- 1 mike mike  8034 Aug 17 17:10 
> lucene/core/build/tmp/test/jar_extract_9965528236930220207_tmp
> -rw-r--r-- 1 mike mike 43565 Aug 17 11:52 
> lucene/core/build/tmp/test/jar_extract_9967892842722777228_tmp
> -rw-r--r-- 1 mike mike 36278 Aug 17 10:36 
> lucene/core/build/tmp/test/jar_extract_996836107828729763_tmp
> -rw-r--r-- 1 mike mike 29997 Aug 17 15:48 
> lucene/core/build/tmp/test/jar_extract_9968527122717193835_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 11:11 
> lucene/core/build/tmp/test/jar_extract_9968609693107939092_tmp
> -rw-r--r-- 1 mike mike  9920 Aug 17 14:31 
> lucene/core/build/tmp/test/jar_extract_9968809316564216653_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 13:22 
> lucene/core/build/tmp/test/jar_extract_9969318805542859308_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 11:44 
> lucene/core/build/tmp/test/jar_extract_9974798403956637924_tmp {noformat}
> It seems to grow by ~64 files (how many cores I configured gradle to use in 
> my {{./gradle.properties}}) on each iteration of beasting.
> I think {{gradle}} should be cleaning this up maybe?






[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


murblanc commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472239057



##
File path: solr/core/src/java/org/apache/solr/cloud/events/ScheduledEvent.java
##
@@ -0,0 +1,25 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cloud.events;
+
+/**
+ *
+ */
+public interface ScheduledEvent extends ClusterEvent {
+  String getScheduleName();
+  Object getScheduleParam(String key);

Review comment:
   I'm all for these events to be just like the other ones. Just define 
what a schedule is and strongly type whatever can be typed... 
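
   Something along these lines (hypothetical shape, just to make the point):

   ```java
   // Strongly typed schedule info instead of an untyped getScheduleParam(String):
   public interface ScheduledEvent extends ClusterEvent {
     String getScheduleName();
     java.time.Instant getScheduledTime(); // when this firing was due
     java.time.Duration getPeriod();       // interval between firings
   }
   ```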








[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-08-18 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179638#comment-17179638
 ] 

Mark Robert Miller commented on SOLR-14636:
---

And you know, Robert and I never would have put up with this shit for this 
long. But it all fell apart, and while everyone thinks they have no 
responsibility and nothing to do with it, you are all wrong. Stop accepting pmc 
invitations! Stop accepting committer invitations! They come with 
responsibility, and I’d have fired most of you at this point. 

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
> Attachments: IMG_5575 (1).jpg, jenkins.png, solr-ref-branch.gif
>
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *speed*: ludicrous
> *tests***:
>  * *core*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *solrj*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *test-framework*: *extremely stable* with {color:#de350b}*ignores*{color}
>  * *contrib/analysis-extras*: *extremely stable* with 
> {color:#de350b}*ignores*{color}
>  * *contrib/analytics*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/clustering*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *contrib/dataimporthandler*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/dataimporthandler-extras*: {color:#00875a}*extremely 
> stable*{color} with *{color:#de350b}ignores{color}*
>  * *contrib/extraction*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/jaegertracer-configurator*: {color:#00875a}*extremely 
> stable*{color} with {color:#de350b}*ignores*{color}
>  * *contrib/langid*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/prometheus-exporter*: {color:#00875a}*extremely stable*{color} 
> with {color:#de350b}*ignores*{color}
>  * *contrib/velocity*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[jira] [Commented] (SOLR-14662) Elevation with distributed search causes NPE

2020-08-18 Thread Thomas Schmiereck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179636#comment-17179636
 ] 

Thomas Schmiereck commented on SOLR-14662:
--

[~erickerickson] Thanks for the hint. I will keep that in mind in the future.

A hint from me: the bug also appears in 8.6, and I've also tested the fix 
successfully in 8.6.

> Elevation with distributed search causes NPE
> 
>
> Key: SOLR-14662
> URL: https://issues.apache.org/jira/browse/SOLR-14662
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.2
>Reporter: Marc Linden
>Priority: Major
>  Labels: distributed_search, elevation
> Attachments: 
> SOLR-14662-Elevation-with-distributed-search-causes-.patch
>
>
> When performing a distributed search across multiple shards that have elevation 
> configured, where one or more shards have elevated results but others do not, a 
> NullPointerException is thrown.
> We are using Solr 8.2 and have the QueryElevationComponent configured with 
> "last-components" of the default search handler "/select". But the problem 
> also occurs when using the explicit "/elevate" search handler.
> {code:xml}
>   <requestHandler name="/select" class="solr.SearchHandler">
>   ...
>   <arr name="last-components">
> <str>elevator</str>
>   </arr>
>   </requestHandler>
>   ...
>   <searchComponent name="elevator" class="solr.QueryElevationComponent">
>   <str name="queryFieldType">string</str>
>   <str name="config-file">elevate.xml</str>
>   </searchComponent>
> {code}
> h3. Steps to reproduce:
>  (1) Add entries to the elevate.xml of each core to elevate a specific 
> document for the text "elevatedTerm"
> {code:xml}
> core1:
>   <elevate>
> ...
> <query text="elevatedTerm"><doc id="core1docId1"/></query>
>   </elevate>
> core2:
>   <elevate>
> ...
> <query text="elevatedTerm"><doc id="core2docId1"/></query>
>   </elevate>
> {code}
>  (2) Execute query (we use port 9983)
> {noformat}
> http://localhost:9983/solr/core1/select?q=elevatedTerm=false=text_en=edismax=lang:en=localhost:9983/solr/core1,localhost:9983/solr/core2=[elevated],[shard],area,id=10=0
> {noformat}
>  As both shards have elevated documents for the requested "elevatedTerm" the 
> search results are as expected:
> {noformat}
> response: {
>   numFound: 5192,
>   start: 0,
>   maxScore: 1.9032197,
>   docs: [{
> area: "press",
> id: "core1docId1",
> [elevated]: true,
> [shard]: "localhost:9983/solr/core1"
>   }, {
> area: "products",
> id: "core2docId1",
> [elevated]: true,
> [shard]: "localhost:9983/solr/core2"
>   }, {
> area: "press",
> id: "core1docId2",
> [elevated]: false,
> [shard]: "localhost:9983/solr/core1"
>   },
>   ...
> {noformat}
>  (3) Remove the elevation entry for that "elevatedTerm" from one of the 
> cores, e.g. via comment
> {code:xml}
> core2:
>   <elevate>
> ...
> <!-- <query text="elevatedTerm"><doc id="core2docId1"/></query> -->
>   </elevate>
> {code}
> (4) Reload the modified core: 
> [http://localhost:9983/solr/admin/cores?action=RELOAD&core=core2]
> (5) Request same query again and you get the NPE:
> {noformat}
> error: {
>   trace: "java.lang.NullPointerException
>  at 
> org.apache.solr.handler.component.QueryComponent.unmarshalSortValues(QueryComponent.java:1068)
>  at 
> org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:917)
>  at 
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:613)
>  at 
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:592)
>  at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:431)
>  at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
>  at org.apache.solr.core.SolrCore.execute(SolrCore.java:2578)
>  at 
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:780)
>  at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:566)
>  at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:423)
>  at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:350)
>  at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
>  at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
>  ...
> {noformat}
> When adding the {{sort}} parameter with {{forceElevation=true}} to the query, 
> a ClassCastException is thrown:
> {noformat}
> http://localhost:9983/solr/core1/select?q=elevatedTerm=false=text_en=edismax=lang:en=localhost:9983/solr/core1,localhost:9983/solr/core2=[elevated],[shard],area,id=10=0=area%20asc=true{noformat}
> {noformat}
> java.lang.ClassCastException: java.lang.Integer cannot be cast to 
> java.lang.String
>   at 
> org.apache.solr.schema.FieldType.unmarshalStringSortValue(FieldType.java:1229)
>   at org.apache.solr.schema.StrField.unmarshalSortValue(StrField.java:122)
>   at 
> org.apache.solr.handler.component.QueryComponent.unmarshalSortValues(QueryComponent.java:1092)
>   at 
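
A minimal, Solr-free sketch of this failure mode (a hypothetical illustration, 
not the attached patch): the coordinating node assumes every shard response 
carries sort values, but a shard that matched no elevated documents may omit 
them, so the merge step has to treat a missing section as empty.

{code:java}
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: "shard2" (no elevated docs after the
// elevate.xml change) contributes no sort values, so unguarded access
// fails exactly like the NPE above, while a null check does not.
public class MergeSketch {
  public static void main(String[] args) {
    Map<String, List<String>> sortValuesByShard = new HashMap<>();
    sortValuesByShard.put("shard1", Arrays.asList("a", "b"));
    sortValuesByShard.put("shard2", null); // section missing from the response

    for (Map.Entry<String, List<String>> e : sortValuesByShard.entrySet()) {
      List<String> values = e.getValue();
      // values.size() here would throw the NullPointerException above;
      // a defensive merge treats the missing section as empty instead:
      int n = (values == null) ? 0 : values.size();
      System.out.println(e.getKey() + " contributed " + n + " sort values");
    }
  }
}
{code}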

[jira] [Updated] (SOLR-14695) Support loading of unsigned jars

2020-08-18 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14695:
--
Description: 
The Solr distribution can keep a set of sha512 hashes of already-trusted jars. This 
allows first-party jars to be loaded without signing.

The file may look as follows and this is placed at 
{{/server/resources/artifacts.json}}
{code:json}
{
  "file-sha512" : {
"dih-8.6.1.jar" : 
"d01b51de67ae1680a84a813983b1de3b592fc32f1a22b662fc9057da5953abd1b72476388ba342cad21671cd0b805503c78ab9075ff2f3951fdf75fa16981420"
  }
}
{code}
 * If the sha512 of a certain file is trusted, the file does not have to be signed 
with any keys.
 * There is no API to create or modify this. The Solr build scripts create this 
file at build time and add it to the distro.

See the 
[document|https://docs.google.com/document/d/1n7gB2JAdZhlJKFrCd4Txcw4HDkdk7hlULyAZBS-wXrE/edit#]
 for more details

  was:
Solr distribution can keep a set of sha512 hashes of already trusted jars. This 
helps loading first party jars without signing.

The file may look as follows and this is placed at 
{{/filestore/\_trusted_/artifacts.json}}
{code:json}
{
  "file-sha512" : {
"dih-8.6.1.jar" : 
"d01b51de67ae1680a84a813983b1de3b592fc32f1a22b662fc9057da5953abd1b72476388ba342cad21671cd0b805503c78ab9075ff2f3951fdf75fa16981420"
  }
}
{code}
 * if the sha512 of a certain file is trusted, it does not have to be signed 
with any keys.
 * There is no API to create or modify this. The Solr build scripts create this 
file at build time and add this to the distro

see the 
[document|https://docs.google.com/document/d/1n7gB2JAdZhlJKFrCd4Txcw4HDkdk7hlULyAZBS-wXrE/edit#]
 for more details


> Support loading of unsigned jars
> 
>
> Key: SOLR-14695
> URL: https://issues.apache.org/jira/browse/SOLR-14695
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Package Manager, packages
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> The Solr distribution can keep a set of sha512 hashes of already-trusted jars. 
> This allows first-party jars to be loaded without signing.
> The file may look as follows and this is placed at 
> {{/server/resources/artifacts.json}}
> {code:json}
> {
>   "file-sha512" : {
> "dih-8.6.1.jar" : 
> "d01b51de67ae1680a84a813983b1de3b592fc32f1a22b662fc9057da5953abd1b72476388ba342cad21671cd0b805503c78ab9075ff2f3951fdf75fa16981420"
>   }
> }
> {code}
>  * If the sha512 of a certain file is trusted, the file does not have to be signed 
> with any keys.
>  * There is no API to create or modify this. The Solr build scripts create 
> this file at build time and add it to the distro.
> See the 
> [document|https://docs.google.com/document/d/1n7gB2JAdZhlJKFrCd4Txcw4HDkdk7hlULyAZBS-wXrE/edit#]
>  for more details
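
For context, verifying a jar against such an artifacts.json entry amounts to 
computing the file's sha512 and comparing hex digests. A minimal sketch using 
only the JDK (an illustration, not Solr's actual loader code):

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

// Sketch: compute the sha512 hex digest of a jar so it can be compared
// against an entry in artifacts.json. Illustrative only.
public class Sha512Check {
  public static void main(String[] args) throws Exception {
    byte[] bytes = Files.readAllBytes(Path.of(args[0])); // e.g. dih-8.6.1.jar
    byte[] digest = MessageDigest.getInstance("SHA-512").digest(bytes);
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b));
    }
    // The file is trusted iff this string equals the stored hash exactly.
    System.out.println(hex);
  }
}
{code}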



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9468) "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_XXXX files

2020-08-18 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179627#comment-17179627
 ] 

Dawid Weiss commented on LUCENE-9468:
-

This is the issue:
https://github.com/gradle/gradle/issues/12020

> "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_ 
> files
> --
>
> Key: LUCENE-9468
> URL: https://issues.apache.org/jira/browse/LUCENE-9468
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> I have been stupid-beasting ({{while(true) "./gradlew -p lucene test"}}) 
> Lucene core + modules tests, and noticed that I have accumulated many 
> (~10.3K) files like these:
> {noformat}
> -rw-r--r-- 1 mike mike  4954 Aug 17 12:49 
> lucene/core/build/tmp/test/jar_extract_9943749690454255636_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 13:24 
> lucene/core/build/tmp/test/jar_extract_9949237584070142535_tmp
> -rw-r--r-- 1 mike mike 14627 Aug 17 14:01 
> lucene/core/build/tmp/test/jar_extract_9950285037002822552_tmp
> -rw-r--r-- 1 mike mike 15935 Aug 17 15:10 
> lucene/core/build/tmp/test/jar_extract_995066590821695944_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 11:23 
> lucene/core/build/tmp/test/jar_extract_9952865172929838404_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 14:41 
> lucene/core/build/tmp/test/jar_extract_9960016969100835830_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 10:07 
> lucene/core/build/tmp/test/jar_extract_9960479662672452908_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 13:07 
> lucene/core/build/tmp/test/jar_extract_996420631017213954_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 10:30 
> lucene/core/build/tmp/test/jar_extract_9964495910786482810_tmp
> -rw-r--r-- 1 mike mike  8034 Aug 17 17:10 
> lucene/core/build/tmp/test/jar_extract_9965528236930220207_tmp
> -rw-r--r-- 1 mike mike 43565 Aug 17 11:52 
> lucene/core/build/tmp/test/jar_extract_9967892842722777228_tmp
> -rw-r--r-- 1 mike mike 36278 Aug 17 10:36 
> lucene/core/build/tmp/test/jar_extract_996836107828729763_tmp
> -rw-r--r-- 1 mike mike 29997 Aug 17 15:48 
> lucene/core/build/tmp/test/jar_extract_9968527122717193835_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 11:11 
> lucene/core/build/tmp/test/jar_extract_9968609693107939092_tmp
> -rw-r--r-- 1 mike mike  9920 Aug 17 14:31 
> lucene/core/build/tmp/test/jar_extract_9968809316564216653_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 13:22 
> lucene/core/build/tmp/test/jar_extract_9969318805542859308_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 11:44 
> lucene/core/build/tmp/test/jar_extract_9974798403956637924_tmp {noformat}
> It seems to grow by ~64 files (matching how many cores I configured gradle to 
> use in my {{./gradle.properties}}) on each iteration of beasting.
> I think {{gradle}} should be cleaning this up maybe?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9468) "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_XXXX files

2020-08-18 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179624#comment-17179624
 ] 

Dawid Weiss commented on LUCENE-9468:
-

This 'tmp' folder under build holds all sorts of task-specific temporary data. 
We actually use it in renderJavadoc, for example. It is a convenient 
mechanism to provide common "temporary space" for all gradle tasks. Do you care 
much about what's under build/tmp, Mike? These are fairly small files, and 
eventually a clean will remove all of them.

There is a related, bigger issue with gradle workers leaving their stuff in the 
user's temp dir - this is what I really don't like... 

> "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_ 
> files
> --
>
> Key: LUCENE-9468
> URL: https://issues.apache.org/jira/browse/LUCENE-9468
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> I have been stupid-beasting ({{while(true) "./gradlew -p lucene test"}}) 
> Lucene core + modules tests, and noticed that I have accumulated many 
> (~10.3K) files like these:
> {noformat}
> -rw-r--r-- 1 mike mike  4954 Aug 17 12:49 
> lucene/core/build/tmp/test/jar_extract_9943749690454255636_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 13:24 
> lucene/core/build/tmp/test/jar_extract_9949237584070142535_tmp
> -rw-r--r-- 1 mike mike 14627 Aug 17 14:01 
> lucene/core/build/tmp/test/jar_extract_9950285037002822552_tmp
> -rw-r--r-- 1 mike mike 15935 Aug 17 15:10 
> lucene/core/build/tmp/test/jar_extract_995066590821695944_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 11:23 
> lucene/core/build/tmp/test/jar_extract_9952865172929838404_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 14:41 
> lucene/core/build/tmp/test/jar_extract_9960016969100835830_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 10:07 
> lucene/core/build/tmp/test/jar_extract_9960479662672452908_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 13:07 
> lucene/core/build/tmp/test/jar_extract_996420631017213954_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 10:30 
> lucene/core/build/tmp/test/jar_extract_9964495910786482810_tmp
> -rw-r--r-- 1 mike mike  8034 Aug 17 17:10 
> lucene/core/build/tmp/test/jar_extract_9965528236930220207_tmp
> -rw-r--r-- 1 mike mike 43565 Aug 17 11:52 
> lucene/core/build/tmp/test/jar_extract_9967892842722777228_tmp
> -rw-r--r-- 1 mike mike 36278 Aug 17 10:36 
> lucene/core/build/tmp/test/jar_extract_996836107828729763_tmp
> -rw-r--r-- 1 mike mike 29997 Aug 17 15:48 
> lucene/core/build/tmp/test/jar_extract_9968527122717193835_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 11:11 
> lucene/core/build/tmp/test/jar_extract_9968609693107939092_tmp
> -rw-r--r-- 1 mike mike  9920 Aug 17 14:31 
> lucene/core/build/tmp/test/jar_extract_9968809316564216653_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 13:22 
> lucene/core/build/tmp/test/jar_extract_9969318805542859308_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 11:44 
> lucene/core/build/tmp/test/jar_extract_9974798403956637924_tmp {noformat}
> It seems to grow by ~64 files (matching how many cores I configured gradle to 
> use in my {{./gradle.properties}}) on each iteration of beasting.
> I think {{gradle}} should be cleaning this up maybe?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9468) "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_XXXX files

2020-08-18 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179598#comment-17179598
 ] 

Michael McCandless commented on LUCENE-9468:


OK, I did not realize this is a {{gradle}} bug, sigh.

Yeah, we should not try to work around it. I can easily rm -rf these files in 
between beast runs.

I'll try to open an issue upstream for this then. Thanks for looking, [~dweiss].
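
For anyone doing the same, a small sketch of that cleanup between beast runs 
(equivalent to the rm -rf above; the path matches the listing in this issue):

{code:java}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Delete leftover jar_extract_*_tmp files between beast runs.
public class CleanJarExtracts {
  public static void main(String[] args) throws IOException {
    Path dir = Path.of("lucene/core/build/tmp/test");
    if (!Files.isDirectory(dir)) {
      return; // nothing to clean
    }
    try (DirectoryStream<Path> files =
             Files.newDirectoryStream(dir, "jar_extract_*_tmp")) {
      for (Path f : files) {
        Files.delete(f);
      }
    }
  }
}
{code}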

> "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_ 
> files
> --
>
> Key: LUCENE-9468
> URL: https://issues.apache.org/jira/browse/LUCENE-9468
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> I have been stupid-beasting ({{while(true) "./gradlew -p lucene test"}}) 
> Lucene core + modules tests, and noticed that I have accumulated many 
> (~10.3K) files like these:
> {noformat}
> -rw-r--r-- 1 mike mike  4954 Aug 17 12:49 
> lucene/core/build/tmp/test/jar_extract_9943749690454255636_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 13:24 
> lucene/core/build/tmp/test/jar_extract_9949237584070142535_tmp
> -rw-r--r-- 1 mike mike 14627 Aug 17 14:01 
> lucene/core/build/tmp/test/jar_extract_9950285037002822552_tmp
> -rw-r--r-- 1 mike mike 15935 Aug 17 15:10 
> lucene/core/build/tmp/test/jar_extract_995066590821695944_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 11:23 
> lucene/core/build/tmp/test/jar_extract_9952865172929838404_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 14:41 
> lucene/core/build/tmp/test/jar_extract_9960016969100835830_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 10:07 
> lucene/core/build/tmp/test/jar_extract_9960479662672452908_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 13:07 
> lucene/core/build/tmp/test/jar_extract_996420631017213954_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 10:30 
> lucene/core/build/tmp/test/jar_extract_9964495910786482810_tmp
> -rw-r--r-- 1 mike mike  8034 Aug 17 17:10 
> lucene/core/build/tmp/test/jar_extract_9965528236930220207_tmp
> -rw-r--r-- 1 mike mike 43565 Aug 17 11:52 
> lucene/core/build/tmp/test/jar_extract_9967892842722777228_tmp
> -rw-r--r-- 1 mike mike 36278 Aug 17 10:36 
> lucene/core/build/tmp/test/jar_extract_996836107828729763_tmp
> -rw-r--r-- 1 mike mike 29997 Aug 17 15:48 
> lucene/core/build/tmp/test/jar_extract_9968527122717193835_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 11:11 
> lucene/core/build/tmp/test/jar_extract_9968609693107939092_tmp
> -rw-r--r-- 1 mike mike  9920 Aug 17 14:31 
> lucene/core/build/tmp/test/jar_extract_9968809316564216653_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 13:22 
> lucene/core/build/tmp/test/jar_extract_9969318805542859308_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 11:44 
> lucene/core/build/tmp/test/jar_extract_9974798403956637924_tmp {noformat}
> It seems to grow by ~64 files (matching how many cores I configured gradle to 
> use in my {{./gradle.properties}}) on each iteration of beasting.
> I think {{gradle}} should be cleaning this up maybe?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-9468) "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_XXXX files

2020-08-18 Thread Michael McCandless (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-9468.

Resolution: Not A Bug

> "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_ 
> files
> --
>
> Key: LUCENE-9468
> URL: https://issues.apache.org/jira/browse/LUCENE-9468
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> I have been stupid-beasting ({{while(true) "./gradlew -p lucene test"}}) 
> Lucene core + modules tests, and noticed that I have accumulated many 
> (~10.3K) files like these:
> {noformat}
> -rw-r--r-- 1 mike mike  4954 Aug 17 12:49 
> lucene/core/build/tmp/test/jar_extract_9943749690454255636_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 13:24 
> lucene/core/build/tmp/test/jar_extract_9949237584070142535_tmp
> -rw-r--r-- 1 mike mike 14627 Aug 17 14:01 
> lucene/core/build/tmp/test/jar_extract_9950285037002822552_tmp
> -rw-r--r-- 1 mike mike 15935 Aug 17 15:10 
> lucene/core/build/tmp/test/jar_extract_995066590821695944_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 11:23 
> lucene/core/build/tmp/test/jar_extract_9952865172929838404_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 14:41 
> lucene/core/build/tmp/test/jar_extract_9960016969100835830_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 10:07 
> lucene/core/build/tmp/test/jar_extract_9960479662672452908_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 13:07 
> lucene/core/build/tmp/test/jar_extract_996420631017213954_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 10:30 
> lucene/core/build/tmp/test/jar_extract_9964495910786482810_tmp
> -rw-r--r-- 1 mike mike  8034 Aug 17 17:10 
> lucene/core/build/tmp/test/jar_extract_9965528236930220207_tmp
> -rw-r--r-- 1 mike mike 43565 Aug 17 11:52 
> lucene/core/build/tmp/test/jar_extract_9967892842722777228_tmp
> -rw-r--r-- 1 mike mike 36278 Aug 17 10:36 
> lucene/core/build/tmp/test/jar_extract_996836107828729763_tmp
> -rw-r--r-- 1 mike mike 29997 Aug 17 15:48 
> lucene/core/build/tmp/test/jar_extract_9968527122717193835_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 11:11 
> lucene/core/build/tmp/test/jar_extract_9968609693107939092_tmp
> -rw-r--r-- 1 mike mike  9920 Aug 17 14:31 
> lucene/core/build/tmp/test/jar_extract_9968809316564216653_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 13:22 
> lucene/core/build/tmp/test/jar_extract_9969318805542859308_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 11:44 
> lucene/core/build/tmp/test/jar_extract_9974798403956637924_tmp {noformat}
> It seems to grow by ~64 files (matching how many cores I configured gradle to 
> use in my {{./gradle.properties}}) on each iteration of beasting.
> I think {{gradle}} should be cleaning this up maybe?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


mikemccand commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r472162610



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -607,6 +633,57 @@ DirectoryReader getReader(boolean applyAllDeletes, boolean 
writeAllDeletes) thro
   }
 }
   }
+  if (onCommitMerges != null) { // only relevant if we do merge on 
getReader
+boolean replaceReaderSuccess = false;
+try {
+  mergeScheduler.merge(mergeSource, MergeTrigger.GET_READER);
+  onCommitMerges.await(maxCommitMergeWaitMillis, 
TimeUnit.MILLISECONDS);

Review comment:
   OK let's leave this be for now.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


mikemccand commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r472154682



##
File path: lucene/core/src/java/org/apache/lucene/index/ReaderPool.java
##
@@ -404,7 +404,7 @@ private PendingDeletes newPendingDeletes(SegmentReader 
reader, SegmentCommitInfo
   private boolean noDups() {
 Set<String> seen = new HashSet<>();
 for(SegmentCommitInfo info : readerMap.keySet()) {
-  assert !seen.contains(info.info.name);
+  assert !seen.contains(info.info.name) : "seen twice: " + info.info.name ;

Review comment:
   I love seeing diffs like this one, adding a `String` message to an 
otherwise cryptic `assert`!  It makes me realize you must have had a hellacious 
debugging session!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


mikemccand commented on pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#issuecomment-67545


   Those were the only two failures found after 1033 test iterations!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


mikemccand commented on pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#issuecomment-675459732


   Another failure that reproduces only on the PR:
   
   ```
   [junit4:pickseed] Seed property 'tests.seed' already defined: 
CADC6D7945159855
  [junit4]  says jolly good day! Master seed: CADC6D7945159855
  [junit4] Executing 1 suite with 1 JVM.
  [junit4]
  [junit4] Started J0 PID(1047056@localhost).
  [junit4] Suite: org.apache.lucene.spatial.prefix.NumberRangeFacetsTest
  [junit4] OK  0.37s | NumberRangeFacetsTest.test 
{seed=[CADC6D7945159855:428852A3EBE9F5AD]}
  [junit4] OK  0.07s | NumberRangeFacetsTest.test 
{seed=[CADC6D7945159855:F6DEEE5FDF2B3E81]}
  [junit4] OK  0.02s | NumberRangeFacetsTest.test 
{seed=[CADC6D7945159855:783778838EEF764A]}
  [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=NumberRangeFacetsTest -Dtests.method=test 
-Dtests.seed=CADC6D7945159855 -Dtests.slow=true -Dtests.badapples=true 
-Dtests.locale=lu-CD -Dtests.\
   timezone=America/Iqaluit -Dtests.asserts=true -Dtests.file.encoding=UTF-8
  [junit4] ERROR   0.10s | NumberRangeFacetsTest.test 
{seed=[CADC6D7945159855:49D9D366E2112D63]} <<<
  [junit4]> Throwable #1: java.nio.file.NoSuchFileException: _3.fdx
  [junit4]>at 
org.apache.lucene.store.ByteBuffersDirectory.deleteFile(ByteBuffersDirectory.java:148)
  [junit4]>at 
org.apache.lucene.store.MockDirectoryWrapper.deleteFile(MockDirectoryWrapper.java:607)
  [junit4]>at 
org.apache.lucene.store.LockValidatingDirectoryWrapper.deleteFile(LockValidatingDirectoryWrapper.java:38)
  [junit4]>at 
org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:696)
  [junit4]>at 
org.apache.lucene.index.IndexFileDeleter.deleteFiles(IndexFileDeleter.java:690)
  [junit4]>at 
org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:589)
  [junit4]>at 
org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:620)
  [junit4]>at 
org.apache.lucene.index.IndexWriter.decRefDeleter(IndexWriter.java:5354)
  [junit4]>at 
org.apache.lucene.index.StandardDirectoryReader.lambda$doClose$1(StandardDirectoryReader.java:370)
  [junit4]>at 
org.apache.lucene.index.StandardDirectoryReader.doClose(StandardDirectoryReader.java:384)
  [junit4]>at 
org.apache.lucene.index.IndexReader.decRef(IndexReader.java:244)
  [junit4]>at 
org.apache.lucene.index.IndexReader.close(IndexReader.java:385)
  [junit4]>at 
org.apache.lucene.util.IOUtils.close(IOUtils.java:89)
  [junit4]>at 
org.apache.lucene.util.IOUtils.close(IOUtils.java:77)
  [junit4]>at 
org.apache.lucene.spatial.SpatialTestCase.commit(SpatialTestCase.java:97)
  [junit4]>at 
org.apache.lucene.spatial.prefix.NumberRangeFacetsTest.test(NumberRangeFacetsTest.java:92)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
  [junit4]>at 
java.base/java.lang.Thread.run(Thread.java:834)Throwable #2: 
java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 
2 open files: {_5.cfs=1, _4.cfs=1}
  [junit4]>at 
org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:812)
  [junit4]>at 
org.apache.lucene.util.IOUtils.close(IOUtils.java:89)
  [junit4]>at 
org.apache.lucene.util.IOUtils.close(IOUtils.java:77)
  [junit4]>at 
org.apache.lucene.spatial.SpatialTestCase.tearDown(SpatialTestCase.java:72)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
  [junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
  [junit4]> Caused by: java.lang.RuntimeException: unclosed IndexInput: 
_4.cfs
  [junit4]>at 
org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:730)
  [junit4]>at 
org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:773)
  [junit4]>  

[GitHub] [lucene-solr] mikemccand commented on pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


mikemccand commented on pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#issuecomment-675458466


   Beasting ran all night (1031 iterations) and uncovered this failure, which 
reproduces with this PR but not on mainline:
   
   ```
   [mkdir] Created dir: /l/simon/lucene/build/core/test
   [junit4:pickseed] Seed property 'tests.seed' already defined: 
313A1CF00C235D4F
   [mkdir] Created dir: /l/simon/lucene/build/core/test/temp
   [mkdir] Created dir: /l/simon/.caches/test-stats/core
  [junit4]  says hi! Master seed: 313A1CF00C235D4F
  [junit4] Executing 1 suite with 1 JVM.
  [junit4]
  [junit4] Started J0 PID(972126@localhost).
  [junit4] Suite: 
org.apache.lucene.codecs.perfield.TestPerFieldDocValuesFormat
  [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestPerFieldDocValuesFormat 
-Dtests.method=testSparseBinaryVariableLengthVsStoredFields 
-Dtests.seed=313A1CF00C235D4F -Dtests.slow=true -Dtest\
   s.badapples=true -Dtests.locale=ur -Dtests.timezone=Mexico/BajaNorte 
-Dtests.asserts=true -Dtests.file.encoding=UTF-8
  [junit4] ERROR   0.43s | 
TestPerFieldDocValuesFormat.testSparseBinaryVariableLengthVsStoredFields <<<
  [junit4]> Throwable #1: java.nio.file.NoSuchFileException: _j.si
  [junit4]>at 
__randomizedtesting.SeedInfo.seed([313A1CF00C235D4F:5C414B294A4A9C4D]:0)
  [junit4]>at 
org.apache.lucene.store.ByteBuffersDirectory.deleteFile(ByteBuffersDirectory.java:148)
  [junit4]>at 
org.apache.lucene.store.MockDirectoryWrapper.deleteFile(MockDirectoryWrapper.java:607)
  [junit4]>at 
org.apache.lucene.store.LockValidatingDirectoryWrapper.deleteFile(LockValidatingDirectoryWrapper.java:38)
  [junit4]>at 
org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:696)
  [junit4]>at 
org.apache.lucene.index.IndexFileDeleter.deleteFiles(IndexFileDeleter.java:690)
  [junit4]>at 
org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:589)
  [junit4]>at 
org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:620)
  [junit4]>at 
org.apache.lucene.index.IndexWriter.decRefDeleter(IndexWriter.java:5354)
  [junit4]>at 
org.apache.lucene.index.StandardDirectoryReader.lambda$doClose$1(StandardDirectoryReader.java:370)
  [junit4]>at 
org.apache.lucene.index.StandardDirectoryReader.doClose(StandardDirectoryReader.java:384)
  [junit4]>at 
org.apache.lucene.index.IndexReader.decRef(IndexReader.java:244)
  [junit4]>at 
org.apache.lucene.index.IndexReader.close(IndexReader.java:385)
  [junit4]>at 
org.apache.lucene.index.BaseDocValuesFormatTestCase.doTestBinaryVsStoredFields(BaseDocValuesFormatTestCase.java:1527)
  [junit4]>at 
org.apache.lucene.index.BaseDocValuesFormatTestCase.doTestBinaryVariableLengthVsStoredFields(BaseDocValuesFormatTestCase.java:1585)
  [junit4]>at 
org.apache.lucene.index.BaseDocValuesFormatTestCase.testSparseBinaryVariableLengthVsStoredFields(BaseDocValuesFormatTestCase.java:1579)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
  [junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
  [junit4]   2> NOTE: test params are: codec=Asserting(Lucene86): {}, 
docValues:{}, maxPointsInLeafNode=1657, maxMBSortInHeap=6.829528977246159, 
sim=Asserting(RandomSimilarity(queryNorm=true): {}), loc\
   ale=ur, timezone=Mexico/BajaNorte
  [junit4]   2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 
(64-bit)/cpus=128,threads=1,free=516818968,total=536870912
  [junit4]   2> NOTE: All tests run in this JVM: 
[TestPerFieldDocValuesFormat]
  [junit4] Completed [1/1 (1!)] in 0.62s, 1 test, 1 error <<< FAILURES!
  [junit4]
  [junit4]
  [junit4] Tests with failures [seed: 313A1CF00C235D4F]:
  [junit4]   - 
org.apache.lucene.codecs.perfield.TestPerFieldDocValuesFormat.testSparseBinaryVariableLengthVsStoredFields
  [junit4]
  [junit4]
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: 

[GitHub] [lucene-solr] atris commented on a change in pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker

2020-08-18 Thread GitBox


atris commented on a change in pull request #1737:
URL: https://github.com/apache/lucene-solr/pull/1737#discussion_r472104145



##
File path: 
solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java
##
@@ -76,6 +79,10 @@ public boolean isTripped() {
   return false;
 }
 
+if (!enabled) {
+  return false;
+}
+
 long localAllowedMemory = getCurrentMemoryThreshold();

Review comment:
   So we do not need to cache here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] markharwood commented on pull request #1708: LUCENE-9445 Add support for case insensitive regex searches in QueryParser

2020-08-18 Thread GitBox


markharwood commented on pull request #1708:
URL: https://github.com/apache/lucene-solr/pull/1708#issuecomment-675420267


   OK. @romseygeek suggested the BWC flag be called "allow_modifiers" and, if 
false, legacy behaviour is used, i.e. there would be no errors for characters 
after the trailing `/`, and `/regex/i` would still be interpreted as `/regex/` 
and the term `i`.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on pull request #1708: LUCENE-9445 Add support for case insensitive regex searches in QueryParser

2020-08-18 Thread GitBox


jimczi commented on pull request #1708:
URL: https://github.com/apache/lucene-solr/pull/1708#issuecomment-675416033


   > Should /regex/iterm throw an error or be interpreted as /regex/ and iterm?
   
   I'd prefer that we throw an error if any character attached to a regexp is 
not recognized.
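
   A tiny sketch of that stricter behaviour (a hypothetical helper, not the 
PR's code): accept only known modifier letters after the closing `/` and fail 
fast on anything else, so `/regex/iterm` errors out instead of silently 
becoming `/regex/` plus the term `iterm`.

   ```java
   import java.util.HashSet;
   import java.util.Set;

   public class RegexModifiers {
     // Hypothetical helper: parse the characters after the closing '/'.
     static Set<Character> parse(String afterSlash) {
       Set<Character> mods = new HashSet<>();
       for (char c : afterSlash.toCharArray()) {
         if (c == 'i') {        // case-insensitive, the only modifier so far
           mods.add(c);
         } else {
           throw new IllegalArgumentException("Unknown regex modifier: " + c);
         }
       }
       return mods;
     }

     public static void main(String[] args) {
       System.out.println(parse("i"));     // [i]
       System.out.println(parse("iterm")); // throws instead of splitting
     }
   }
   ```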



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] chatman commented on pull request #1755: SOLR-14753: Improve thread annotation name

2020-08-18 Thread GitBox


chatman commented on pull request #1755:
URL: https://github.com/apache/lucene-solr/pull/1755#issuecomment-675411238


   Thanks for the PR, Marcus. I committed this change as part of SOLR-14731. 
Can you please close both PRs? (As usual, I forgot to do it via the commit 
message, and my GitHub credentials are still not in order, sorry.)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14753) More Consistent Thread Safety Annotations.

2020-08-18 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-14753.
-
Fix Version/s: 8.7
   Resolution: Fixed

I committed this as part of SOLR-14731.

> More Consistent Thread Safety Annotations.
> --
>
> Key: SOLR-14753
> URL: https://issues.apache.org/jira/browse/SOLR-14753
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: master (9.0)
>Reporter: Marcus Eagan
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This is a simple but important step to adopt more consistent thread safety 
> annotations, originally created by Mark Miller and added into Solr by Anshum 
> from an internal Apple fork.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14731) Make use of @SolrSingleThreaded Implementation

2020-08-18 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-14731.
-
Resolution: Fixed

Thanks [~marcussorealheis]. As [~mdrob] said, Rome wasn't built in a day. I 
think we should leverage this annotation wherever we see the opportunity, for 
the benefit of new developers reading our code.

> Make use of @SolrSingleThreaded Implementation 
> ---
>
> Key: SOLR-14731
> URL: https://issues.apache.org/jira/browse/SOLR-14731
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
> Environment: {color:red}colored text{color}
>Reporter: Marcus Eagan
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This change may be viewed as minor today, but making a habit of this annotation 
> should prove very beneficial in the long run, when I forget things that I 
> worked on 3 years ago, 3 years from now.
> This is my first attempt to leverage [~anshum]'s work from: 
> https://issues.apache.org/jira/browse/SOLR-13998
> [~anshum] please let me know if I am screwing something up! :) and thanks for 
> adding this a while back.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14731) Make use of @SolrSingleThreaded Implementation

2020-08-18 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-14731:

Fix Version/s: 8.7

> Make use of @SolrSingleThreaded Implementation 
> ---
>
> Key: SOLR-14731
> URL: https://issues.apache.org/jira/browse/SOLR-14731
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
> Environment: {color:red}colored text{color}
>Reporter: Marcus Eagan
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This change may be viewed as minor today, but making a habit of this annotation 
> should prove very beneficial in the long run, when I forget things that I 
> worked on 3 years ago, 3 years from now.
> This is my first attempt to leverage [~anshum]'s work from: 
> https://issues.apache.org/jira/browse/SOLR-13998
> [~anshum] please let me know if I am screwing something up! :) and thanks for 
> adding this a while back.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14731) Make use of @SolrSingleThreaded Implementation

2020-08-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179544#comment-17179544
 ] 

ASF subversion and git services commented on SOLR-14731:


Commit dc37f02980857ed6a75efd60674992c22c012499 in lucene-solr's branch 
refs/heads/branch_8x from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dc37f02 ]

SOLR-14731: Rename @SolrSingleThreaded to @SolrThreadUnsafe, mark 
DistribPackageStore with the annotation

Co-authored-by: Marcus 


> Make use of @SolrSingleThreaded Implementation 
> ---
>
> Key: SOLR-14731
> URL: https://issues.apache.org/jira/browse/SOLR-14731
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
> Environment: {color:red}colored text{color}
>Reporter: Marcus Eagan
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This change may be viewed as minor today, but making a habit of this annotation 
> should prove very beneficial in the long run, when I forget things that I 
> worked on 3 years ago, 3 years from now.
> This is my first attempt to leverage [~anshum]'s work from: 
> https://issues.apache.org/jira/browse/SOLR-13998
> [~anshum] please let me know if I am screwing something up! :) and thanks for 
> adding this a while back.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14731) Make use of @SolrSingleThreaded Implementation

2020-08-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179543#comment-17179543
 ] 

ASF subversion and git services commented on SOLR-14731:


Commit 77a4d495cc553ec80001346376fd87d6b73a6059 in lucene-solr's branch 
refs/heads/master from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=77a4d49 ]

SOLR-14731: Rename @SolrSingleThreaded to @SolrThreadUnsafe, mark 
DistribPackageStore with the annotation

Co-authored-by: Marcus 


> Make use of @SolrSingleThreaded Implementation 
> ---
>
> Key: SOLR-14731
> URL: https://issues.apache.org/jira/browse/SOLR-14731
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
> Environment: {color:red}colored text{color}
>Reporter: Marcus Eagan
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This change may be viewed as minor today, but making a habit of this annotation 
> should prove very beneficial in the long run, when I forget things that I 
> worked on 3 years ago, 3 years from now.
> This is my first attempt to leverage [~anshum]'s work from: 
> https://issues.apache.org/jira/browse/SOLR-13998
> [~anshum] please let me know if I am screwing something up! :) and thanks for 
> adding this a while back.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] chatman commented on pull request #1744: SOLR-14731: Add SingleThreaded Annotation to Class

2020-08-18 Thread GitBox


chatman commented on pull request #1744:
URL: https://github.com/apache/lucene-solr/pull/1744#issuecomment-675403326


   Looks good. I'll merge this soon. Thanks @MarcusSorealheis.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker

2020-08-18 Thread GitBox


sigram commented on a change in pull request #1737:
URL: https://github.com/apache/lucene-solr/pull/1737#discussion_r472060120



##
File path: solr/core/src/java/org/apache/solr/core/SolrConfig.java
##
@@ -229,9 +229,13 @@ private SolrConfig(SolrResourceLoader loader, String name, 
boolean isConfigsetTr
 enableLazyFieldLoading = getBool("query/enableLazyFieldLoading", false);
 
 useCircuitBreakers = getBool("circuitBreaker/useCircuitBreakers", false);
+cpuCircuitBreakerEnabled = 
getBool("circuitBreaker/cpuCircuitBreakerEnabled", false);

Review comment:
   Can we please simplify these names? They are awfully verbose, repeating 
parts that are already unique and obvious.

##
File path: 
solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java
##
@@ -88,13 +95,19 @@ public boolean isTripped() {
 
   @Override
   public String getDebugInfo() {
-if (seenMemory.get() == 0L || allowedMemory.get() == 0L) {
+if (seenMemory.get() == 0.0 || allowedMemory.get() == 0.0) {
   log.warn("MemoryCircuitBreaker's monitored values (seenMemory, 
allowedMemory) not set");
 }
 
 return "seenMemory=" + seenMemory.get() + " allowedMemory=" + 
allowedMemory.get();
   }
 
+  @Override
+  public String getErrorMessage() {
+return "Memory Circuit Breaker Triggered. Seen JVM heap memory usage " + 
seenMemory.get() + " and allocated threshold " +

Review comment:
   Similarly, "greater than allocated threshold" ?

##
File path: 
solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.management.ManagementFactory;
+import java.lang.management.OperatingSystemMXBean;
+
+import org.apache.solr.core.SolrConfig;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * 
+ * Tracks current CPU usage and triggers if the specified threshold is 
breached.
+ *
+ * This circuit breaker gets the average CPU load over the last minute and uses
+ * that data to take a decision. Ideally, we should be able to cache the value
+ * locally and only query once the minute has elapsed. However, that will 
introduce
+ * more complexity than the current structure and might not get us major 
performance
+ * wins. If this ever becomes a performance bottleneck, that can be considered.

Review comment:
   Uh ... I see there's still some misunderstanding about this. The call 
itself is directly passed to the native method that invokes stdlib 
`getloadavg`, which in turn reads these values from the /proc pseudo-fs. So, 
the cost is truly minimal and the call doesn't block - if it turns out that 
it's still too costly to call for every request then we can introduce some 
timeout-based caching.
   
   These averages are so-called exponentially weighted moving averages, so 
indeed a 1-min average has traces of past load values from outside the 1-min 
window, which helps in smoothing it. This may turn out to be sufficient to 
avoid false positives due to short-term spikes (such as large merges). Linux 
loadavg represents, to some degree, a combined CPU + disk IO load, so indeed 
intensive IO operations will affect it.
   
   We always have an option to use Codahale Meter to easily calculate 5- and 
15-min EWMAs if it turns out that we're getting too many false positives. Until 
then users can configure higher thresholds, thus reducing the number of false 
positives at the cost of higher contention.
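
   For reference, a minimal sketch of the check being discussed (not the PR's 
exact code): read the 1-minute system load average via the standard JMX bean 
and trip above a configured threshold. On Linux this ultimately reads from the 
/proc pseudo-fs, so the call is cheap; it returns -1.0 on platforms where the 
value is unavailable.

   ```java
   import java.lang.management.ManagementFactory;
   import java.lang.management.OperatingSystemMXBean;

   public class LoadAverageCheck {
     private static final OperatingSystemMXBean OS =
         ManagementFactory.getOperatingSystemMXBean();

     // Trip when the 1-minute load average exceeds the threshold;
     // a negative reading means the platform does not support it.
     static boolean isTripped(double threshold) {
       double oneMinuteLoad = OS.getSystemLoadAverage();
       return oneMinuteLoad >= 0 && oneMinuteLoad > threshold;
     }

     public static void main(String[] args) {
       double threshold = 0.95 * Runtime.getRuntime().availableProcessors();
       System.out.println(isTripped(threshold));
     }
   }
   ```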

##
File path: solr/solr-ref-guide/src/circuit-breakers.adoc
##
@@ -35,33 +35,68 @@ will be disabled globally. Per circuit breaker 
configurations are specified in t
 false
 
 
+This flag acts as the highest authority and global controller of circuit 
breakers. For using specific circuit breakers, each one
+needs to be individually enabled in addition to this flag being enabled.

Review comment:
   If we change the format of the config, this section needs to be updated 
too.

##
File path: 

[GitHub] [lucene-solr] markharwood commented on pull request #1708: LUCENE-9445 Add support for case insensitive regex searches in QueryParser

2020-08-18 Thread GitBox


markharwood commented on pull request #1708:
URL: https://github.com/apache/lucene-solr/pull/1708#issuecomment-675386274


   >I wonder if the parsing should be more strict and only matches if there is 
a separator or ends after the i ?
   
   @jimczi What is the behaviour for a non-match?
   Should `/regex/iterm` throw an error or be interpreted as `/regex/` and 
`iterm`?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


sigram commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472058053



##
File path: solr/core/src/java/org/apache/solr/cloud/events/ScheduledEvent.java
##
@@ -0,0 +1,25 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cloud.events;
+
+/**
+ *
+ */
+public interface ScheduledEvent extends ClusterEvent {
+  String getScheduleName();
+  Object getScheduleParam(String key);

Review comment:
   Thinking more about it, we should use a strongly-typed Schedule.
   
   A scheduled event is something that Solr generates on a predefined schedule ;) 
It's basically a general-purpose scheduler that is available via this API. 
There's a need for some periodic maintenance tasks, and currently this is 
implemented as part of the autoscaling triggers (ScheduledTrigger, which in 
turn invokes e.g. inactive shard cleanup and collection repair tasks).
   
   I'm open to other suggestions - we could make it a separate API, but 
modelling it as a cluster event has benefits too (a uniform way to register for 
and process events, and it avoids adding yet another API).
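
   One possible shape of that strongly-typed Schedule (a sketch of the idea 
only; all names here are hypothetical, not from the PR):

   ```java
   import java.time.Instant;

   // Stand-in for the PR's base event interface, to keep the sketch compilable.
   interface ClusterEvent {}

   // Hypothetical strongly-typed schedule instead of name + untyped params.
   interface Schedule {
     String getName();
     String getCronExpression();          // or a fixed interval, etc.
     Instant getNextRunTime(Instant now);
   }

   interface ScheduledEvent extends ClusterEvent {
     Schedule getSchedule();
   }
   ```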





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-08-18 Thread GitBox


sigram commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r472053144



##
File path: solr/core/src/java/org/apache/solr/cloud/events/NodeDownEvent.java
##
@@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cloud.events;
+
+/**
+ *
+ */
+public interface NodeDownEvent extends ClusterEvent {

Review comment:
   +1.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14756) Sync of replicas are failing for two collection among 3

2020-08-18 Thread neha kumari (Jira)
neha kumari created SOLR-14756:
--

 Summary: Sync of replicas are failing for two collection among 3
 Key: SOLR-14756
 URL: https://issues.apache.org/jira/browse/SOLR-14756
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: neha kumari


We have 3 collections (collection_document, collection_product, collection_pages) 
on one Linux server; each has 1 shard and 3 replicas.

Two replicas (for collection_document and collection_pages) are showing as 
"recovery failed".

We tried to stop and start Solr and ZooKeeper, but that doesn't help. Some logs 
say PeerSync failed.

A sample log entry:

RecoveryStrategy Error while trying to 
recover:org.apache.solr.common.SolrException: Replication for recovery failed. 
at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:222) 
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:471) 
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:284) at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] s1monw commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


s1monw commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r471991665



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -607,6 +633,57 @@ DirectoryReader getReader(boolean applyAllDeletes, boolean 
writeAllDeletes) thro
   }
 }
   }
+  if (onCommitMerges != null) { // only relevant if we do merge on 
getReader
+boolean replaceReaderSuccess = false;
+try {
+  mergeScheduler.merge(mergeSource, MergeTrigger.GET_READER);
+  onCommitMerges.await(maxCommitMergeWaitMillis, 
TimeUnit.MILLISECONDS);

Review comment:
   I'd not want to add another option to IWC unless absolutely necessary. 
Maybe we can just keep one for now, and if somebody has a good use case we can 
still add one? I think we have the ability to disable it entirely for one or 
the other trigger, which should be enough in most cases.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9465) Add functionality to do test re-runs (suite duplication, so-called 'beasting')

2020-08-18 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179423#comment-17179423
 ] 

Dawid Weiss commented on LUCENE-9465:
-

I allowed myself to add it in since there's been no feedback. I think it's 
better than nothing, and it is nicely decoupled from anything else. If somebody 
has an idea for improving parallelism (maybe ask the gradle folks), it'd be 
great to know how to achieve it (passing an entire task to the gradle worker API).

> Add functionality to do test re-runs (suite duplication, so-called 'beasting')
> --
>
> Key: LUCENE-9465
> URL: https://issues.apache.org/jira/browse/LUCENE-9465
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9465.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9465) Add functionality to do test re-runs (suite duplication, so-called 'beasting')

2020-08-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179422#comment-17179422
 ] 

ASF subversion and git services commented on LUCENE-9465:
-

Commit 83ed210fd066914c7622da58664ae5e89cce3900 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=83ed210 ]

LUCENE-9465: 'beast' task from within gradle (#1757)



> Add functionality to do test re-runs (suite duplication, so-called 'beasting')
> --
>
> Key: LUCENE-9465
> URL: https://issues.apache.org/jira/browse/LUCENE-9465
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Major
> Attachments: LUCENE-9465.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-9465) Add functionality to do test re-runs (suite duplication, so-called 'beasting')

2020-08-18 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-9465.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

> Add functionality to do test re-runs (suite duplication, so-called 'beasting')
> --
>
> Key: LUCENE-9465
> URL: https://issues.apache.org/jira/browse/LUCENE-9465
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9465.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss merged pull request #1757: LUCENE-9465: 'beast' task from within gradle

2020-08-18 Thread GitBox


dweiss merged pull request #1757:
URL: https://github.com/apache/lucene-solr/pull/1757


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Reopened] (SOLR-14151) Make schema components load from packages

2020-08-18 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul reopened SOLR-14151:
---

> Make schema components load from packages
> -
>
> Key: SOLR-14151
> URL: https://issues.apache.org/jira/browse/SOLR-14151
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: packagemanager
> Fix For: 8.7
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Example:
> {code:xml}
> <fieldType name="myTextField" class="solr.TextField">
>   <analyzer>
>     <!-- element and class names are illustrative -->
>     <tokenizer class="mypkg:my.pkg.MyTokenizerFactory"/>
>     <filter class="mypkg:my.pkg.MyFilterFactory"
>             generateNumberParts="0" catenateWords="0"
>             catenateNumbers="0" catenateAll="0"/>
>   </analyzer>
> </fieldType>
> {code}
> * When a package is updated, the entire {{IndexSchema}} object is refreshed, 
> but the SolrCore object is not reloaded
> * Any component can be prefixed with the package name
> * The semantics of loading plugins remain the same as that of the components 
> in {{solrconfig.xml}}
> * Plugins can be registered using schema API



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] noblepaul opened a new pull request #1760: SOLR-14750: TestBulkSchemaConcurrent passes but schema plugins fail

2020-08-18 Thread GitBox


noblepaul opened a new pull request #1760:
URL: https://github.com/apache/lucene-solr/pull/1760


   I've isolated the changes that caused the `TestBulkSchemaConcurrent` schema tests to fail.
   
   This bug was introduced by [SOLR-14151](https://issues.apache.org/jira/browse/SOLR-14680).
   
   So, in this PR, I have disabled those tests



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] s1monw commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


s1monw commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r471960276



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -630,6 +694,64 @@ DirectoryReader getReader(boolean applyAllDeletes, boolean writeAllDeletes) thro
 return r;
   }
 
+  private StandardDirectoryReader finishGetReaderMerge(AtomicBoolean hasTimedOut,
+      Map<String, SegmentReader> mergedReaders,
+      Map<String, SegmentReader> openedReadOnlyClones,
+      SegmentInfos openingSegmentInfos,
+      boolean applyAllDeletes, boolean writeAllDeletes,
+      MergePolicy.MergeSpecification onCommitMerges,
+      long maxCommitMergeWaitMillis) throws IOException {
+    boolean replaceReaderSuccess = false;
+    try {
+      mergeScheduler.merge(mergeSource, MergeTrigger.GET_READER);
+      onCommitMerges.await(maxCommitMergeWaitMillis, TimeUnit.MILLISECONDS);
+      assert openingSegmentInfos != null;
+      synchronized (this) {
+        hasTimedOut.set(true);

Review comment:
   I renamed it to `stopCollectingMergedReaders`
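
   For readers following along, a self-contained sketch of the pattern behind the rename (all names hypothetical, not the actual IndexWriter code): merge threads publish readers only while collection is still open, and the getReader path flips the flag under the same lock so late-finishing merges are ignored.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of the stopCollectingMergedReaders flag's role.
class MergedReaderCollector {
  private final AtomicBoolean stopCollectingMergedReaders = new AtomicBoolean(false);
  private final Map<String, Object> mergedReaders = new HashMap<>();

  // called by merge threads when a merge finishes
  synchronized boolean offer(String segmentName, Object reader) {
    if (stopCollectingMergedReaders.get()) {
      return false; // too late: the caller must close the reader itself
    }
    mergedReaders.put(segmentName, reader);
    return true;
  }

  // called once by the getReader path when it stops waiting (timeout or completion)
  synchronized Map<String, Object> stopAndDrain() {
    stopCollectingMergedReaders.set(true);
    return mergedReaders;
  }
}
{code}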





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] s1monw commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


s1monw commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r471958847



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -545,18 +546,54 @@ DirectoryReader getReader(boolean applyAllDeletes, boolean writeAllDeletes) thro
 // obtained during this flush are pooled, the first time
 // this method is called:
 readerPool.enableReaderPooling();
-DirectoryReader r = null;
+StandardDirectoryReader r = null;
 doBeforeFlush();
-boolean anyChanges = false;
+boolean anyChanges;
 /*
  * for releasing a NRT reader we must ensure that 
  * DW doesn't add any segments or deletes until we are
  * done with creating the NRT DirectoryReader. 
  * We release the two stage full flush after we are done opening the
  * directory reader!
  */
+    MergePolicy.MergeSpecification onGetReaderMerges = null;
+    AtomicBoolean hasTimedOut = new AtomicBoolean(false);
+    Map<String, SegmentReader> mergedReaders = new HashMap<>();
+    Map<String, SegmentReader> openedReadOnlyClones = new HashMap<>();
+    // this function is used to control which SR are opened in order to keep track of them
+    // and to reuse them in the case we wait for merges in this getReader call.
+    IOUtils.IOFunction<SegmentCommitInfo, SegmentReader> readerFactory = sci -> {
+      final ReadersAndUpdates rld = getPooledInstance(sci, true);
+      try {
+        assert Thread.holdsLock(IndexWriter.this);
+        SegmentReader segmentReader = rld.getReadOnlyClone(IOContext.READ);
+        openedReadOnlyClones.put(sci.info.name, segmentReader);
+        return segmentReader;
+      } finally {
+        release(rld);
+      }
+    };
+    SegmentInfos openingSegmentInfos = null;
+    final long maxFullFlushMergeWaitMillis = config.getMaxFullFlushMergeWaitMillis();
 boolean success2 = false;
 try {
+      /* this is the essential part of the getReader method. We need to take care of the following things:

Review comment:
   yeah I think it makes sense to have these details in these complex 
methods





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] s1monw commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


s1monw commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r471957713



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -3335,49 +3416,60 @@ public void mergeFinished(boolean committed, boolean segmentDropped) throws IOEx
 // includedInCommit will be set (above, by our caller) to false if the allowed max wall clock
 // time (IWC.getMaxCommitMergeWaitMillis()) has elapsed, which means we did not make the timeout
 // and will not commit our merge to the to-be-committed SegmentInfos
-
 if (segmentDropped == false
 && committed
-&& includeInCommit.get()) {
+&& includeMergeResult.get()) {
+
+  // make sure onMergeComplete really was called:
+  assert origInfo != null;
 
   if (infoStream.isEnabled("IW")) {
 infoStream.message("IW", "now apply merge during commit: " + toWrap.segString());
   }
 
-  // make sure onMergeComplete really was called:
-  assert origInfo != null;
-
-  deleter.incRef(origInfo.files());
+      if (trigger == MergeTrigger.COMMIT) { // if we do this in a getReader call here this is obsolete
+        deleter.incRef(origInfo.files());

Review comment:
   I extended the comment





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] s1monw commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-18 Thread GitBox


s1monw commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r471955980



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -607,6 +633,57 @@ DirectoryReader getReader(boolean applyAllDeletes, boolean writeAllDeletes) thro
   }
 }
   }
+      if (onCommitMerges != null) { // only relevant if we do merge on getReader
+        boolean replaceReaderSuccess = false;

Review comment:
    





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14750) Harden TestBulkSchemaConcurrent

2020-08-18 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179394#comment-17179394
 ] 

Noble Paul commented on SOLR-14750:
---

{quote}When did I ever claim it was the correct fix? I explicitly said it wasn't
{quote}
My apologies.

 

TBH, I was in the middle of something else and I didn't read the whole thread. 
I just applied the patch as I thought it was the fix.

Anyway, I'm debugging it now. There is something really spooky happening with 
core reload.

> Harden TestBulkSchemaConcurrent
> ---
>
> Key: SOLR-14750
> URL: https://issues.apache.org/jira/browse/SOLR-14750
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Erick Erickson
>Assignee: Noble Paul
>Priority: Major
> Attachments: SOLR-14750.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This test has been failing quite often lately. I poked around a bit and see 
> what I _think_ is evidence of a race condition in CoreContainer.reload where 
> a reload on the same core is happening from two places in close succession. 
> I'll attach a preliminary patch soon.
> Without this patch I had 25 failures out of 1,000 runs, with it 0.
> I consider this patch a WIP, putting up for comment. Well, it has nocommits 
> so... But In particular, I have to review some changes I made about which 
> name we're using for PendingCoreOps. I also want to back out my changes and 
> beast it again with some more logging to see if I can nail down that multiple 
> reloads are happening before declaring victory.
> What this does is put the name of the core we're reloading in pendingCoreOps 
> earlier in the reload process. Then the second call to reload will wait until 
> the first is completed. I also restructured it a bit because I don't like if 
> clauses that go on forever and a small else clause way down the code. I 
> inverted the test and bailed out of the method rather than fall off the end 
> after the else clause.
> One thing I don't like about this is two reloads in such rapid succession 
> seems wasteful. Even so, I can imagine that one reload gets through far 
> enough to load the schema, then a schema update changes the schema _then_ 
> calls reload. So I don't think just returning if there's a reload happening 
> on that core already is valid.
> More to come.
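
A minimal sketch of the guard the description outlines, with entirely hypothetical names (the real bookkeeping lives in CoreContainer's pendingCoreOps): registering the core name up front makes a second reload of the same core block until the first one finishes.

{code:java}
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the pendingCoreOps bookkeeping described above.
class PendingCoreOps {
  private final Set<String> pending = new HashSet<>();

  // blocks while another reload of the same core is in flight, then registers ours
  void waitAdd(String coreName) throws InterruptedException {
    synchronized (pending) {
      while (pending.contains(coreName)) {
        pending.wait();
      }
      pending.add(coreName);
    }
  }

  // unregisters and wakes up any reload waiting on the same core
  void remove(String coreName) {
    synchronized (pending) {
      pending.remove(coreName);
      pending.notifyAll();
    }
  }
}
{code}

In this sketch a reload would call {{waitAdd(name)}} at the top and {{remove(name)}} in a finally block, so overlapping reloads serialize instead of racing.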



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-14750) Harden TestBulkSchemaConcurrent

2020-08-18 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul reassigned SOLR-14750:
-

Assignee: Noble Paul  (was: Erick Erickson)

> Harden TestBulkSchemaConcurrent
> ---
>
> Key: SOLR-14750
> URL: https://issues.apache.org/jira/browse/SOLR-14750
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Erick Erickson
>Assignee: Noble Paul
>Priority: Major
> Attachments: SOLR-14750.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This test has been failing quite often lately. I poked around a bit and see 
> what I _think_ is evidence of a race condition in CoreContainer.reload where 
> a reload on the same core is happening from two places in close succession. 
> I'll attach a preliminary patch soon.
> Without this patch I had 25 failures out of 1,000 runs, with it 0.
> I consider this patch a WIP, putting up for comment. Well, it has nocommits 
> so... But In particular, I have to review some changes I made about which 
> name we're using for PendingCoreOps. I also want to back out my changes and 
> beast it again with some more logging to see if I can nail down that multiple 
> reloads are happening before declaring victory.
> What this does is put the name of the core we're reloading in pendingCoreOps 
> earlier in the reload process. Then the second call to reload will wait until 
> the first is completed. I also restructured it a bit because I don't like if 
> clauses that go on forever and a small else clause way down the code. I 
> inverted the test and bailed out of the method rather than fall off the end 
> after the else clause.
> One thing I don't like about this is two reloads in such rapid succession 
> seems wasteful. Even so, I can imagine that one reload gets through far 
> enough to load the schema, then a schema update changes the schema _then_ 
> calls reload. So I don't think just returning if there's a reload happening 
> on that core already is valid.
> More to come.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9468) "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_XXXX files

2020-08-18 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179377#comment-17179377
 ] 

Dawid Weiss commented on LUCENE-9468:
-

These are temporary files internal to gradle's test task; they're not something 
we create. They should be cleaned up, but that's a bug in gradle - I don't think 
we should work around it on our side, as it would make the build more complex 
and could prove difficult to keep track of across gradle upgrades.

> "./gradlew -p lucene test" leaks lucene/core/build/tmp/test/jar_extract_ 
> files
> --
>
> Key: LUCENE-9468
> URL: https://issues.apache.org/jira/browse/LUCENE-9468
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> I have been stupid-beasting ({{while(true) "./gradlew -p lucene test"}}) 
> Lucene core + modules tests, and noticed that I have accumulated many 
> (~10.3K) files like these:
> {noformat}
> -rw-r--r-- 1 mike mike  4954 Aug 17 12:49 
> lucene/core/build/tmp/test/jar_extract_9943749690454255636_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 13:24 
> lucene/core/build/tmp/test/jar_extract_9949237584070142535_tmp
> -rw-r--r-- 1 mike mike 14627 Aug 17 14:01 
> lucene/core/build/tmp/test/jar_extract_9950285037002822552_tmp
> -rw-r--r-- 1 mike mike 15935 Aug 17 15:10 
> lucene/core/build/tmp/test/jar_extract_995066590821695944_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 11:23 
> lucene/core/build/tmp/test/jar_extract_9952865172929838404_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 14:41 
> lucene/core/build/tmp/test/jar_extract_9960016969100835830_tmp
> -rw-r--r-- 1 mike mike  5488 Aug 17 10:07 
> lucene/core/build/tmp/test/jar_extract_9960479662672452908_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 13:07 
> lucene/core/build/tmp/test/jar_extract_996420631017213954_tmp
> -rw-r--r-- 1 mike mike 84929 Aug 17 10:30 
> lucene/core/build/tmp/test/jar_extract_9964495910786482810_tmp
> -rw-r--r-- 1 mike mike  8034 Aug 17 17:10 
> lucene/core/build/tmp/test/jar_extract_9965528236930220207_tmp
> -rw-r--r-- 1 mike mike 43565 Aug 17 11:52 
> lucene/core/build/tmp/test/jar_extract_9967892842722777228_tmp
> -rw-r--r-- 1 mike mike 36278 Aug 17 10:36 
> lucene/core/build/tmp/test/jar_extract_996836107828729763_tmp
> -rw-r--r-- 1 mike mike 29997 Aug 17 15:48 
> lucene/core/build/tmp/test/jar_extract_9968527122717193835_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 11:11 
> lucene/core/build/tmp/test/jar_extract_9968609693107939092_tmp
> -rw-r--r-- 1 mike mike  9920 Aug 17 14:31 
> lucene/core/build/tmp/test/jar_extract_9968809316564216653_tmp
> -rw-r--r-- 1 mike mike 19078 Aug 17 13:22 
> lucene/core/build/tmp/test/jar_extract_9969318805542859308_tmp
> -rw-r--r-- 1 mike mike 28711 Aug 17 11:44 
> lucene/core/build/tmp/test/jar_extract_9974798403956637924_tmp {noformat}
> It seems to grow by ~64 files (the number of cores I configured gradle to use 
> in my {{./gradle.properties}}) on each iteration of beasting.
> I think {{gradle}} should maybe be cleaning this up?
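
In the meantime, a hedged cleanup one-liner (plain {{find}}; the path and file pattern are taken from the listing above):

{noformat}
find lucene/core/build/tmp/test -name 'jar_extract_*_tmp' -delete
{noformat}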



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org