I'm going to try to fix SOLR-15135 and then kick off RC2 later today.

If you're in the middle of the RC1 smoke test, it's still valuable to let
it finish; otherwise, please hold off on testing RC1.

Cheers,
Tim

On Tue, Feb 16, 2021 at 10:53 AM Timothy Potter <thelabd...@gmail.com>
wrote:

> Ha! I pulled in Ishan's fixes for 15138 and now AutoscalingHistoryHandlerTest
> behaves the same as in 8.7! Beasted 10 out of 10 passed, so
> no @BadApple'ing needed ;-)
>
> On Tue, Feb 16, 2021 at 10:29 AM Anshum Gupta <ans...@anshumgupta.net>
> wrote:
>
>> Yes, doing a single 8.8.2 release that has all the fixes, especially as
>> we have the fix already, is much better for the users.
>>
>> Thanks for your patience, Tim :)
>>
>> On Tue, Feb 16, 2021 at 9:05 AM Timothy Potter <thelabd...@gmail.com>
>> wrote:
>>
>>> @Ishan ~  Can you look at the question Mike raised about
>>> https://issues.apache.org/jira/browse/SOLR-15135 please?
>>>
>>> So the AutoscalingHistoryHandlerTest has a number of hard-coded wait
>>> times in it. While I can appreciate the need for waiting to see state
>>> changes occur, tests like this aren't great for CI and RC smoke tests given
>>> the variability of hardware. Case in point, I made this change:
>>>
>>> ```
>>> diff --git a/solr/core/src/test/org/apache/solr/handler/admin/AutoscalingHistoryHandlerTest.java b/solr/core/src/test/org/apache/solr/handler/admin/AutoscalingHistoryHandlerTest.java
>>> index a9eea7f7ca5..3b2d39c3317 100644
>>> --- a/solr/core/src/test/org/apache/solr/handler/admin/AutoscalingHistoryHandlerTest.java
>>> +++ b/solr/core/src/test/org/apache/solr/handler/admin/AutoscalingHistoryHandlerTest.java
>>> @@ -282,7 +282,7 @@ public class AutoscalingHistoryHandlerTest extends SolrCloudTestCase {
>>>      boolean await = actionFiredLatch.await(60, TimeUnit.SECONDS);
>>>      assertTrue("action did not execute", await);
>>>
>>> -    await = listenerFiredLatch.await(60, TimeUnit.SECONDS);
>>> +    await = listenerFiredLatch.await(120, TimeUnit.SECONDS);
>>>      assertTrue("listener did not execute", await);
>>>
>>>      waitForRecovery(COLL_NAME);
>>> ```
>>>
>>> And of course, beasting passes 5 out of 5; it fails pretty consistently
>>> on the first run w/o this change. So I vote we @BadApple this test for
>>> 8.8.1 and move forward with RC2 now that Ishan's changes are in. Moreover,
>>> since we removed auto-scaling from master, holding up a critical bug fix
>>> for a test that fails intermittently b/c of timing seems imprudent. I'm
>>> also biased in that I want to get the fix for 15145 out ASAP ;-)
>>>
>>> On Tue, Feb 16, 2021 at 9:08 AM Ishan Chattopadhyaya <
>>> ichattopadhy...@gmail.com> wrote:
>>>
>>>> Sounds good, Tim. I've ported the fix to the release branch. Just ran
>>>> the tests to make sure it works fine.
>>>> Thanks for the extra work you'll have to do (RC2) in order to save me
>>>> future work (8.8.2). Really owe you one!
>>>>
>>>> > Are there other fixes you're aware of that are slated for 8.8.2 @Ishan
>>>> Chattopadhyaya <ichattopadhy...@gmail.com>?
>>>> I am not aware of anything else.
>>>>
>>>> On Tue, Feb 16, 2021 at 9:19 PM Timothy Potter <thelabd...@gmail.com>
>>>> wrote:
>>>>
>>>>> I'm beasting AutoscalingHistoryHandlerTest locally now; I haven't seen
>>>>> that one fail on my side yet.
>>>>>
>>>>> As far as respinning the 8.8.1 RC goes, it's not a problem for me, and I prefer that
>>>>> to doing an 8.8.2 soon after 8.8.1 comes out. Are there other fixes you're
>>>>> aware of that are slated for 8.8.2 @Ishan Chattopadhyaya
>>>>> <ichattopadhy...@gmail.com>? In other words, if the fix for 15138 is
>>>>> all that will be in 8.8.2, let's just include it in 8.8.1 and hopefully we
>>>>> won't need an 8.8.2 ;-)
>>>>>
>>>>> Tim
>>>>>
>>>>> On Tue, Feb 16, 2021 at 7:01 AM Michael Sokolov <msoko...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hmm, I got a failure on
>>>>>>
>>>>>> org.apache.solr.handler.admin.AutoscalingHistoryHandlerTest.testHistory,
>>>>>> but it did not reproduce (tried twice). Would that possibly also be
>>>>>> addressed by those fixes?
>>>>>>
>>>>>> On Tue, Feb 16, 2021 at 7:38 AM Ishan Chattopadhyaya
>>>>>> <ichattopadhy...@gmail.com> wrote:
>>>>>> >
>>>>>> > > The failure seems to be because of a timeout during collection
>>>>>> > > creation
>>>>>> >
>>>>>> > Thanks for digging in. Seems like that is the exact class of fix
>>>>>> that we did for SOLR-15138 and are planning for 8.8.2. Shall we backport
>>>>>> that fix to the release branch now (for RC2 or 8.8.2)?
>>>>>> >
>>>>>> > > My h/w is really fast and beefy and maybe that's why it doesn't
>>>>>> get reproduced.
>>>>>> > Same here, Ryzen 9 5950X (fastest mainstream CPU out there).
>>>>>> >
>>>>>> > On Tue, Feb 16, 2021 at 5:36 PM Michael McCandless <
>>>>>> luc...@mikemccandless.com> wrote:
>>>>>> >>
>>>>>> >> Curious, the smoke tester passed for me on the first try:
>>>>>> >>
>>>>>> >> SUCCESS! [0:44:29.979512]
>>>>>> >>
>>>>>> >>
>>>>>> >> Mike McCandless
>>>>>> >>
>>>>>> >> http://blog.mikemccandless.com
>>>>>> >>
>>>>>> >>
>>>>>> >> On Sun, Feb 14, 2021 at 11:26 AM Timothy Potter <
>>>>>> thelabd...@apache.org> wrote:
>>>>>> >>>
>>>>>> >>> Please vote for release candidate 1 for Lucene/Solr 8.8.1
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> The artifacts can be downloaded from:
>>>>>> >>>
>>>>>> >>>
>>>>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.8.1-RC1-rev6a50a0315ac7e4979abb0b530857c7795bb3b928
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> You can run the smoke tester directly with this command:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>>>>> >>>
>>>>>> >>>
>>>>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.8.1-RC1-rev6a50a0315ac7e4979abb0b530857c7795bb3b928
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> The vote will be open for at least 72 hours i.e. until 2021-02-17
>>>>>> 17:00 UTC.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Here is my +1 ~ SUCCESS! [0:50:06.728441]
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> In addition to the smoke test, I built a Docker image from
>>>>>> solr-8.8.1.tgz locally and verified:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> a. A rolling upgrade of a 3-node 8.7.0 cluster to the 8.8.1 RC
>>>>>> completes successfully w/o any NPEs or weirdness with leader election /
>>>>>> recoveries.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> b. The base_url property is stored in replica state after the
>>>>>> upgrade
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> c. A basic client application built with SolrJ 8.7.0 can load
>>>>>> cluster state info directly from ZK and query the 8.8.1 RC1 servers.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> d. Same client app built with SolrJ 8.8.0 works as well.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> As this bug-fix release is primarily needed to address a SolrJ
>>>>>> back-compat break (SOLR-15145) and unfortunately our smoke tester framework
>>>>>> does not test for backcompat of older SolrJ against the RC, I ask others to
>>>>>> please test rolling upgrades of servers (ideally multi-node clusters)
>>>>>> running pre-8.8.0 to this RC if possible. Also, please try client
>>>>>> applications that are using an older SolrJ, esp. those that load cluster
>>>>>> state directly from ZK.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Best regards,
>>>>>> >>>
>>>>>> >>> Tim
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>
>> --
>> Anshum Gupta
>>
>
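
The thread's point about hard-coded latch timeouts (60 vs. 120 seconds) being sensitive to hardware could be sketched roughly as below. This is only an illustration, not what AutoscalingHistoryHandlerTest or the Solr test framework actually does: the class name `ScaledLatchWait`, the helper methods, and the system property `tests.timeout.multiplier` are all hypothetical names chosen for the example. The idea is that slow CI or smoke-test machines could raise one multiplier instead of each wait being edited by hand.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: scales a base timeout by a multiplier read from a
// system property, so slower machines can be given more time without
// changing the test source.
final class ScaledLatchWait {

    // Assumed property name, invented for this sketch.
    static final String MULTIPLIER_PROP = "tests.timeout.multiplier";

    // Returns baseSeconds scaled by the multiplier (default 1.0), never below 1.
    static long scaledSeconds(long baseSeconds) {
        double multiplier =
            Double.parseDouble(System.getProperty(MULTIPLIER_PROP, "1.0"));
        return Math.max(1L, Math.round(baseSeconds * multiplier));
    }

    // Waits on the latch for the scaled timeout; true if it reached zero in time.
    static boolean awaitScaled(CountDownLatch latch, long baseSeconds)
            throws InterruptedException {
        return latch.await(scaledSeconds(baseSeconds), TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch listenerFiredLatch = new CountDownLatch(1);
        // Stand-in for the listener firing on another thread.
        new Thread(listenerFiredLatch::countDown).start();

        boolean fired = awaitScaled(listenerFiredLatch, 60);
        System.out.println(fired ? "listener fired" : "listener did not execute");
    }
}
```

With something like this, running with `-Dtests.timeout.multiplier=2.0` would have the same effect as the 60 to 120 second change in the diff, assuming the test routed its waits through such a helper.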
