Hi All,
I went through all the concerns voiced above and took a step back and
re-assessed my position. I am withdrawing my initial proposal to
request/nag/demand/veto issues without a performance test.
I shall not insist on that and apologize for using such language as I did
above. I hope, though, that we all do our best to preserve (and only
improve) the performance characteristics of Solr for the sake of our users.
Thanks to everyone for your input.
Regards,
Ishan

On Wed, Aug 12, 2020 at 10:36 AM Ishan Chattopadhyaya <
[email protected]> wrote:

> > Maybe if we have a common benchmarking suite, such efforts will be less
> effort and can actually be contributed back so that we can potentially
> monitor the matter.
>
> I am +1 to contributing this to an Apache repository, the moment this is
> stable. The moment periodic numbers start getting published, the risk of
> the suite being abandoned is reduced. Two more things to do before this
> happens: 1. identifying datasets and queries (I'm making progress), and 2.
> building a web UI that plots charts based on those numbers. Help welcome.
>
> > Whatever we do or not do is imperfect.  I hope some "mandate" doesn't
> stop progress.
> > We don't go changing code just for the heck of it; we do it for a
> variety of matters.
>
> We sometimes do: https://issues.apache.org/jira/browse/SOLR-12845. I
> don't want to stop progress, but I want to avoid situations where someone
> commits an issue (e.g. SOLR-12845), it causes a massive regression
> (SOLR-14665), and others have to come and fix the situation (
> https://issues.apache.org/jira/browse/SOLR-14706 and releases) with very
> little help or support from the original committer. Just because there was
> no mandate in place, hours and hours of effort have already been wasted on
> that issue, let alone the users who are suffering as well.
>
> Requesting performance testing for all features affecting critical code
> paths seemed like the most constructive way to tackle this situation, but
> if there is any other solution that comes to mind to address this
> situation, please suggest.
>
> >  If those
> > things are blocked, we'll be trading the opportunity cost of the change
> for the performance
> > risk.  Each issue is different -- has its own risk-reward trade-off.
> Just keep this in mind, Ishan.
>
> I totally understand.
>
> On Wed, Aug 12, 2020 at 10:18 AM Ishan Chattopadhyaya <
> [email protected]> wrote:
>
>> > I don't think that the problem is nobody cares, more likely the problem
>> is it's hard and there's always a tug of war between getting things done
>> and out there where people can benefit from the feature/fix etc vs the risk
>> that they stall out waiting for one more thing to do.
>> I have tried desperately to stay constructive in this effort and in my
>> intention, so I will not repeat what I have said in the past.
>>
>> > If the time to complete a task grows the likelihood that real life, and
>> jobs interrupt it grows, and the chance it lingers indefinitely or is
>> abandoned goes up.
>> I'm afraid that shouldn't be an excuse to skip due diligence. It is
>> better to not commit something that is not performance tested (and affects
>> default code paths for every user) than to commit it, cause a regression
>> and have other people come clean up the performance mess after you.
>>
>>
>>
>> On Wed, Aug 12, 2020 at 10:03 AM Ishan Chattopadhyaya <
>> [email protected]> wrote:
>>
>>> > I was going to use the data set that Mike uses for the lucene nightly
>>> benchmarks
>>> I've gone with the same in the suite to begin with:
>>> https://github.com/TheSearchStack/solr-bench/blob/master/small-data/small-enwiki.tsv.gz
>>> The larger file can be downloaded and used as well.
>>>
>>> The suite is also capable of using .jsonl files, and I'm building
>>> another dataset (based on Hacker News articles) for that at the moment.
>>>
>>> On Wed, Aug 12, 2020 at 10:00 AM Ishan Chattopadhyaya <
>>> [email protected]> wrote:
>>>
>>>> Here's the local mode example:
>>>> https://github.com/TheSearchStack/solr-bench/blob/master/config-local.json
>>>> (Here, please ignore the JDK URL; it is downloaded, but the system JDK
>>>> is used.)
>>>>
>>>> A pre-built Solr can be used as per
>>>> https://github.com/TheSearchStack/solr-bench/blob/master/config-prebuilt.json
>>>> (I just added this).
>>>> In this example, Solr is downloaded from the given URL and used.
>>>> Alternatively, you can build a Solr tarball, place it in the solr-bench
>>>> directory, and specify its name (not the full path) in the "solr-package"
>>>> parameter.
>>>>
>>>> When both "solr-package" and "repository" are specified, the former is
>>>> used and the latter is ignored. If only the latter is specified
>>>> ("repository"), Solr is compiled/built using the specified commit point.
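>>>> For illustration, here is a hypothetical config fragment showing both
>>>> options together (the "solr-package" and "repository" keys come from the
>>>> linked examples, but the inner field names and values below are
>>>> placeholders; check the example configs in the repo for the real schema):
>>>>
>>>> ```json
>>>> {
>>>>   "solr-package": "solr-8.6.0.tgz",
>>>>   "repository": {
>>>>     "commit-id": "0123abcd"
>>>>   }
>>>> }
>>>> ```
>>>>
>>>> With both present, the pre-built "solr-package" tarball wins and the
>>>> "repository" section is ignored; drop "solr-package" to have Solr built
>>>> from the given commit instead.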
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Aug 12, 2020 at 6:17 AM Mike Drob <[email protected]> wrote:
>>>>
>>>>> Can you give examples of this? I don’t see them in the repo.
>>>>>
>>>>> On Tue, Aug 11, 2020 at 4:30 PM Ishan Chattopadhyaya <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Local mode uses the installed JDK. GCP mode can pick up a JDK URL as
>>>>>> configured. It is just one configuration among many that can be changed
>>>>>> to suit the needs of the benchmark. The benchmarks can be used with
>>>>>> almost any
>>>>>> branch (just specify the commit sha in the repository section, or
>>>>>> alternatively build Solr tgz separately and refer to it in the 
>>>>>> solr-package
>>>>>> parameter).
>>>>>>
>>>>>>
>>>>>> On Wed, 12 Aug, 2020, 2:39 am Mike Drob, <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Ishan,
>>>>>>>
>>>>>>> Thanks for starting this conversation! I think it's important to pay
>>>>>>> attention to performance, but I also have some concerns with coming out
>>>>>>> with such a strong mandate. In the repository, I'm looking at how to
>>>>>>> run in local mode, and it looks like it will try to download a JDK
>>>>>>> from some university website? That seems overly restrictive to me; why
>>>>>>> can't we use the already installed JDK?
>>>>>>>
>>>>>>> Is the benchmark suite designed for master? Or for branch_8x?
>>>>>>>
>>>>>>> Mike
>>>>>>>
>>>>>>> On Tue, Aug 11, 2020 at 9:04 AM Ishan Chattopadhyaya <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Everyone!
>>>>>>>>    From now on, I intend to request/nag/demand/veto that code changes
>>>>>>>> which affect default code paths for most users be accompanied by
>>>>>>>> performance testing numbers (e.g. [1]). Opt-in features are fine; I
>>>>>>>> won't personally bother about them (but if you'd like to perf test
>>>>>>>> them, it would set a great precedent anyway).
>>>>>>>>
>>>>>>>> I will also work on setting up automated performance and stress
>>>>>>>> testing [2], but in the absence of that, let us do performance tests
>>>>>>>> manually and report them in the JIRA. Unless we hold ourselves to a
>>>>>>>> high standard, performance will be a joke whereby performance
>>>>>>>> regressions can creep in without the committer(s) taking any
>>>>>>>> responsibility toward the users affected by them (SOLR-14665).
>>>>>>>>
>>>>>>>> A benchmarking suite that I am working on is at
>>>>>>>> https://github.com/thesearchstack/solr-bench (SOLR-10317). A
>>>>>>>> stress test suite is under development (SOLR-13933). If you wish to use
>>>>>>>> either of these, I shall offer help and support (please ping me
>>>>>>>> directly on Slack or in #solr-dev, or open a GitHub issue on that
>>>>>>>> repo).
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Ishan
>>>>>>>>
>>>>>>>> [1] -
>>>>>>>> https://issues.apache.org/jira/browse/SOLR-14354?focusedCommentId=17174221&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17174221
>>>>>>>> [2] -
>>>>>>>> https://issues.apache.org/jira/browse/SOLR-14354?focusedCommentId=17174234&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17174234
>>>>>>>>
>>>>>>>>
>>>>>>>>
