Re: Performance testing is necessary now

2020-08-12 Thread David Smiley
On Wed, Aug 12, 2020 at 8:56 AM Jan Høydahl wrote: > I was glad to read your last mail with a softer tone, Ishan. Respect! I really appreciate all your hard work and passion for Lucene/Solr, and thanks for putting time into benchmarking, it really shows that you care about the project!

Re: Performance testing is necessary now

2020-08-12 Thread Ishan Chattopadhyaya
> So let’s not suggest I was uncooperative just for the heck of it, or because I didn’t care, ok? Sure, Andrzej, I do think that you care. I just assume you missed my ping here:

Re: Performance testing is necessary now

2020-08-12 Thread Ishan Chattopadhyaya
Sure Andrzej, I *sincerely* apologize. I felt Houston and Gus could've received some more help from you while they were working on the release. But, I take my words back. I know that you care deeply about performance (https://issues.apache.org/jira/browse/SOLR-14691), and please accept all my

Re: Performance testing is necessary now

2020-08-12 Thread Jan Høydahl
I was glad to read your last mail with a softer tone, Ishan. Respect! I really appreciate all your hard work and passion for Lucene/Solr, and thanks for putting time into benchmarking, it really shows that you care about the project! What I think we need more than anything else going forward

Re: Performance testing is necessary now

2020-08-12 Thread Andrzej Białecki
> On 12 Aug 2020, at 07:06, Ishan Chattopadhyaya wrote: > > Whatever we do or not do is imperfect. I hope some "mandate" doesn't stop progress. We don't go changing code just for the heck of it; we do it for a variety of matters. We sometimes do:

Re: Performance testing is necessary now

2020-08-12 Thread Ishan Chattopadhyaya
Hi All, I went through all the concerns voiced above and took a step back and re-assessed my position. I am withdrawing my initial proposal to request/nag/demand/veto issues without a performance test. I shall not insist on that and apologize for using such language as I did above. I hope, though,

Re: Performance testing is necessary now

2020-08-11 Thread Ishan Chattopadhyaya
> Maybe if we have a common benchmarking suite, such efforts will be less effort and can actually be contributed back so that we can potentially monitor the matter. I am +1 to contributing this to an Apache repository, the moment this is stable. The moment periodic numbers start getting

Re: Performance testing is necessary now

2020-08-11 Thread Ishan Chattopadhyaya
> I don't think that the problem is nobody cares, more likely the problem is it's hard and there's always a tug of war between getting things done and out there where people can benefit from the feature/fix etc vs the risk that they stall out waiting for one more thing to do. I have tried

Re: Performance testing is necessary now

2020-08-11 Thread David Smiley
I haven't tried "solr-bench" https://github.com/thesearchstack/solr-bench closely but I sure hope we can rally around something that's pretty good; maybe this is it. I really need to give this one a shot. I've noticed on occasion some of us will throw together dedicated utilities to do

Re: Performance testing is necessary now

2020-08-11 Thread Ishan Chattopadhyaya
> I was going to use the data set that Mike uses for the lucene nightly benchmarks I've gone with the same in the suite to begin with: https://github.com/TheSearchStack/solr-bench/blob/master/small-data/small-enwiki.tsv.gz The larger file can be downloaded and used as well. The suite is also

Re: Performance testing is necessary now

2020-08-11 Thread Ishan Chattopadhyaya
Here's the local mode example: https://github.com/TheSearchStack/solr-bench/blob/master/config-local.json (Here, please ignore the JDK URL, it is downloaded but the system JDK is used) A pre-built Solr can be used as per

Re: Performance testing is necessary now

2020-08-11 Thread Mike Drob
Can you give examples of this? I don’t see them in the repo. On Tue, Aug 11, 2020 at 4:30 PM Ishan Chattopadhyaya < ichattopadhy...@gmail.com> wrote: > Local mode uses the installed JDK. GCP mode can pick up a JDK URL as configured. It is just a configuration, one among many, that can be

Re: Performance testing is necessary now

2020-08-11 Thread Đạt Cao Mạnh
> > Another note here is that problems come and go in unpredictable ways. Before SOLR-14665 I never thought about doing a performance test of creating thousands of collections. The problem here is the same as with our tests: even though Solr has a huge number of tests, bugs still happen here and there,

Re: Performance testing is necessary now

2020-08-11 Thread Gus Heck
Not going to agree with your analogy. The difference is that everyone knows murder is wrong; degradations in performance are inadvertent and happen while folks are attempting to capture other benefits for the good of the project. I don't see as how you should take the blame (any more than the

Re: Performance testing is necessary now

2020-08-11 Thread Ishan Chattopadhyaya
I mostly disagree, Gus. Barring isolated efforts, no one ever stepped up to wrap up automated performance benchmarks (I take the blame on this). And after every major regression, we hear the same excuse: there are no automated performance benchmarks. Is that a valid excuse? It is like saying,

Re: Performance testing is necessary now

2020-08-11 Thread Ishan Chattopadhyaya
> but I also have some concerns with coming out with such a strong mandate If you have any alternate suggestions to prevent situations like SOLR-14665, please let us know. I'm open to any suggestion that can enable us collectively to prevent such regressions. Noble and I built that perf tool

Re: Performance testing is necessary now

2020-08-11 Thread Gus Heck
I think we need to get the system for measuring performance in place before we can issue a mandate. The analogy is "test the application functionality carefully before checking in" vs "run these unit tests before checking in." Even if everyone does their own microbenchmarks they likely won't be

Re: Performance testing is necessary now

2020-08-11 Thread Ishan Chattopadhyaya
Local mode uses the installed JDK. GCP mode can pick up a JDK URL as configured. It is just a configuration, one among many, that can be changed as per the needs of the benchmark. The benchmarks can be used with almost any branch (just specify the commit SHA in the repository section, or alternatively

Re: Performance testing is necessary now

2020-08-11 Thread Mike Drob
Hi Ishan, Thanks for starting this conversation! I think it's important to pay attention to performance, but I also have some concerns with coming out with such a strong mandate. In the repository, I'm looking at how to run in local mode, and see that it looks like it will try to download a JDK

Performance testing is necessary now

2020-08-11 Thread Ishan Chattopadhyaya
Hi Everyone! From now on, I intend to request/nag/demand/veto that code changes which affect default code paths for most users be accompanied by performance testing numbers (e.g. [1]). Opt-in features are fine; I won't personally bother about them (but if you'd like to perf test them, it