[ 
https://issues.apache.org/jira/browse/SOLR-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027030#comment-16027030
 ] 

Michael Sun commented on SOLR-10317:
------------------------------------

bq. motivation behind creating yet another benchmarking utility

That's a good question. In fact, it's one of the reasons I suggested starting a 
scoping doc, to get the conversation going early on. 
(https://issues.apache.org/jira/browse/SOLR-10317?focusedCommentId=16011107&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16011107)

Here are my two cents on a few areas that could be improved, in addition to 
increasing test coverage. [~vivek.nar...@uga.edu] can articulate these further. 

1. Currently the benchmark tells us how Solr performs, but it could also help 
tell us why Solr performs that way. A good example of effort in this direction 
is the telemetry support (https://esrally.readthedocs.io/en/latest/telemetry.html) 
in the Rally framework. 
2. Provide baseline data for capacity planning. Capacity planning requires data 
such as CPU and disk usage for specific workloads, and the benchmark can 
provide that.
3. Extensibility: the benchmark should be easy to extend with new components. 
For example, JMeter could be a good load generator for scalability studies of a 
Solr cluster with hundreds of nodes, and it should be easy to extend an 
existing test case to use JMeter in place of the current load generator. This 
may require an object model at a different abstraction level than existing 
benchmarks use.
4. Support more Solr setups and data types. For example, wiki data is good, but 
tweet data can be better for studying Solr performance in real-time analytics 
use cases.
5. Last but not least, as with any engineering tool, I hope the benchmark 
suite can standardize Solr performance efforts, promote code reuse, and 
facilitate collaboration. This requires a good understanding of all the use 
cases and careful design.
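To make point 3 concrete, here is a minimal sketch of what such an object model 
might look like: the test case depends only on an interface, so a JMeter-backed 
implementation could be dropped in later without changing the test. All names 
here (LoadGenerator, SimpleLoadGenerator, BenchmarkSketch, etc.) are 
hypothetical illustrations, not part of any existing framework.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction: the benchmark drives any load generator
// through this one interface, so the generator can be swapped out
// (e.g. for a JMeter-backed implementation) without touching test cases.
interface LoadGenerator {
    /** Run the named workload and return operation -> elapsed millis. */
    Map<String, Long> run(String workloadName);
}

// A trivial built-in generator. A hypothetical JMeterLoadGenerator would
// implement the same interface and delegate to a JMeter test plan.
class SimpleLoadGenerator implements LoadGenerator {
    @Override
    public Map<String, Long> run(String workloadName) {
        Map<String, Long> timings = new HashMap<>();
        long start = System.nanoTime();
        // ... drive indexing/query load against Solr here ...
        timings.put(workloadName, (System.nanoTime() - start) / 1_000_000);
        return timings;
    }
}

public class BenchmarkSketch {
    public static void main(String[] args) {
        // The test case only knows about the interface.
        LoadGenerator gen = new SimpleLoadGenerator();
        Map<String, Long> timings = gen.run("indexing");
        System.out.println("indexing took " + timings.get("indexing") + " ms");
    }
}
```

The design choice is simply dependency inversion: test cases own the workload 
definition, while the generator behind the interface owns how load is produced.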

Of course, this doesn't all need to be done within the GSoC project. Not to 
scare [~vivek.nar...@uga.edu] :)

Overall, this project is a good initiative and a good venue to continue this 
discussion.



 

> Solr Nightly Benchmarks
> -----------------------
>
>                 Key: SOLR-10317
>                 URL: https://issues.apache.org/jira/browse/SOLR-10317
>             Project: Solr
>          Issue Type: Task
>            Reporter: Ishan Chattopadhyaya
>              Labels: gsoc2017, mentor
>         Attachments: changes-lucene-20160907.json, 
> changes-solr-20160907.json, managed-schema, 
> Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks.docx, 
> Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks-FINAL-PROPOSAL.pdf, 
> solrconfig.xml
>
>
> Solr needs nightly benchmarks reporting. Similar Lucene benchmarks can be 
> found here, https://home.apache.org/~mikemccand/lucenebench/.
> Preferably, we need:
> # A suite of benchmarks that build Solr from a commit point, start Solr 
> nodes, both in SolrCloud and standalone mode, and record timing information 
> of various operations like indexing, querying, faceting, grouping, 
> replication etc.
> # It should be possible to run them either as an independent suite or as a 
> Jenkins job, and we should be able to report timings as graphs (Jenkins has 
> some charting plugins).
> # The code should eventually be integrated in the Solr codebase, so that it 
> never goes out of date.
> There is some prior work / discussion:
> # https://github.com/shalinmangar/solr-perf-tools (Shalin)
> # https://github.com/chatman/solr-upgrade-tests/blob/master/BENCHMARKS.md 
> (Ishan/Vivek)
> # SOLR-2646 & SOLR-9863 (Mark Miller)
> # https://home.apache.org/~mikemccand/lucenebench/ (Mike McCandless)
> # https://github.com/lucidworks/solr-scale-tk (Tim Potter)
> There is support for building, starting, indexing/querying and stopping Solr 
> in some of these frameworks above. However, the benchmarks run are very 
> limited. Any of these can be a starting point, or a new framework can as well 
> be used. The motivation is to be able to cover every functionality of Solr 
> with a corresponding benchmark that is run every night.
> Proposing this as a GSoC 2017 project. I'm willing to mentor, and I'm sure 
> [~shalinmangar] and [~markrmil...@gmail.com] would help here.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
