[ https://issues.apache.org/jira/browse/SOLR-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011141#comment-16011141 ]
Vivek Narang edited comment on SOLR-10317 at 5/15/17 7:12 PM:
--------------------------------------------------------------
Hi [~ichattopadhyaya], I also wanted to share that I have been trying to figure
out a way to get the Solr commit history for indexing purposes. After some
searching, I came up with a script
[https://gist.github.com/viveknarang/141ab289789b0fe55b09409f99d84c75] and used
it to create a JSON file [http://162.243.101.83/solrcommit.log] containing around
25K documents. Please let me know what you think about the data structure.
Thanks!
--- Sample Record ---
{
  "commit": "9c6279d439a231d9ec8c9564b0ab76f616d10076",
  "author": "Joel Bernstein",
  "date": "Sun May 14 15:54:32 2017 -0400",
  "message": "SOLR-10663-Add-distance-Stream-Evaluator",
  "author email": "[email protected]",
  "timestamp": "1494791672",
  "committer name": "Joel Bernstein",
  "committer email": "[email protected]",
  "commit date": "Mon May 15 11:26:05 2017 -0400"
}
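For reference, a record of this shape can be produced from `git log` with a
custom pretty format. The following is only a minimal sketch and not the linked
gist: the mapping from field names to git placeholders (for example `%f` for
"message" and `%at` for "timestamp") is an assumption inferred from the sample
record, and the helper names `parse_git_log`, `read_git_log` and
`to_json_lines` are hypothetical.

```python
import json
import subprocess

# Field names follow the sample record; the git placeholders are an ASSUMED
# mapping inferred from the sample values, not taken from the linked script.
FIELDS = [
    ("commit", "%H"),           # full commit hash
    ("author", "%an"),          # author name
    ("date", "%ad"),            # author date, git's default format
    ("message", "%f"),          # sanitized subject line (dashes for spaces)
    ("author email", "%ae"),
    ("timestamp", "%at"),       # author date as a UNIX timestamp
    ("committer name", "%cn"),
    ("committer email", "%ce"),
    ("commit date", "%cd"),     # committer date, git's default format
]

FIELD_SEP = "\x1f"   # ASCII unit separator, unlikely to occur in commit metadata
RECORD_SEP = "\x1e"  # ASCII record separator

# git expands %x1f / %x1e to the corresponding raw bytes.
PRETTY = "%x1f".join(placeholder for _, placeholder in FIELDS) + "%x1e"


def parse_git_log(raw):
    """Turn raw `git log --pretty=format:PRETTY` output into a list of dicts."""
    records = []
    for chunk in raw.split(RECORD_SEP):
        chunk = chunk.strip("\n")  # git puts a newline between commits
        if not chunk:
            continue
        values = chunk.split(FIELD_SEP)
        if len(values) != len(FIELDS):
            continue  # skip anything that does not have exactly one value per field
        records.append({name: value for (name, _), value in zip(FIELDS, values)})
    return records


def read_git_log(repo_path):
    """Run `git log` in repo_path and return one dict per commit."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", f"--pretty=format:{PRETTY}"],
        check=True,
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # tolerate commit metadata that is not valid UTF-8
    )
    return parse_git_log(result.stdout)


def to_json_lines(records):
    """Serialize records as one JSON document per line."""
    return "\n".join(json.dumps(record) for record in records)
```

Calling `to_json_lines(read_git_log(path))` on a local clone would give one
JSON document per commit. All values stay strings, as in the sample record;
whether "timestamp" should instead be indexed as a number or date is a separate
schema question.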
> Solr Nightly Benchmarks
> -----------------------
>
> Key: SOLR-10317
> URL: https://issues.apache.org/jira/browse/SOLR-10317
> Project: Solr
> Issue Type: Task
> Reporter: Ishan Chattopadhyaya
> Labels: gsoc2017, mentor
> Attachments: Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks.docx,
> Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks-FINAL-PROPOSAL.pdf
>
>
> Solr needs nightly benchmark reporting. Similar Lucene benchmarks can be
> found at https://home.apache.org/~mikemccand/lucenebench/.
> Preferably, we need:
> # A suite of benchmarks that build Solr from a commit point, start Solr
> nodes, both in SolrCloud and standalone mode, and record timing information
> of various operations like indexing, querying, faceting, grouping,
> replication etc.
> # It should be possible to run them either as an independent suite or as a
> Jenkins job, and we should be able to report timings as graphs (Jenkins has
> some charting plugins).
> # The code should eventually be integrated in the Solr codebase, so that it
> never goes out of date.
> There is some prior work / discussion:
> # https://github.com/shalinmangar/solr-perf-tools (Shalin)
> # https://github.com/chatman/solr-upgrade-tests/blob/master/BENCHMARKS.md
> (Ishan/Vivek)
> # SOLR-2646 & SOLR-9863 (Mark Miller)
> # https://home.apache.org/~mikemccand/lucenebench/ (Mike McCandless)
> # https://github.com/lucidworks/solr-scale-tk (Tim Potter)
> Some of the frameworks above support building, starting, indexing/querying
> and stopping Solr. However, the benchmarks they run are very limited. Any of
> these can be a starting point, or a new framework can be used instead. The
> goal is to cover every Solr feature with a corresponding benchmark that runs
> every night.
> Proposing this as a GSoC 2017 project. I'm willing to mentor, and I'm sure
> [~shalinmangar] and [[email protected]] would help here.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)