As Deron mentioned, running all experiments up to 80 GB is a good
compromise. Over the weekend, I ran exactly that on Spark 1.6.1, and it
took less than a day. This approach would allow us to run MR and
different Spark versions instead.
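
Just to make that compromise concrete, a capped run could look roughly
like the sketch below. This is purely illustrative on my side: the
intermediate size labels and the run_experiment helper are assumptions,
not the actual layout of our perf suite.

SIZES_GB = {"XS": 0.08, "S": 0.8, "M": 8.0, "L": 80.0, "XL": 800.0}
ALGORITHMS = ["l2svm", "GLM", "linregcg", "linregds",
              "multilogreg", "msvm", "naive-bayes", "kmeans"]
CAP_GB = 80.0  # the proposed compromise: skip the 800 GB XL tier

def run_experiment(backend: str, algorithm: str, size_label: str) -> None:
    # Placeholder: a real driver would submit the corresponding script
    # to the given backend (e.g. MR or a specific Spark version).
    print(f"[{backend}] {algorithm} @ {size_label}")

for backend in ("MR", "Spark-1.6.1"):  # further Spark versions go here
    for algorithm in ALGORITHMS:
        for label, gb in SIZES_GB.items():
            if gb <= CAP_GB:
                run_experiment(backend, algorithm, label)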

Regarding the original mail, I think we can deduplicate the list a bit:
running the performance tests over all algorithms obviously covers the
per-algorithm checks as well. Other than that, +1 for creating these
documents for the release process.

Regards,
Matthias



From:   Luciano Resende <[email protected]>
To:     [email protected]
Date:   05/23/2016 12:15 PM
Subject:        Re: Formalize a release candidate review process?



On Mon, May 23, 2016 at 11:34 AM, Niketan Pansare <[email protected]> wrote:

> +1 for formalizing the release candidate process. Please note: points 9
> and 10 (i.e., the performance suite) on the 6-node cluster, covering XS
> (0.08 GB) through XL (800 GB) datasets, take 12-15 days. This estimate
> only includes the following algorithms: l2svm, GLM binomial probit,
> linregcg, linregds, multilogreg, msvm, naive bayes, and kmeans. It does
> not include the time to re-execute failed cases (if any) or the sparse
> experiments. So, if we include points 9 and 10 in our release process, we
> need to be aware that it would take an additional two weeks.
>

An alternative would be to run this code in trunk when we start preparing
for the release. Another approach would be to release it and provide a
minor release if there are performance fixes that need to go on top of the
release.

--
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/
