i have no problem w/storing all of the logs. :) i also have no problem w/donated S3 buckets. :)

On Mon, Dec 15, 2014 at 2:39 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>
> How about all of them <https://amplab.cs.berkeley.edu/jenkins/view/Spark/>?
> How much data per day would it roughly be if we uploaded all the logs for
> all these builds?
>
> Also, would Databricks be willing to offer up an S3 bucket for this
> purpose?
>
> Nick
>
> On Mon Dec 15 2014 at 11:48:44 AM shane knapp <skn...@berkeley.edu> wrote:
>
>> right now, the following logs are archived on to the master:
>>
>>   local log_files=$(
>>     find .\
>>       -name "unit-tests.log" -o\
>>       -path "./sql/hive/target/HiveCompatibilitySuite.failed" -o\
>>       -path "./sql/hive/target/HiveCompatibilitySuite.hiveFailed" -o\
>>       -path "./sql/hive/target/HiveCompatibilitySuite.wrong"
>>   )
>>
>> regarding dumping stuff to S3 -- thankfully, since we're not looking at a
>> lot of disk usage, i don't see a problem w/this. we could tar/zip up the
>> XML for each build and just dump it there.
>>
>> what builds are we thinking about? spark pull request builder? what
>> others?
>>
>> On Mon, Dec 15, 2014 at 1:33 AM, Nicholas Chammas <
>> nicholas.cham...@gmail.com> wrote:
>>>
>>> Every time we run a test cycle on our Jenkins cluster, we generate
>>> hundreds of XML reports covering all the tests we have (e.g.
>>> `streaming/target/test-reports/org.apache.spark.streaming.util.WriteAheadLogSuite.xml`).
>>>
>>> These reports contain interesting information about whether tests
>>> succeeded or failed, and how long they took to complete. There is also
>>> detailed information about the environment they ran in.
>>>
>>> It might be valuable to have a window into all these reports across all
>>> Jenkins builds and across all time, and use that to track basic
>>> statistics about our tests. That could give us basic insight into what
>>> tests are flaky or slow, and perhaps drive other improvements to our
>>> testing infrastructure that we can't see just yet.
>>>
>>> Do people think that would be valuable? Do we already have something
>>> like this?
>>>
>>> I'm thinking for starters it might be cool if we automatically uploaded
>>> all the XML test reports from the Master and the Pull Request builders
>>> to an S3 bucket and just opened it up for the dev community to analyze.
>>>
>>> Nick
>>>
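
For anyone who wants to experiment with the idea in the original proposal, here is a rough Python sketch of how per-test statistics might be pulled out of the archived XML once it is available. It is only an illustration under a few assumptions: that the reports follow the common JUnit-style layout (`<testcase>` elements with `classname`, `name`, and `time` attributes, plus a `<failure>` or `<error>` child on failing tests), and that they sit in directories named `test-reports`, as in the example path quoted above. The function names `summarize_reports`, `slowest`, and `flaky` are hypothetical and not part of any existing Spark or Jenkins tooling.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path


def summarize_reports(root_dir):
    """Aggregate run counts, failures, and total time per test case.

    Assumes JUnit-style XML reports stored under directories named
    'test-reports' somewhere below root_dir (one subtree per build).
    """
    stats = defaultdict(lambda: {"runs": 0, "failures": 0, "seconds": 0.0})
    for path in sorted(Path(root_dir).rglob("*.xml")):
        if path.parent.name != "test-reports":
            continue  # ignore unrelated XML files
        try:
            tree = ET.parse(str(path))
        except ET.ParseError:
            continue  # skip truncated or malformed reports
        for case in tree.iter("testcase"):
            if case.find("skipped") is not None:
                continue  # skipped tests do not count as runs
            key = (case.get("classname", ""), case.get("name", ""))
            entry = stats[key]
            entry["runs"] += 1
            entry["seconds"] += float(case.get("time") or 0)
            if case.find("failure") is not None or case.find("error") is not None:
                entry["failures"] += 1
    return dict(stats)


def slowest(stats, n=10):
    """Return the n (test, stats) pairs with the highest mean duration."""
    def mean_seconds(item):
        return item[1]["seconds"] / item[1]["runs"]
    return sorted(stats.items(), key=mean_seconds, reverse=True)[:n]


def flaky(stats):
    """Tests that failed at least once and passed at least once.

    A crude heuristic: a test broken by one bad pull request also
    matches, so the results are candidates, not a verdict.
    """
    return {k: v for k, v in stats.items() if 0 < v["failures"] < v["runs"]}
```

With the reports from several builds unpacked under one directory (say `builds/`, also a made-up name), `slowest(summarize_reports("builds"))` would list the tests with the highest average run time, and `flaky(...)` the ones with mixed results across builds.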