Hi, All.
We have discussed the correctness/data loss policies for two weeks.
Following our usual practice, I'd like to revise the policy on our website to
state this explicitly.
- Correctness and data loss issues should be considered Blockers
+ Correctness and data loss issues should be considered Blockers for their target versions.
...whenever i get the word. :)
FWIW they will all be identical to the current group of master builds/tests.
shane
--
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
Thank you always, Shane!
Xiao
On Fri, Jan 31, 2020 at 11:19 AM shane knapp ☠ wrote:
> ...whenever i get the word. :)
>
> FWIW they will all be identical to the current group of master
> builds/tests.
>
> shane
Thank you, Shane.
BTW, we need to enable the JDK11 unit test runs for Python and R. (Currently,
they are only tested in the PR builder.)
https://issues.apache.org/jira/browse/SPARK-28900
Today, Thomas and I are both hitting a Python UT failure in the JDK11 environment
in independent PRs.
ERROR [32.750s]: test_parameter_accuracy
Oops. I found that this flaky test fails even in `Hadoop 2.7 with Hive 1.2`.
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7-hive-1.2/lastCompletedBuild/testReport/pyspark.mllib.tests.test_streaming_algorithms/StreamingLogisticRegression
I'd like to start a discussion on caching SparkPlan.
From my benchmarks, if the SQL execution time is less than 1 second, then we
cannot ignore the following overheads, especially if we cache the data in
memory:
1. Parsing, analyzing, and optimizing the SQL
2. Generating Physical Plan (SparkPlan)
3. Generating code (whole-stage code generation)
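
To make this concrete, here is a minimal Scala sketch of how one could measure
those phases separately. The table `t`, the example query, and the `time` helper
are only illustrative, not taken from my benchmark. Since Dataset analysis is
eager while QueryExecution's optimizedPlan and executedPlan are lazy vals,
forcing them one by one gives a rough per-phase planning cost:

    import org.apache.spark.sql.SparkSession

    object PlanOverheadSketch {
      // Tiny timing helper: runs `body` once and prints the elapsed wall time.
      def time[A](label: String)(body: => A): A = {
        val start = System.nanoTime()
        val result = body
        println(f"$label%-16s ${(System.nanoTime() - start) / 1e6}%8.1f ms")
        result
      }

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("plan-overhead-sketch")
          .getOrCreate()

        // Illustrative table, cached in memory so execution itself is cheap.
        spark.range(1000000L).toDF("id").createOrReplaceTempView("t")
        spark.table("t").cache().count()

        // spark.sql parses the statement and eagerly analyzes the logical plan.
        val df = time("parse+analyze") {
          spark.sql("SELECT id % 10 AS k, count(*) FROM t GROUP BY id % 10")
        }
        val qe = df.queryExecution
        time("optimize")(qe.optimizedPlan)        // forces the optimizer
        time("physical plan")(qe.executedPlan)    // forces SparkPlan generation
        time("execute")(df.collect())             // actual run (incl. codegen)

        spark.stop()
      }
    }

On a small cached table like this, the planning steps above can easily be a
large fraction of the total time, which is exactly the overhead that caching
the SparkPlan would avoid.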