I went through some manual tests of the new features of Structured
Streaming in Spark 3.0.0. (Please let me know if there are more features
we'd like to test manually.)

* file source cleanup - both "archive" and "delete" work, and the query
fails as expected when the input directory is the output directory of a
file sink (see the first sketch after this list)
* kafka source/sink - "headers" works for both source and sink, "group id
prefix" and "static group id" work, and I confirmed start offset by
timestamp works for the streaming case (second sketch below)
* event logs with streaming queries - enabled it, confirmed compaction
works, the SHS can read compacted event logs, and downloading an event
log in the SHS works by zipping the event log directory. The original
functionality with a single event log file works as well (third sketch
below).
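
For anyone who wants to reproduce these, here are rough sketches of what
I ran; the paths, broker, topic, and timestamp below are placeholders,
not my exact values. First, the file source cleanup:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().appName("cleanup-test").getOrCreate()

  // cleanSource = "archive" moves completed input files under
  // sourceArchiveDir; cleanSource = "delete" removes them instead.
  val input = spark.readStream
    .format("text")
    .option("cleanSource", "archive")
    .option("sourceArchiveDir", "/tmp/archive")  // placeholder path
    .load("/tmp/input")                          // placeholder path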
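
Second, the Kafka source options; the commented-out line is the static
group id variant, and on the sink side the headers travel in a "headers"
column of the output DataFrame:

  val df = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "host:9092")  // placeholder broker
    .option("subscribe", "topic1")                   // placeholder topic
    .option("includeHeaders", "true")                // header support
    .option("groupIdPrefix", "my-prefix")            // group id prefix
    // .option("kafka.group.id", "my-static-group")  // static group id
    .option("startingOffsetsByTimestamp",
      """{"topic1": {"0": 1585699200000}}""")        // placeholder timestamp
    .load()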
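
Third, the event log settings, with a small max file size just to force
rolling quickly:

  val spark = SparkSession.builder()
    .appName("rolling-event-log-test")
    .config("spark.eventLog.enabled", "true")
    .config("spark.eventLog.rolling.enabled", "true")
    .config("spark.eventLog.rolling.maxFileSize", "10m")  // force rolling
    .getOrCreate()

  // On the SHS side, spark.history.fs.eventLog.rolling.maxFilesToRetain
  // bounds the number of non-compacted files, which triggers compaction.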

Looks good, though there are still plenty of commits being pushed to
branch-3.0 after RC1, which makes me think it may not be safe to carry
the RC1 test results over to RC2.

On Sat, Apr 4, 2020 at 12:49 AM Sean Owen <sro...@apache.org> wrote:

> Aside from the other issues mentioned here, which probably do require
> another RC, this looks pretty good to me.
>
> I built on Ubuntu 19 and ran with Java 11, -Pspark-ganglia-lgpl
> -Pkinesis-asl -Phadoop-3.2 -Phive-2.3 -Pyarn -Pmesos -Pkubernetes
> -Phive-thriftserver -Djava.version=11
>
> I did see the following test failures, but as usual, I'm not sure
> whether they're specific to me. Does anyone else see these, particularly
> the R warnings?
>
>
> PythonUDFSuite:
> org.apache.spark.sql.execution.python.PythonUDFSuite *** ABORTED ***
>   java.lang.RuntimeException: Unable to load a Suite class that was discovered in the runpath: org.apache.spark.sql.execution.python.PythonUDFSuite
>   at org.scalatest.tools.DiscoverySuite$.getSuiteInstance(DiscoverySuite.scala:81)
>   at org.scalatest.tools.DiscoverySuite.$anonfun$nestedSuites$1(DiscoverySuite.scala:38)
>   at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
>   at scala.collection.Iterator.foreach(Iterator.scala:941)
>   at scala.collection.Iterator.foreach$(Iterator.scala:941)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:238)
>
>
> - SPARK-25158: Executor accidentally exit because ScriptTransformationWriterThread throw Exception *** FAILED ***
>   Expected exception org.apache.spark.SparkException to be thrown, but no exception was thrown (SQLQuerySuite.scala:2384)
>
>
> * checking for missing documentation entries ... WARNING
> Undocumented code objects:
>   ‘%<=>%’ ‘add_months’ ‘agg’ ‘approxCountDistinct’ ‘approxQuantile’
>   ‘approx_count_distinct’ ‘arrange’ ‘array_contains’ ‘array_distinct’
> ...
>  WARNING
> ‘qpdf’ is needed for checks on size reduction of PDFs
>
> On Tue, Mar 31, 2020 at 10:04 PM Reynold Xin <r...@databricks.com> wrote:
> >
> > Please vote on releasing the following candidate as Apache Spark version
> > 3.0.0.
> >
> > The vote is open until 11:59pm Pacific time Fri Apr 3, and passes if a
> > majority of +1 PMC votes are cast, with a minimum of 3 +1 votes.
> >
> > [ ] +1 Release this package as Apache Spark 3.0.0
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache Spark, please see http://spark.apache.org/
> >
> > The tag to be voted on is v3.0.0-rc1 (commit
> > 6550d0d5283efdbbd838f3aeaf0476c7f52a0fb1):
> > https://github.com/apache/spark/tree/v3.0.0-rc1
> >
> > The release files, including signatures, digests, etc. can be found at:
> > https://dist.apache.org/repos/dist/dev/spark/v3.0.0-rc1-bin/
> >
> > Signatures used for Spark RCs can be found in this file:
> > https://dist.apache.org/repos/dist/dev/spark/KEYS
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1341/
> >
> > The documentation corresponding to this release can be found at:
> > https://dist.apache.org/repos/dist/dev/spark/v3.0.0-rc1-docs/
> >
> > The list of bug fixes going into 3.0.0 can be found at the following URL:
> > https://issues.apache.org/jira/projects/SPARK/versions/12339177
> >
> > This release uses the release script from the tag v3.0.0-rc1.
> >
> >
> > FAQ
> >
> > =========================
> > How can I help test this release?
> > =========================
> > If you are a Spark user, you can help us test this release by taking
> > an existing Spark workload and running it on this release candidate,
> > then reporting any regressions.
> >
> > If you're working in PySpark, you can set up a virtual env, install
> > the current RC, and see if anything important breaks. In Java/Scala,
> > you can add the staging repository to your project's resolvers and test
> > with the RC (make sure to clean up the artifact cache before/after so
> > you don't end up building with an out-of-date RC going forward).
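> >
> > For example, adding the staging repository in sbt might look like this
> > (a sketch; adapt to your build tool of choice):
> >
> >   resolvers += "Spark 3.0.0 RC1 staging" at
> >     "https://repository.apache.org/content/repositories/orgapachespark-1341/"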
> >
> > ===========================================
> > What should happen to JIRA tickets still targeting 3.0.0?
> > ===========================================
> > The current list of open tickets targeted at 3.0.0 can be found at
> > https://issues.apache.org/jira/projects/SPARK by searching for
> > "Target Version/s" = 3.0.0
> >
> > Committers should look at those and triage. Extremely important bug
> > fixes, documentation, and API tweaks that impact compatibility should
> > be worked on immediately. Everything else please retarget to an
> > appropriate release.
> >
> > ==================
> > But my bug isn't fixed?
> > ==================
> > In order to make timely releases, we will typically not hold the
> > release unless the bug in question is a regression from the previous
> > release. That being said, if there is something which is a regression
> > that has not been correctly targeted please ping me or a committer to
> > help target the issue.
> >
> >
> > Note: I fully expect this RC to fail.
> >
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
