Re: Re: [DISCUSS] Apache Spark 3.0.1 Release

Takeshi Yamamuro Sun, 16 Aug 2020 08:41:59 -0700

I've checked the Jenkins log and it seems the commit from
https://github.com/apache/spark/pull/29404 caused the failure.
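
For anyone reading the quoted log below: the ArithmeticException says the
summed value needs 39 digits at scale 18, one more than Decimal(38, 18) can
hold. Here is a minimal standalone sketch of that kind of check. It uses only
java.math.BigDecimal, not Spark's Decimal class; DecimalFit and fit are
hypothetical names for illustration, and the logic only approximates what
Decimal.toPrecision does.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object DecimalFit {
  // Hypothetical helper (not Spark's API): rescale `value` to `scale` and
  // return it only if it fits in `precision` total digits.
  def fit(value: JBigDecimal, precision: Int, scale: Int): Option[JBigDecimal] = {
    val rescaled = value.setScale(scale, RoundingMode.HALF_UP)
    if (rescaled.precision() > precision) None else Some(rescaled)
  }

  // The value from the log: 20 ones, then a zero, then .246 padded to scale 18.
  val overflowing: JBigDecimal = new JBigDecimal("1" * 20 + "0.246" + "0" * 15)
}
```

With these definitions, DecimalFit.fit(DecimalFit.overflowing, 38, 18) is
None, because the value has 21 integer digits plus 18 fractional digits.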



On Sat, Aug 15, 2020 at 10:43 PM Koert Kuipers <[email protected]> wrote:

> i noticed a commit today that seems to prepare for 3.0.1-rc1:
> commit 05144a5c10cd37ebdbb55fde37d677def49af11f
> Author: Ruifeng Zheng <[email protected]>
> Date:   Sat Aug 15 01:37:47 2020 +0000
>
>     Preparing Spark release v3.0.1-rc1
>
> so i tried to build spark on that commit and i get a failure in sql:
>
> 09:36:57.371 ERROR org.apache.spark.scheduler.TaskSetManager: Task 0 in
> stage 77.0 failed 1 times; aborting job
> [info] - SPARK-28224: Aggregate sum big decimal overflow *** FAILED ***
> (306 milliseconds)
> [info]   org.apache.spark.SparkException: Job aborted due to stage
> failure: Task 0 in stage 77.0 failed 1 times, most recent failure: Lost
> task 0.0 in stage 77.0 (TID 197, 192.168.11.17, executor driver):
> java.lang.ArithmeticException:
> Decimal(expanded,111111111111111111110.246000000000000000,39,18}) cannot be
> represented as Decimal(38, 18).
> [info] at org.apache.spark.sql.types.Decimal.toPrecision(Decimal.scala:369)
> [info] at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doAggregate_sum_0$(Unknown
> Source)
> [info] at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doConsume_0$(Unknown
> Source)
> [info] at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doAggregateWithoutKey_0$(Unknown
> Source)
> [info] at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
> Source)
> [info] at
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> [info] at
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
> [info] at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
> [info] at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
> [info] at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1804)
> [info] at org.apache.spark.rdd.RDD.$anonfun$count$1(RDD.scala:1227)
> [info] at org.apache.spark.rdd.RDD.$anonfun$count$1$adapted(RDD.scala:1227)
> [info] at
> org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2138)
> [info] at
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
> [info] at org.apache.spark.scheduler.Task.run(Task.scala:127)
> [info] at
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
> [info] at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
> [info] at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
> [info] at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [info] at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [info] at java.lang.Thread.run(Thread.java:748)
>
> [error] Failed tests:
> [error] org.apache.spark.sql.DataFrameSuite
>
> On Thu, Aug 13, 2020 at 8:19 PM Jason Moore
> <[email protected]> wrote:
>
>> Thank you so much!  Any update on getting the RC1 up for vote?
>>
>> Jason.
>>
>>
>> ------------------------------
>> *From:* 郑瑞峰 <[email protected]>
>> *Sent:* Wednesday, 5 August 2020 12:54 PM
>> *To:* Jason Moore <[email protected]>; Spark dev list <
>> [email protected]>
>> *Subject:* Re: [DISCUSS] Apache Spark 3.0.1 Release
>>
>> Hi all,
>> I am going to prepare the release of 3.0.1 RC1, with the help of Wenchen.
>>
>>
>> ------------------ Original Message ------------------
>> *From:* "Jason Moore" <[email protected]>;
>> *Sent:* Thursday, 30 July 2020 10:35 AM
>> *To:* "dev"<[email protected]>;
>> *Subject:* Re: [DISCUSS] Apache Spark 3.0.1 Release
>>
>> Hi all,
>>
>>
>>
>> Discussion around 3.0.1 seems to have trickled away.  What was blocking
>> the release process kicking off?  I can see some unresolved bugs raised
>> against 3.0.0, but conversely there were quite a few critical correctness
>> fixes waiting to be released.
>>
>>
>>
>> Cheers,
>>
>> Jason.
>>
>>
>>
>> *From: *Takeshi Yamamuro <[email protected]>
>> *Date: *Wednesday, 15 July 2020 at 9:00 am
>> *To: *Shivaram Venkataraman <[email protected]>
>> *Cc: *"[email protected]" <[email protected]>
>> *Subject: *Re: [DISCUSS] Apache Spark 3.0.1 Release
>>
>>
>>
>> > Just wanted to check if there are any blockers that we are still
>> waiting for to start the new release process.
>>
>> I don't see any on-going blocker in my area.
>>
>> Thanks for the notification.
>>
>>
>>
>> Bests,
>>
>> Takeshi
>>
>>
>>
>> On Wed, Jul 15, 2020 at 4:03 AM Dongjoon Hyun <[email protected]>
>> wrote:
>>
>> Hi, Yi.
>>
>>
>>
>> Could you explain why you think that is a blocker? For the given example
>> from the JIRA description,
>>
>>
>>
>> spark.udf.register("key", udf((m: Map[String, String]) => m.keys.head.toInt))
>>
>> Seq(Map("1" -> "one", "2" -> "two")).toDF("a").createOrReplaceTempView("t")
>>
>> checkAnswer(sql("SELECT key(a) AS k FROM t GROUP BY key(a)"), Row(1) :: Nil)
>>
>>
>>
>> Apache Spark 3.0.0 seems to work like the following.
>>
>>
>>
>> scala> spark.version
>>
>> res0: String = 3.0.0
>>
>>
>>
>> scala> spark.udf.register("key", udf((m: Map[String, String]) =>
>> m.keys.head.toInt))
>>
>> res1: org.apache.spark.sql.expressions.UserDefinedFunction =
>> SparkUserDefinedFunction($Lambda$1958/948653928@5d6bed7b,IntegerType,List(Some(class[value[0]:
>> map<string,string>])),None,false,true)
>>
>>
>>
>> scala> Seq(Map("1" -> "one", "2" ->
>> "two")).toDF("a").createOrReplaceTempView("t")
>>
>>
>>
>> scala> sql("SELECT key(a) AS k FROM t GROUP BY key(a)").collect
>>
>> res3: Array[org.apache.spark.sql.Row] = Array([1])
>>
>>
>>
>> Could you provide a reproducible example?
>>
>>
>>
>> Bests,
>>
>> Dongjoon.
>>
>>
>>
>>
>>
>> On Tue, Jul 14, 2020 at 10:04 AM Yi Wu <[email protected]> wrote:
>>
>> This is probably a blocker:
>> https://issues.apache.org/jira/browse/SPARK-32307
>>
>>
>>
>> On Tue, Jul 14, 2020 at 11:13 PM Sean Owen <[email protected]> wrote:
>>
>> https://issues.apache.org/jira/browse/SPARK-32234 ?
>>
>> On Tue, Jul 14, 2020 at 9:57 AM Shivaram Venkataraman
>> <[email protected]> wrote:
>> >
>> > Hi all
>> >
>> > Just wanted to check if there are any blockers that we are still
>> waiting for to start the new release process.
>> >
>> > Thanks
>> > Shivaram
>> >
>>
>>
>>
>>
>> --
>>
>> ---
>> Takeshi Yamamuro
>>
>

-- 
---
Takeshi Yamamuro
