[
https://issues.apache.org/jira/browse/SPARK-50631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Costas Piliotis updated SPARK-50631:
------------------------------------
Description:
This may be nothing. I apologize if it is.
I'm testing Spark 4.0.0 preview 2 and hitting an issue with a local integration
test that writes to local disk.
Env:
scala 2.13.15
jdk 17.0.11 (Azul)
sbt 1.10.6
I tried jdk 21 as well; no love either.
build.sbt dependencies:
{code:scala}
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % Provided,
  "org.apache.spark" %% "spark-sql" % sparkVersion % Provided,
  "org.scalatest" %% "scalatest" % scalatestVersion % Test
)
{code}
Here's my test case:
{code:scala}
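import org.apache.spark.sql.SparkSession

// Runs inside a ScalaTest suite (the enclosing class is not shown).
test("this is a test") {
  val spark: SparkSession = SparkSession
    .builder()
    .master("local[*]")
    .config("spark.executor.memory", "1g") // Adjust as needed
    .config("spark.driver.memory", "1g")
    .getOrCreate()
  import spark.implicits._ // enables toDF on local collections
  spark.sparkContext.setLogLevel("TRACE")

  // Three sample rows; dt is a plain string column
  val x = List(
    (1, 2L, "a", "2022-01-01"),
    (1, 2L, "b", "2022-01-02"),
    (1, 2L, "c", "2022-01-03")
  ).toDF("a", "b", "c", "dt")
  x.show(10)

  // Append as CSV to local disk
  x.write
    .mode("append")
    .csv("/tmp/sandbox")
}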
{code}
It hangs. No job or stage in the UI. No activity in the console. TRACE loglevel
just gives me this:
{code:java}
{"ts":"2024-12-19T23:49:48.764Z","level":"TRACE","msg":"Checking for hosts with no recent heartbeats in HeartbeatReceiver.","logger":"HeartbeatReceiver"}
{code}
I've confirmed the path is writable, but even at TRACE log level nothing is
written and the logs give no clue why.
When I press Ctrl-C, ScalaTest prints a burst of output afterwards.
Maybe I'm daft, it's probable. With Spark 3.5 and lower I've had no issues with
this, so it's quite a bit more likely that it's me.
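One way to narrow down a hang like this is to capture a thread dump at the moment the write stalls. The following is a minimal sketch using only the Scala and Java standard libraries, not a confirmed diagnosis or fix for this issue. The names `HangProbe`, `dumpThreads`, and `withHangDump` are hypothetical helpers introduced here for illustration, and the sketch assumes it is acceptable to run the write on a thread from the global execution context.

```scala
import scala.concurrent.{Await, Future, TimeoutException}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.jdk.CollectionConverters._

// Hypothetical helper, not part of Spark or ScalaTest.
object HangProbe {

  /** Render every live JVM thread with its state and current stack trace. */
  def dumpThreads(): String =
    Thread.getAllStackTraces.asScala.toSeq
      .sortBy { case (thread, _) => thread.getName }
      .map { case (thread, frames) =>
        val header = s"[${thread.getName}] state=${thread.getState}"
        val stack  = frames.map(frame => s"    at $frame").mkString("\n")
        s"$header\n$stack"
      }
      .mkString("\n\n")

  /**
   * Run `body` on another thread. If it has not finished within `limit`,
   * print a dump of all threads to stderr and rethrow the timeout.
   * The body itself is not interrupted; it keeps running in the background.
   */
  def withHangDump[A](limit: FiniteDuration)(body: => A): A =
    try Await.result(Future(body), limit)
    catch {
      case e: TimeoutException =>
        System.err.println(dumpThreads())
        throw e
    }
}
```

In the test above, the write could be wrapped as `HangProbe.withHangDump(60.seconds) { x.write.mode("append").csv("/tmp/sandbox") }` (the 60 second limit is an arbitrary choice). The stack of whichever thread is blocked should then show where the write is waiting. Running the JDK's `jstack` tool against the test JVM's process ID gives similar information without changing the test.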
> Local spark under scalatest hangs writing to local disk
> -------------------------------------------------------
>
> Key: SPARK-50631
> URL: https://issues.apache.org/jira/browse/SPARK-50631
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Costas Piliotis
> Priority: Minor