EnricoMi opened a new pull request, #44836:
URL: https://github.com/apache/spark/pull/44836

   ### What changes were proposed in this pull request?
   The stdout and stderr output of the test Spark Connect server process used 
throughout the E2E tests should be discarded when not in debug mode.
   
   ### Why are the changes needed?
   Running the E2E tests only works for me in debug mode. This invocation:
   
   ```
   SPARK_DEBUG_SC_JVM_CLIENT=true JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 
SPARK_LOCAL_IP=localhost SKIP_UNIDOC=true SKIP_MIMA=true SERIAL_SBT_TESTS=1 
build/sbt -Phadoop-3 -Pspark-ganglia-lgpl -Phadoop-cloud -Pkinesis-asl 
-Pkubernetes -Pconnect -Pvolcano -Pyarn package connect-client-jvm/test
   ```
   works just fine:
   ```
   [info] Run completed in 5 minutes, 47 seconds.
   [info] Total number of tests run: 1259
   [info] Suites: completed 25, aborted 0
   [info] Tests: succeeded 1259, failed 0, canceled 6, ignored 2, pending 0
   [info] All tests passed.
   [info] Passed: Total 1261, Failed 0, Errors 0, Passed 1261, Ignored 2, 
Canceled 6
   [success] Total time: 426 s (07:06), completed 22.01.2024, 19:46:21
   ```
   
   Without debug mode, the server does not seem to start:
   ```
   JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 SPARK_LOCAL_IP=localhost 
SKIP_UNIDOC=true SKIP_MIMA=true SERIAL_SBT_TESTS=1 build/sbt -Phadoop-3 
-Pspark-ganglia-lgpl -Phadoop-cloud -Pkinesis-asl -Pkubernetes -Pconnect 
-Pvolcano -Pyarn package connect-client-jvm/test
   ```
   ```
   [info] ClientStreamingQuerySuite:
   Will start Spark Connect server with 
`spark.sql.catalogImplementation=in-memory`, some tests that rely on Hive will 
be ignored. If you don't want to skip them:
   1. Test with maven: run `build/mvn install -DskipTests -Phive` before testing
   2. Test with sbt: run test with `-Phive` profile
   [info] org.apache.spark.sql.streaming.ClientStreamingQuerySuite *** ABORTED 
*** (35 seconds, 823 milliseconds)
   [info]   org.apache.spark.sql.connect.client.RetriesExceeded:
   [info]   at 
org.apache.spark.sql.connect.client.GrpcRetryHandler$Retrying.waitAfterAttempt(GrpcRetryHandler.scala:213)
   [info]   at 
org.apache.spark.sql.connect.client.GrpcRetryHandler$Retrying.retry(GrpcRetryHandler.scala:222)
   [info]   at 
org.apache.spark.sql.connect.client.GrpcRetryHandler.retry(GrpcRetryHandler.scala:36)
   [info]   at 
org.apache.spark.sql.connect.client.CustomSparkConnectBlockingStub.$anonfun$analyzePlan$1(CustomSparkConnectBlockingStub.scala:76)
   [info]   at 
org.apache.spark.sql.connect.client.GrpcExceptionConverter.convert(GrpcExceptionConverter.scala:58)
   [info]   at 
org.apache.spark.sql.connect.client.CustomSparkConnectBlockingStub.analyzePlan(CustomSparkConnectBlockingStub.scala:75)
   [info]   at 
org.apache.spark.sql.connect.client.SparkConnectClient.analyze(SparkConnectClient.scala:83)
   [info]   at 
org.apache.spark.sql.connect.client.SparkConnectClient.analyze(SparkConnectClient.scala:211)
   [info]   at 
org.apache.spark.sql.connect.client.SparkConnectClient.analyze(SparkConnectClient.scala:182)
   [info]   at 
org.apache.spark.sql.SparkSession.version$lzycompute(SparkSession.scala:80)
   [info]   at org.apache.spark.sql.SparkSession.version(SparkSession.scala:79)
   [info]   at 
org.apache.spark.sql.test.SparkConnectServerUtils$.createSparkSession(RemoteSparkSession.scala:198)
   [info]   at 
org.apache.spark.sql.test.RemoteSparkSession.beforeAll(RemoteSparkSession.scala:214)
   [info]   at 
org.apache.spark.sql.test.RemoteSparkSession.beforeAll$(RemoteSparkSession.scala:212)
   [info]   at org.apache.spark.sql.test.QueryTest.beforeAll(QueryTest.scala:28)
   [info]   at 
org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:212)
   [info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
   [info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
   [info]   at org.apache.spark.sql.test.QueryTest.run(QueryTest.scala:28)
   [info]   at 
org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
   [info]   at 
org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
   Warning: Unable to serialize throwable of type 
org.apache.spark.sql.connect.client.RetriesExceeded for SuiteAborted(Ordinal(0, 
2),org.apache.spark.sql.connect.client.RetriesExceeded encountered when 
attempting to run suite 
org.apache.spark.sql.streaming.ClientStreamingQuerySuite,ClientStreamingQuerySuite,org.apache.spark.sql.streaming.ClientStreamingQuerySuite,Some(ClientStreamingQuerySuite),Some(org.apache.spark.sql.connect.client.RetriesExceeded),Some(35823),Some(IndentedText(org.apache.spark.sql.streaming.ClientStreamingQuerySuite,org.apache.spark.sql.connect.client.RetriesExceeded
 encountered when attempting to run suite 
org.apache.spark.sql.streaming.ClientStreamingQuerySuite,0)),Some(SeeStackDepthException),None,None,pool-1-thread-1,1705938098933),
 setting it as NotSerializableWrapperException.
   Warning: Unable to read from client, please check on client for further 
details of the problem.
   [info]   at sbt.ForkMain$Run.lambda$runTest$1(ForkMain.java:414)
   [info]   at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
   [info]   at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
   [info]   at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
   [info]   at java.base/java.lang.Thread.run(Thread.java:840)
   [info] FlatMapGroupsWithStateStreamingSuite:
   <waits forever>
   ```
   
   Discarding stdout and stderr when not in debug mode makes these tests work 
for me as expected, presumably because the server process no longer blocks on 
pipe buffers that nothing is draining.
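   A minimal sketch of the approach (illustrative names and a stand-in command, not the actual test harness code), using `ProcessBuilder.Redirect.DISCARD` to drop the child's output unless debugging is enabled:
   
   ```java
   import java.io.IOException;
   import java.lang.ProcessBuilder.Redirect;
   
   public class ServerLauncher {
       // Hypothetical helper: spawn a child process, discarding its output
       // unless a debug flag is set.
       static Process launch(boolean debug) throws IOException {
           // Stand-in command; the real tests start the Spark Connect server.
           ProcessBuilder builder =
               new ProcessBuilder("sh", "-c", "echo out; echo err 1>&2");
           if (debug) {
               // Debug mode: let the child write to this JVM's stdout/stderr.
               builder.inheritIO();
           } else {
               // Otherwise discard both streams so the child can never block
               // on a full pipe buffer that no reader is consuming.
               builder.redirectOutput(Redirect.DISCARD);
               builder.redirectError(Redirect.DISCARD);
           }
           return builder.start();
       }
   
       public static void main(String[] args) throws Exception {
           Process p = launch(false);
           p.waitFor();
           // Only this line is printed; the child's output was discarded.
           System.out.println("exit=" + p.exitValue());
       }
   }
   ```
   
   With `inheritIO()` there is no pipe at all, and with `Redirect.DISCARD` the output is routed to the OS null device, so neither mode requires a reader thread on the parent side.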
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Manually, by running the E2E tests locally without debug mode (see above).
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

