Abacn opened a new issue, #24257: URL: https://github.com/apache/beam/issues/24257
### What needs to happen?

The SparkReceiverIO performance test works well with the original test size of 600k rows and takes only a couple of seconds to run. However, after bumping the row count to 5M (#24211), the test started failing with different exceptions.

Build 71:

```
21:48:27 org.apache.beam.sdk.io.sparkreceiver.SparkReceiverIOIT > testSparkReceiverIOReadsInStreamingWithOffset FAILED
21:48:27     java.lang.AssertionError: expected:<5000000> but was:<-1>
21:48:27         at org.junit.Assert.fail(Assert.java:89)
21:48:27         at org.junit.Assert.failNotEquals(Assert.java:835)
21:48:27         at org.junit.Assert.assertEquals(Assert.java:647)
21:48:27         at org.junit.Assert.assertEquals(Assert.java:633)
21:48:27         at org.apache.beam.sdk.io.sparkreceiver.SparkReceiverIOIT.testSparkReceiverIOReadsInStreamingWithOffset(SparkReceiverIOIT.java:347)
```

Build 72:

```
09:46:52 [AMQP Connection 34.170.36.4:5672] WARN com.rabbitmq.client.impl.ForgivingExceptionHandler - An unexpected connection driver error occurred (Exception message: Socket closed)
09:47:00 java.util.concurrent.TimeoutException
09:47:02     at com.rabbitmq.utility.BlockingCell.get(BlockingCell.java:77)
09:47:02     at com.rabbitmq.utility.BlockingCell.uninterruptibleGet(BlockingCell.java:120)
09:47:02     at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:36)
09:47:02     at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:502)
09:47:04     at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:326)
09:47:04     at com.rabbitmq.client.impl.recovery.RecoveryAwareAMQConnectionFactory.newConnection(RecoveryAwareAMQConnectionFactory.java:65)
09:47:04     at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.init(AutorecoveringConnection.java:160)
09:47:04     at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1216)
09:47:04     at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1173)
09:47:06     at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1131)
09:47:06     at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1294)
09:47:06     at org.apache.beam.sdk.io.sparkreceiver.RabbitMqReceiverWithOffset.receive(RabbitMqReceiverWithOffset.java:98)
09:47:06     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
09:47:06     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
09:47:06     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
09:47:06     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
09:47:06     at java.lang.Thread.run(Thread.java:750)
09:47:06 [Test ****] ERROR org.apache.beam.sdk.testutils.metrics.MetricsReader - Failed to get metric spark_read_element_count, from namespace org.apache.beam.sdk.io.sparkreceiver.SparkReceiverIOIT
09:47:13
09:47:13 org.apache.beam.sdk.io.sparkreceiver.SparkReceiverIOIT > testSparkReceiverIOReadsInStreamingWithOffset FAILED
09:47:13     java.lang.AssertionError: expected:<5000000> but was:<-1>
09:47:13         at org.junit.Assert.fail(Assert.java:89)
09:47:13         at org.junit.Assert.failNotEquals(Assert.java:835)
09:47:13         at org.junit.Assert.assertEquals(Assert.java:647)
09:47:13         at org.junit.Assert.assertEquals(Assert.java:633)
09:47:13         at org.apache.beam.sdk.io.sparkreceiver.SparkReceiverIOIT.testSparkReceiverIOReadsInStreamingWithOffset(SparkReceiverIOIT.java:347)
```

We should not roll back #24211 just to make the test pass, as this reflects a real issue that might be seen in production when processing data at scale.

### Issue Priority

Priority: 1

### Issue Component

Component: test-failures
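For context on why both failures report `expected:<5000000> but was:<-1>`: the `ERROR ... Failed to get metric spark_read_element_count` log in Build 72 suggests the element count never reaches 5M because the metric lookup itself fails, and the reader falls back to a `-1` sentinel that the assertion then compares against the expected row count. A minimal self-contained sketch of that sentinel pattern (hypothetical names, not the actual Beam `MetricsReader` implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class Main {
    // Hypothetical sentinel, mirroring the -1 seen in the assertion failure:
    // returned when a metric cannot be found instead of throwing.
    static final long UNRECOGNIZED = -1L;

    // Stand-in for a metric lookup: returns the counter value if the
    // pipeline published it, otherwise logs an error and returns -1.
    static long getCounterMetric(Map<String, Long> metrics, String name) {
        Long value = metrics.get(name);
        if (value == null) {
            System.err.println("Failed to get metric " + name);
            return UNRECOGNIZED;
        }
        return value;
    }

    public static void main(String[] args) {
        // Simulate a run where the pipeline failed (e.g. the RabbitMQ
        // connection timed out) before publishing any counters.
        Map<String, Long> published = new HashMap<>();
        long count = getCounterMetric(published, "spark_read_element_count");
        // The test's assertEquals(5000000, count) then fails with
        // "expected:<5000000> but was:<-1>".
        System.out.println(count);
    }
}
```

If this reading is right, the assertion failure is a downstream symptom: the root cause to chase is the RabbitMQ `TimeoutException` / socket close under the 5M-row load, not the metric comparison itself.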
