[GitHub] marcoabreu opened a new issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler
marcoabreu opened a new issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler
URL: https://github.com/apache/incubator-mxnet/issues/11249

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11246/1/pipeline/

``Exception in thread "Thread-113" java.lang.IllegalArgumentException: requirement failed: Failed to start ps scheduler process with exit code 134``

```
18/06/12 17:16:53 INFO Utils: /work/mxnet/scala-package/assembly/linux-x86_64-cpu/target/mxnet-full_2.11-linux-x86_64-cpu-1.3.0-SNAPSHOT.jar has been previously copied to /tmp/spark-202cac2e-62d9-4ac4-a99a-27bca2aaf87f/userFiles-2f1680a1-aea4-4de0-8784-3300ad265be7/mxnet-full_2.11-linux-x86_64-cpu-1.3.0-SNAPSHOT.jar
18/06/12 17:16:53 INFO Executor: Fetching file:/work/mxnet/scala-package/spark/target/mxnet-spark_2.11-1.3.0-SNAPSHOT.jar with timestamp 1528823813012
18/06/12 17:16:53 INFO Utils: /work/mxnet/scala-package/spark/target/mxnet-spark_2.11-1.3.0-SNAPSHOT.jar has been previously copied to /tmp/spark-202cac2e-62d9-4ac4-a99a-27bca2aaf87f/userFiles-2f1680a1-aea4-4de0-8784-3300ad265be7/mxnet-spark_2.11-1.3.0-SNAPSHOT.jar
18/06/12 17:16:53 INFO MXNet: Starting server ...
18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:234881024+8528911
18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:67108864+33554432
18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:167772160+33554432
18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:201326592+33554432
18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:134217728+33554432
18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:33554432+33554432
18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:0+33554432
18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:100663296+33554432
18/06/12 17:16:53 INFO ParameterServer: Started process: java -cp /tmp/spark-202cac2e-62d9-4ac4-a99a-27bca2aaf87f/userFiles-2f1680a1-aea4-4de0-8784-3300ad265be7/mxnet-full_2.11-linux-x86_64-cpu-1.3.0-SNAPSHOT.jar:/tmp/spark-202cac2e-62d9-4ac4-a99a-27bca2aaf87f/userFiles-2f1680a1-aea4-4de0-8784-3300ad265be7/mxnet-spark_2.11-1.3.0-SNAPSHOT.jar org.apache.mxnet.spark.ParameterServer --role=server --root-uri=172.17.0.4 --root-port=45669 --num-server=1 --num-worker=2 --timeout=300 at 172.17.0.4:45669
18/06/12 17:16:53 INFO ParameterServer: Starting InputStream-Redirecter Thread for 172.17.0.4:45669
18/06/12 17:16:53 INFO ParameterServer: Starting ErrorStream-Redirecter Thread for 172.17.0.4:45669
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
18/06/12 17:16:53 INFO Executor: Finished task 7.0 in stage 1.0 (TID 8). 2254 bytes result sent to driver
18/06/12 17:16:53 INFO TaskSetManager: Finished task 7.0 in stage 1.0 (TID 8) in 530 ms on localhost (1/8)
18/06/12 17:16:54 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 2254 bytes result sent to driver
18/06/12 17:16:54 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 1083 ms on localhost (2/8)
18/06/12 17:16:54 INFO Executor: Finished task 5.0 in stage 1.0 (TID 6). 2254 bytes result sent to driver
18/06/12 17:16:54 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 6) in 1092 ms on localhost (3/8)
18/06/12 17:16:54 INFO Executor: Finished task 2.0 in stage 1.0 (TID 3). 2254 bytes result sent to driver
18/06/12 17:16:54 INFO Executor: Finished task 1.0 in stage 1.0 (TID 2). 2254 bytes result sent to driver
18/06/12 17:16:54 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 3) in 1093 ms on localhost (4/8)
18/06/12 17:16:54 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 2) in 1094 ms on localhost (5/8)
18/06/12 17:16:54 INFO Executor: Finished task 6.0 in stage 1.0 (TID 7). 2254 bytes result sent to driver
18/06/12 17:16:54 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 7) in 1097 ms on localhost (6/8)
18/06/12 17:16:54 INFO Executor: Finished task 3.0 in stage 1.0 (TID 4). 2254 bytes result sent to driver
18/06/12 17:16:54 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 4) in 1100 ms on localhost