[GitHub] marcoabreu opened a new issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler

2018-06-15 Thread GitBox
marcoabreu opened a new issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler
URL: https://github.com/apache/incubator-mxnet/issues/11249
 
 
   
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11246/1/pipeline/
   ``Exception in thread "Thread-113" java.lang.IllegalArgumentException: requirement failed: Failed to start ps scheduler process with exit code 134``
   
   ```
   18/06/12 17:16:53 INFO Utils: /work/mxnet/scala-package/assembly/linux-x86_64-cpu/target/mxnet-full_2.11-linux-x86_64-cpu-1.3.0-SNAPSHOT.jar has been previously copied to /tmp/spark-202cac2e-62d9-4ac4-a99a-27bca2aaf87f/userFiles-2f1680a1-aea4-4de0-8784-3300ad265be7/mxnet-full_2.11-linux-x86_64-cpu-1.3.0-SNAPSHOT.jar
   18/06/12 17:16:53 INFO Executor: Fetching file:/work/mxnet/scala-package/spark/target/mxnet-spark_2.11-1.3.0-SNAPSHOT.jar with timestamp 1528823813012
   18/06/12 17:16:53 INFO Utils: /work/mxnet/scala-package/spark/target/mxnet-spark_2.11-1.3.0-SNAPSHOT.jar has been previously copied to /tmp/spark-202cac2e-62d9-4ac4-a99a-27bca2aaf87f/userFiles-2f1680a1-aea4-4de0-8784-3300ad265be7/mxnet-spark_2.11-1.3.0-SNAPSHOT.jar
   18/06/12 17:16:53 INFO MXNet: Starting server ...
   18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:234881024+8528911
   18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:67108864+33554432
   18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:167772160+33554432
   18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:201326592+33554432
   18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:134217728+33554432
   18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:33554432+33554432
   18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:0+33554432
   18/06/12 17:16:53 INFO HadoopRDD: Input split: file:/tmp/mxnet-spark-test-15288237675593208593691818310660/train.txt:100663296+33554432
   18/06/12 17:16:53 INFO ParameterServer: Started process: java  -cp /tmp/spark-202cac2e-62d9-4ac4-a99a-27bca2aaf87f/userFiles-2f1680a1-aea4-4de0-8784-3300ad265be7/mxnet-full_2.11-linux-x86_64-cpu-1.3.0-SNAPSHOT.jar:/tmp/spark-202cac2e-62d9-4ac4-a99a-27bca2aaf87f/userFiles-2f1680a1-aea4-4de0-8784-3300ad265be7/mxnet-spark_2.11-1.3.0-SNAPSHOT.jar org.apache.mxnet.spark.ParameterServer --role=server --root-uri=172.17.0.4 --root-port=45669 --num-server=1 --num-worker=2 --timeout=300 at 172.17.0.4:45669
   18/06/12 17:16:53 INFO ParameterServer: Starting InputStream-Redirecter Thread for 172.17.0.4:45669
   18/06/12 17:16:53 INFO ParameterServer: Starting ErrorStream-Redirecter Thread for 172.17.0.4:45669
   SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
   SLF4J: Defaulting to no-operation (NOP) logger implementation
   SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
   18/06/12 17:16:53 INFO Executor: Finished task 7.0 in stage 1.0 (TID 8). 2254 bytes result sent to driver
   18/06/12 17:16:53 INFO TaskSetManager: Finished task 7.0 in stage 1.0 (TID 8) in 530 ms on localhost (1/8)
   18/06/12 17:16:54 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 2254 bytes result sent to driver
   18/06/12 17:16:54 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 1083 ms on localhost (2/8)
   18/06/12 17:16:54 INFO Executor: Finished task 5.0 in stage 1.0 (TID 6). 2254 bytes result sent to driver
   18/06/12 17:16:54 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 6) in 1092 ms on localhost (3/8)
   18/06/12 17:16:54 INFO Executor: Finished task 2.0 in stage 1.0 (TID 3). 2254 bytes result sent to driver
   18/06/12 17:16:54 INFO Executor: Finished task 1.0 in stage 1.0 (TID 2). 2254 bytes result sent to driver
   18/06/12 17:16:54 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 3) in 1093 ms on localhost (4/8)
   18/06/12 17:16:54 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 2) in 1094 ms on localhost (5/8)
   18/06/12 17:16:54 INFO Executor: Finished task 6.0 in stage 1.0 (TID 7). 2254 bytes result sent to driver
   18/06/12 17:16:54 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 7) in 1097 ms on localhost (6/8)
   18/06/12 17:16:54 INFO Executor: Finished task 3.0 in stage 1.0 (TID 4). 2254 bytes result sent to driver
   18/06/12 17:16:54 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 4) in 1100 ms on localhost
   ```

[GitHub] marcoabreu opened a new issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler

2018-06-12 Thread GitBox