[jira] [Updated] (SPARK-2586) Lack of information to figure out connection to Tachyon master is inactive/ down

2014-09-15 Thread Andrew Ash (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Ash updated SPARK-2586:
--
Description: 
When running Spark with Tachyon, if the connection to the Tachyon master goes down (because of a network problem, or because the master node itself is down), there is no clear log or error message to indicate it.

Here is a sample log from running the SparkTachyonPi example while it attempts to connect to Tachyon:

{noformat}
14/07/15 16:43:10 INFO Utils: Using Spark's default log4j profile: 
org/apache/spark/log4j-defaults.properties
14/07/15 16:43:10 WARN Utils: Your hostname, henry-pivotal.local resolves to a 
loopback address: 127.0.0.1; using 10.64.5.148 instead (on interface en5)
14/07/15 16:43:10 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another 
address
14/07/15 16:43:11 INFO SecurityManager: Changing view acls to: hsaputra
14/07/15 16:43:11 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(hsaputra)
14/07/15 16:43:11 INFO Slf4jLogger: Slf4jLogger started
14/07/15 16:43:11 INFO Remoting: Starting remoting
14/07/15 16:43:11 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://sp...@office-5-148.pa.gopivotal.com:53203]
14/07/15 16:43:11 INFO Remoting: Remoting now listens on addresses: 
[akka.tcp://sp...@office-5-148.pa.gopivotal.com:53203]
14/07/15 16:43:11 INFO SparkEnv: Registering MapOutputTracker
14/07/15 16:43:11 INFO SparkEnv: Registering BlockManagerMaster
14/07/15 16:43:11 INFO DiskBlockManager: Created local directory at 
/var/folders/nv/nsr_3ysj0wgfq93fqp0rdt3wgp/T/spark-local-20140715164311-e63c
14/07/15 16:43:11 INFO ConnectionManager: Bound socket to port 53204 with id = 
ConnectionManagerId(office-5-148.pa.gopivotal.com,53204)
14/07/15 16:43:11 INFO MemoryStore: MemoryStore started with capacity 2.1 GB
14/07/15 16:43:11 INFO BlockManagerMaster: Trying to register BlockManager
14/07/15 16:43:11 INFO BlockManagerMasterActor: Registering block manager 
office-5-148.pa.gopivotal.com:53204 with 2.1 GB RAM
14/07/15 16:43:11 INFO BlockManagerMaster: Registered BlockManager
14/07/15 16:43:11 INFO HttpServer: Starting HTTP Server
14/07/15 16:43:11 INFO HttpBroadcast: Broadcast server started at 
http://10.64.5.148:53205
14/07/15 16:43:11 INFO HttpFileServer: HTTP File server directory is 
/var/folders/nv/nsr_3ysj0wgfq93fqp0rdt3wgp/T/spark-b2fb12ae-4608-4833-87b6-b335da00738e
14/07/15 16:43:11 INFO HttpServer: Starting HTTP Server
14/07/15 16:43:12 INFO SparkUI: Started SparkUI at 
http://office-5-148.pa.gopivotal.com:4040
2014-07-15 16:43:12.210 java[39068:1903] Unable to load realm info from 
SCDynamicStore
14/07/15 16:43:12 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
14/07/15 16:43:12 INFO SparkContext: Added JAR 
examples/target/scala-2.10/spark-examples-1.1.0-SNAPSHOT-hadoop2.4.0.jar at 
http://10.64.5.148:53206/jars/spark-examples-1.1.0-SNAPSHOT-hadoop2.4.0.jar 
with timestamp 1405467792813
14/07/15 16:43:12 INFO AppClient$ClientActor: Connecting to master 
spark://henry-pivotal.local:7077...
14/07/15 16:43:12 INFO SparkContext: Starting job: reduce at 
SparkTachyonPi.scala:43
14/07/15 16:43:12 INFO DAGScheduler: Got job 0 (reduce at 
SparkTachyonPi.scala:43) with 2 output partitions (allowLocal=false)
14/07/15 16:43:12 INFO DAGScheduler: Final stage: Stage 0(reduce at 
SparkTachyonPi.scala:43)
14/07/15 16:43:12 INFO DAGScheduler: Parents of final stage: List()
14/07/15 16:43:12 INFO DAGScheduler: Missing parents: List()
14/07/15 16:43:12 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[1] at map at 
SparkTachyonPi.scala:39), which has no missing parents
14/07/15 16:43:13 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 
(MappedRDD[1] at map at SparkTachyonPi.scala:39)
14/07/15 16:43:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
14/07/15 16:43:13 INFO SparkDeploySchedulerBackend: Connected to Spark cluster 
with app ID app-20140715164313-
14/07/15 16:43:13 INFO AppClient$ClientActor: Executor added: 
app-20140715164313-/0 on 
worker-20140715164009-office-5-148.pa.gopivotal.com-52519 
(office-5-148.pa.gopivotal.com:52519) with 8 cores
14/07/15 16:43:13 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20140715164313-/0 on hostPort office-5-148.pa.gopivotal.com:52519 with 
8 cores, 512.0 MB RAM
14/07/15 16:43:13 INFO AppClient$ClientActor: Executor updated: 
app-20140715164313-/0 is now RUNNING
14/07/15 16:43:15 INFO SparkDeploySchedulerBackend: Registered executor: 
Actor[akka.tcp://sparkexecu...@office-5-148.pa.gopivotal.com:53213/user/Executor#-423405256]
 with ID 0
14/07/15 16:43:15 INFO TaskSetManager: Re-computing pending task lists.
14/07/15 16:43:15 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on executor 
0: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:43:15 INFO 
{noformat}
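The point of the report is that nothing in the log above ever says the Tachyon master is unreachable. One way to make the failure visible is a fast, explicit pre-flight reachability check that fails loudly before the job starts. The sketch below is illustrative only — the class name, the default port 19998 (Tachyon's usual master port), and the timeout are assumptions, and this is a plain TCP probe, not Spark or Tachyon API:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class TachyonMasterCheck {
    // Hypothetical pre-flight check: try to open a TCP connection to the
    // Tachyon master with a short timeout, and log a clear error instead
    // of letting the job proceed silently when the master is unreachable.
    public static boolean isMasterReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            System.err.println("ERROR: cannot reach Tachyon master at "
                    + host + ":" + port + " - " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // 19998 is Tachyon's default master port; adjust for your deployment.
        if (!isMasterReachable("localhost", 19998, 2000)) {
            System.err.println("Tachyon master appears to be down; "
                    + "check the network and the master process.");
        }
    }
}
```

A check like this, run when the Tachyon client is first initialized, would surface the exact condition the report complains about instead of leaving the log silent.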

[jira] [Updated] (SPARK-2586) Lack of information to figure out connection to Tachyon master is inactive/ down

2014-07-18 Thread Henry Saputra (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Saputra updated SPARK-2586:
-

Labels: tachyon  (was: )

 Lack of information to figure out connection to Tachyon master is inactive/ 
 down
 

 Key: SPARK-2586
 URL: https://issues.apache.org/jira/browse/SPARK-2586
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Reporter: Henry Saputra
  Labels: tachyon
