About the exception "Received LaunchTask command but executor was null"

2016-03-02 Thread Sea
Hi, all:
 Sometimes a task fails with the exception "Received LaunchTask command but 
executor was null", and I find it is a common problem:
 https://issues.apache.org/jira/browse/SPARK-13112
 https://issues.apache.org/jira/browse/SPARK-13060
 
 I have a question: why should CoarseGrainedExecutorBackend wait for the 
response from the driver? Can we initialize the executor in the onStart function 
right after the registration request is sent to the driver?
 If CoarseGrainedExecutorBackend receives messages like 'LaunchTask' or 
'KillTask', that means it has already registered successfully.
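
To make the ordering problem concrete, here is a minimal Scala sketch of the message flow being described. It is a simplified model, not the actual CoarseGrainedExecutorBackend source; the names only loosely mirror the real classes and messages.

sealed trait Msg
case object RegisteredExecutor extends Msg        // the driver's registration reply
case class LaunchTask(taskId: Long) extends Msg   // a task sent by the driver

class Executor { def launchTask(id: Long): Unit = println(s"running task $id") }

class Backend {
  // Today the executor is only created when RegisteredExecutor is processed,
  // so a LaunchTask delivered before that reply finds executor == null and the
  // backend dies with "Received LaunchTask command but executor was null".
  @volatile private var executor: Executor = null

  def receive(msg: Msg): Unit = msg match {
    case RegisteredExecutor =>
      executor = new Executor()
    case LaunchTask(id) =>
      if (executor == null) sys.error("Received LaunchTask command but executor was null")
      else executor.launchTask(id)
  }
}

The suggestion in this mail amounts to constructing the Executor before (or independently of) the RegisteredExecutor reply, so that an early LaunchTask can no longer observe a null executor.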

Re: please help with ClassNotFoundException

2015-08-13 Thread Sea
I have no idea... We use Scala. You upgraded to 1.4 so quickly... Are you 
using Spark in production? Spark 1.3 is better than Spark 1.4.


------------------ Original Message ------------------
From: "ZhouQianhao";
Sent: August 14, 2015, 11:14
To: "Sea" <261810...@qq.com>; "dev@spark.apache.org";
Subject: Re: please help with ClassNotFoundException



Hi Sea, I have updated Spark to 1.4.1, but the problem still exists. Any idea?


Sea <261810...@qq.com> wrote on August 14, 2015 at 12:36:

Yes, I guess so. I saw this bug before.




------------------ Original Message ------------------
From: "ZhouQianhao";
Sent: August 13, 2015, 9:30
To: "Sea" <261810...@qq.com>; "dev@spark.apache.org";
Subject: Re: please help with ClassNotFoundException




Hi Sea, is it the same issue as 
https://issues.apache.org/jira/browse/SPARK-8368


Sea <261810...@qq.com> wrote on August 13, 2015 at 6:52:

Are you using 1.4.0?  If yes, use 1.4.1




------------------ Original Message ------------------
From: "ZhouQianhao";
Sent: August 13, 2015, 6:04
To: "dev";
Subject: please help with ClassNotFoundException




Hi, I am using Spark 1.4 and have run into an issue.
I am trying to use the aggregate function:
// Note: the generic type parameters were stripped by the mail archive; the types
// below are a plausible reconstruction assuming a HashMap<String, TypeA> accumulator.
JavaRDD<String> rdd = ...; // some rdd
HashMap<String, TypeA> zeroValue = new HashMap<String, TypeA>();
// add initial key-value pair for zeroValue
rdd.aggregate(zeroValue,
    new Function2<HashMap<String, TypeA>, String, HashMap<String, TypeA>>() { /* implementation */ },
    new Function2<HashMap<String, TypeA>, HashMap<String, TypeA>, HashMap<String, TypeA>>() { /* implementation */ });


Here is the stack trace when I run the application:


Caused by: java.lang.ClassNotFoundException: TypeA
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at 
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:66)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at 
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
at 
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:89)
at org.apache.spark.util.Utils$.clone(Utils.scala:1458)
at org.apache.spark.rdd.RDD$$anonfun$aggregate$1.apply(RDD.scala:1049)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
at org.apache.spark.rdd.RDD.aggregate(RDD.scala:1047)
at 
org.apache.spark.api.java.JavaRDDLike$class.aggregate(JavaRDDLike.scala:413)
at 
org.apache.spark.api.java.AbstractJavaRDDLike.aggregate(JavaRDDLike.scala:47)

 However, I have checked that TypeA is in the jar file, which is on the 
classpath.
And when I use an empty HashMap as the zeroValue, the exception is gone.
Has anyone met the same problem, or can anyone help me with it?



-- 

Best Regards,
ZhouQianhao



-- 

Best Regards,
ZhouQianhao

Re: please help with ClassNotFoundException

2015-08-13 Thread Sea
Yes, I guess so. I saw this bug before.




------------------ Original Message ------------------
From: "ZhouQianhao";
Sent: August 13, 2015, 9:30
To: "Sea" <261810...@qq.com>; "dev@spark.apache.org";
Subject: Re: please help with ClassNotFoundException



Hi Sea, is it the same issue as 
https://issues.apache.org/jira/browse/SPARK-8368


Sea <261810...@qq.com> wrote on August 13, 2015 at 6:52:

Are you using 1.4.0?  If yes, use 1.4.1




------------------ Original Message ------------------
From: "ZhouQianhao";
Sent: August 13, 2015, 6:04
To: "dev";
Subject: please help with ClassNotFoundException




Hi, I am using Spark 1.4 and have run into an issue.
I am trying to use the aggregate function:
// Note: the generic type parameters were stripped by the mail archive; the types
// below are a plausible reconstruction assuming a HashMap<String, TypeA> accumulator.
JavaRDD<String> rdd = ...; // some rdd
HashMap<String, TypeA> zeroValue = new HashMap<String, TypeA>();
// add initial key-value pair for zeroValue
rdd.aggregate(zeroValue,
    new Function2<HashMap<String, TypeA>, String, HashMap<String, TypeA>>() { /* implementation */ },
    new Function2<HashMap<String, TypeA>, HashMap<String, TypeA>, HashMap<String, TypeA>>() { /* implementation */ });


Here is the stack trace when I run the application:


Caused by: java.lang.ClassNotFoundException: TypeA
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at 
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:66)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at 
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
at 
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:89)
at org.apache.spark.util.Utils$.clone(Utils.scala:1458)
at org.apache.spark.rdd.RDD$$anonfun$aggregate$1.apply(RDD.scala:1049)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
at org.apache.spark.rdd.RDD.aggregate(RDD.scala:1047)
at 
org.apache.spark.api.java.JavaRDDLike$class.aggregate(JavaRDDLike.scala:413)
at 
org.apache.spark.api.java.AbstractJavaRDDLike.aggregate(JavaRDDLike.scala:47)

 However, I have checked that TypeA is in the jar file, which is on the 
classpath.
And when I use an empty HashMap as the zeroValue, the exception is gone.
Has anyone met the same problem, or can anyone help me with it?



-- 

Best Regards,
ZhouQianhao

Re: please help with ClassNotFoundException

2015-08-13 Thread Sea
Are you using 1.4.0?  If yes, use 1.4.1




------------------ Original Message ------------------
From: "ZhouQianhao";
Sent: August 13, 2015, 6:04
To: "dev";
Subject: please help with ClassNotFoundException



Hi, I am using Spark 1.4 and have run into an issue.
I am trying to use the aggregate function:
// Note: the generic type parameters were stripped by the mail archive; the types
// below are a plausible reconstruction assuming a HashMap<String, TypeA> accumulator.
JavaRDD<String> rdd = ...; // some rdd
HashMap<String, TypeA> zeroValue = new HashMap<String, TypeA>();
// add initial key-value pair for zeroValue
rdd.aggregate(zeroValue,
    new Function2<HashMap<String, TypeA>, String, HashMap<String, TypeA>>() { /* implementation */ },
    new Function2<HashMap<String, TypeA>, HashMap<String, TypeA>, HashMap<String, TypeA>>() { /* implementation */ });


Here is the stack trace when I run the application:


Caused by: java.lang.ClassNotFoundException: TypeA
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at 
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:66)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at 
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
at 
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:89)
at org.apache.spark.util.Utils$.clone(Utils.scala:1458)
at org.apache.spark.rdd.RDD$$anonfun$aggregate$1.apply(RDD.scala:1049)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
at org.apache.spark.rdd.RDD.aggregate(RDD.scala:1047)
at 
org.apache.spark.api.java.JavaRDDLike$class.aggregate(JavaRDDLike.scala:413)
at 
org.apache.spark.api.java.AbstractJavaRDDLike.aggregate(JavaRDDLike.scala:47)

 However, I have checked that TypeA is in the jar file, which is on the 
classpath.
And when I use an empty HashMap as the zeroValue, the exception is gone.
Has anyone met the same problem, or can anyone help me with it?

Re: Asked to remove non-existent executor exception

2015-07-26 Thread Sea
This exception is so ugly!!! The screen is full of this message when the 
program runs for a long time, and it does not fail the job.

I commented it out in the source code. I think this information is useless, because 
the executor is already removed and I don't know what the executor id means.

Should we remove this message for good?
 
 
 
 15/07/23 13:26:41 ERROR SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 2...
 
15/07/23 13:26:41 ERROR SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 2...






  

 

 ------------------ Original Message ------------------
 From: "Ted Yu";
 Sent: July 26, 2015 (Sunday), 10:51 PM
 To: "Pa Rö";
 Cc: "user";
 Subject: Re: Asked to remove non-existent executor exception

 

 You can list the files in tmpfs in reverse chronological order and remove the 
oldest until you have enough space.  

 Cheers
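
As a side note, another common way to keep a tmpfs-backed /tmp from filling up is to point Spark's scratch space at a larger disk via spark.local.dir. This is a suggestion beyond what was discussed in the thread, and the directory below is hypothetical:

# conf/spark-defaults.conf (the directory is hypothetical; use any large local disk)
spark.local.dir  /data/spark-local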

 
 On Sun, Jul 26, 2015 at 12:43 AM, Pa Rö  wrote:
  I have seen that the "tmpfs" is full; how can I clear this?
   
 2015-07-23 13:41 GMT+02:00 Pa Rö :
 Hello Spark community,

I have built an application with GeoMesa, Accumulo and Spark.

It works in Spark local mode, but not on the Spark cluster; in short it says: 
No space left on device. Asked to remove non-existent executor XY.
I'm confused, because there were many GBs of free space. Do I need to change 
my configuration, or what else can I do? Thanks in advance.

Here is the complete exception:

log4j:WARN No appenders could be found for logger 
(org.apache.accumulo.fate.zookeeper.ZooSession).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/07/23 13:26:39 INFO SparkContext: Running Spark version 1.3.0
15/07/23 13:26:39 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
15/07/23 13:26:39 INFO SecurityManager: Changing view acls to: marcel
15/07/23 13:26:39 INFO SecurityManager: Changing modify acls to: marcel
15/07/23 13:26:39 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(marcel); users 
with modify permissions: Set(marcel)
15/07/23 13:26:39 INFO Slf4jLogger: Slf4jLogger started
15/07/23 13:26:40 INFO Remoting: Starting remoting
15/07/23 13:26:40 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://sparkDriver@node1-scads02:52478]
15/07/23 13:26:40 INFO Utils: Successfully started service 'sparkDriver' on 
port 52478.
15/07/23 13:26:40 INFO SparkEnv: Registering MapOutputTracker
15/07/23 13:26:40 INFO SparkEnv: Registering BlockManagerMaster
15/07/23 13:26:40 INFO DiskBlockManager: Created local directory at 
/tmp/spark-ca9319d4-68a2-4add-a21a-48b13ae9cf81/blockmgr-cbf8af23-e113-4732-8c2c-7413ad237b3b
15/07/23 13:26:40 INFO MemoryStore: MemoryStore started with capacity 1916.2 MB
15/07/23 13:26:40 INFO HttpFileServer: HTTP File server directory is 
/tmp/spark-9d4a04d5-3535-49e0-a859-d278a0cc7bf8/httpd-1882aafc-45fe-4490-803d-c04fc67510a2
15/07/23 13:26:40 INFO HttpServer: Starting HTTP Server
15/07/23 13:26:40 INFO Server: jetty-8.y.z-SNAPSHOT
15/07/23 13:26:40 INFO AbstractConnector: Started SocketConnector@0.0.0.0:56499
15/07/23 13:26:40 INFO Utils: Successfully started service 'HTTP file server' 
on port 56499.
15/07/23 13:26:40 INFO SparkEnv: Registering OutputCommitCoordinator
15/07/23 13:26:40 INFO Server: jetty-8.y.z-SNAPSHOT
15/07/23 13:26:40 INFO AbstractConnector: Started 
SelectChannelConnector@0.0.0.0:4040
15/07/23 13:26:40 INFO Utils: Successfully started service 'SparkUI' on port 
4040.
15/07/23 13:26:40 INFO SparkUI: Started SparkUI at http://node1-scads02:4040
15/07/23 13:26:40 INFO AppClient$ClientActor: Connecting to master 
akka.tcp://sparkMaster@node1-scads02:7077/user/Master...
15/07/23 13:26:40 INFO SparkDeploySchedulerBackend: Connected to Spark cluster 
with app ID app-20150723132640-
15/07/23 13:26:40 INFO AppClient$ClientActor: Executor added: 
app-20150723132640-/0 on worker-20150723132524-node3-scads06-7078 
(node3-scads06:7078) with 8 cores
15/07/23 13:26:40 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20150723132640-/0 on hostPort node3-scads06:7078 with 8 cores, 512.0 MB 
RAM
15/07/23 13:26:40 INFO AppClient$ClientActor: Executor added: 
app-20150723132640-/1 on worker-20150723132513-node2-scads05-7078 
(node2-scads05:7078) with 8 cores
15/07/23 13:26:40 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20150723132640-/1 on hostPort node2-scads05:7078 with 8 cores, 512.0 MB 
RAM
15/07/23 13:26:40 INFO AppClient$ClientActor: Executor updated: 
app-20150723132640-/0 is now RUNNING
15/07/23 13:26:40 INFO AppClient$ClientActor: Executor updated: 
app-20150723132640-/1 is now RUNNING
15/07/23 13:26:40 INFO AppClient$ClientActor: Executor updated: 
app-20150723132640-00

Re: Time is ugly in Spark Streaming....

2015-06-27 Thread Sea
Yes, things go well now. It was a problem with SimpleDateFormat. Thank you all.




------------------ Original Message ------------------
From: "Dumas Hwang";
Sent: June 27, 2015, 8:16
To: "Tathagata Das";
Cc: "Emrehan Tüzün"; "Sea" <261810...@qq.com>; "dev"; "user";
Subject: Re: Time is ugly in Spark Streaming



Java's SimpleDateFormat is not thread-safe. You can consider using 
DateTimeFormatter if you are using Java 8, or Joda-Time.
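
For reference, a small Scala sketch of that thread-safe alternative, assuming Java 8's java.time is on the classpath (BatchTimeFormat is an illustrative name; the pattern mirrors the one in the code quoted below):

import java.time.{Instant, ZoneId}
import java.time.format.DateTimeFormatter

object BatchTimeFormat {
  // DateTimeFormatter is immutable and thread-safe, so one shared instance can be
  // used from many tasks concurrently, unlike SimpleDateFormat.
  private val formatter = DateTimeFormatter
    .ofPattern("yyyy-MM-dd'T'HH:mm:ss")
    .withZone(ZoneId.systemDefault())

  // millis would be time.milliseconds, the batch time passed to foreachRDD((rdd, time) => ...)
  def format(millis: Long): String = formatter.format(Instant.ofEpochMilli(millis))
}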

On Sat, Jun 27, 2015 at 3:32 AM, Tathagata Das  wrote:
Could you print the "time" on the driver (that is, in foreachRDD but before 
RDD.foreachPartition) and see if it is behaving weird?

TD


On Fri, Jun 26, 2015 at 3:57 PM, Emrehan Tüzün wrote:
 





On Fri, Jun 26, 2015 at 12:30 PM, Sea <261810...@qq.com> wrote:

 Hi, all
 

 I found a problem in Spark Streaming: when I use the time argument in 
foreachRDD... I find the time is very interesting.
 val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicsSet)
 dataStream.map(x => createGroup(x._2, dimensions)).groupByKey().foreachRDD((rdd, time) => {
   try {
     if (!rdd.partitions.isEmpty) {
       rdd.foreachPartition(partition => {
         handlePartition(partition, timeType, time, dimensions, outputTopic, brokerList)
       })
     }
   } catch {
     case e: Exception => e.printStackTrace()
   }
 })

 val dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss")
 var date = dateFormat.format(new Date(time.milliseconds))
 
 Then I insert the 'date' into Kafka, but I found this:
  

 
{"timestamp":"2015-06-00T16:50:02","status":"3","type":"1","waittime":"0","count":17}
 
{"timestamp":"2015-06-26T16:51:13","status":"1","type":"1","waittime":"0","count":34}
 
{"timestamp":"2015-06-00T16:50:02","status":"4","type":"0","waittime":"0","count":279}
 
{"timestamp":"2015-06-26T16:52:00","status":"11","type":"1","waittime":"0","count":9}
 
{"timestamp":"0020-06-26T16:50:36","status":"7","type":"0","waittime":"0","count":1722}
 
{"timestamp":"2015-06-10T16:51:17","status":"0","type":"0","waittime":"0","count":2958}
 
{"timestamp":"2015-06-26T16:52:00","status":"0","type":"1","waittime":"0","count":114}
 
{"timestamp":"2015-06-10T16:51:17","status":"11","type":"0","waittime":"0","count":2066}
 
{"timestamp":"2015-06-26T16:52:00","status":"1","type":"0","waittime":"0","count":1539}

Re: Time is ugly in Spark Streaming....

2015-06-26 Thread Sea
Yes, I make it.




------------------ Original Message ------------------
From: "Gerard Maas";
Sent: June 26, 2015, 5:40
To: "Sea" <261810...@qq.com>;
Cc: "user"; "dev";
Subject: Re: Time is ugly in Spark Streaming



Are you sharing the SimpleDateFormat instance? This looks a lot more like the 
non-thread-safe behaviour of SimpleDateFormat (which has claimed many 
unsuspecting victims over the years) than anything 'ugly' in Spark Streaming. Try 
writing the timestamps in millis to Kafka and compare.

-kr, Gerard.


On Fri, Jun 26, 2015 at 11:06 AM, Sea <261810...@qq.com> wrote:
Hi, all


I found a problem in Spark Streaming: when I use the time argument in 
foreachRDD... I find the time is very interesting.
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, 
StringDecoder](ssc, kafkaParams, topicsSet)
dataStream.map(x => createGroup(x._2, 
dimensions)).groupByKey().foreachRDD((rdd, time) => {
  try {
if (!rdd.partitions.isEmpty) {
  rdd.foreachPartition(partition => {
handlePartition(partition, timeType, time, dimensions, outputTopic, 
brokerList)
  })
}
  } catch {
case e: Exception => e.printStackTrace()
  }
})


val dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss")
var date = dateFormat.format(new Date(time.milliseconds))

Then I insert the 'date' into Kafka, but I found this:


{"timestamp":"2015-06-00T16:50:02","status":"3","type":"1","waittime":"0","count":17}
{"timestamp":"2015-06-26T16:51:13","status":"1","type":"1","waittime":"0","count":34}
{"timestamp":"2015-06-00T16:50:02","status":"4","type":"0","waittime":"0","count":279}
{"timestamp":"2015-06-26T16:52:00","status":"11","type":"1","waittime":"0","count":9}
{"timestamp":"0020-06-26T16:50:36","status":"7","type":"0","waittime":"0","count":1722}
{"timestamp":"2015-06-10T16:51:17","status":"0","type":"0","waittime":"0","count":2958}
{"timestamp":"2015-06-26T16:52:00","status":"0","type":"1","waittime":"0","count":114}
{"timestamp":"2015-06-10T16:51:17","status":"11","type":"0","waittime":"0","count":2066}
{"timestamp":"2015-06-26T16:52:00","status":"1","type":"0","waittime":"0","count":1539}

Time is ugly in Spark Streaming....

2015-06-26 Thread Sea
Hi, all


I found a problem in Spark Streaming: when I use the time argument in 
foreachRDD... I find the time is very interesting.
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, 
StringDecoder](ssc, kafkaParams, topicsSet)
dataStream.map(x => createGroup(x._2, 
dimensions)).groupByKey().foreachRDD((rdd, time) => {
  try {
if (!rdd.partitions.isEmpty) {
  rdd.foreachPartition(partition => {
handlePartition(partition, timeType, time, dimensions, outputTopic, 
brokerList)
  })
}
  } catch {
case e: Exception => e.printStackTrace()
  }
})


val dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss")
var date = dateFormat.format(new Date(time.milliseconds))

Then I insert the 'date' into Kafka, but I found this:


{"timestamp":"2015-06-00T16:50:02","status":"3","type":"1","waittime":"0","count":17}
{"timestamp":"2015-06-26T16:51:13","status":"1","type":"1","waittime":"0","count":34}
{"timestamp":"2015-06-00T16:50:02","status":"4","type":"0","waittime":"0","count":279}
{"timestamp":"2015-06-26T16:52:00","status":"11","type":"1","waittime":"0","count":9}
{"timestamp":"0020-06-26T16:50:36","status":"7","type":"0","waittime":"0","count":1722}
{"timestamp":"2015-06-10T16:51:17","status":"0","type":"0","waittime":"0","count":2958}
{"timestamp":"2015-06-26T16:52:00","status":"0","type":"1","waittime":"0","count":114}
{"timestamp":"2015-06-10T16:51:17","status":"11","type":"0","waittime":"0","count":2066}
{"timestamp":"2015-06-26T16:52:00","status":"1","type":"0","waittime":"0","count":1539}

Re: Spark-sql(yarn-client) java.lang.NoClassDefFoundError: org/apache/spark/deploy/yarn/ExecutorLauncher

2015-06-18 Thread Sea
Thanks, Yin Huai.
I worked it out.
I used JDK 1.7 to build Spark 1.4.0, but my YARN cluster runs on JDK 1.6.
The java.version in pom.xml is 1.6, though, and the exception confused me.




------------------ Original Message ------------------
From: "Yin Huai";
Sent: June 18, 2015, 11:19
To: "Sea" <261810...@qq.com>;
Cc: "dev";
Subject: Re: Spark-sql(yarn-client) java.lang.NoClassDefFoundError: org/apache/spark/deploy/yarn/ExecutorLauncher



Is it the full stack trace?

On Thu, Jun 18, 2015 at 6:39 AM, Sea <261810...@qq.com> wrote:
Hi, all:


I want to run Spark SQL on YARN (yarn-client), but... I already set 
"spark.yarn.jar" and "spark.jars" in conf/spark-defaults.conf.
./bin/spark-sql -f game.sql --executor-memory 2g --num-executors 100 > game.txt
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/spark/deploy/yarn/ExecutorLauncher
Caused by: java.lang.ClassNotFoundException: 
org.apache.spark.deploy.yarn.ExecutorLauncher
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.spark.deploy.yarn.ExecutorLauncher.  
Program will exit.







Can anyone help?
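
For context, the kind of spark-defaults.conf entries being referred to would look roughly like this; the values below are hypothetical placeholders, not taken from the original setup:

# conf/spark-defaults.conf (illustrative paths only)
spark.yarn.jar   hdfs:///user/spark/lib/spark-assembly-1.4.0-hadoop2.2.0.jar
spark.jars       hdfs:///user/spark/lib/my-app-dependencies.jar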

Spark-sql(yarn-client) java.lang.NoClassDefFoundError: org/apache/spark/deploy/yarn/ExecutorLauncher

2015-06-18 Thread Sea
Hi, all:


I want to run Spark SQL on YARN (yarn-client), but... I already set 
"spark.yarn.jar" and "spark.jars" in conf/spark-defaults.conf.
./bin/spark-sql -f game.sql --executor-memory 2g --num-executors 100 > game.txt
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/spark/deploy/yarn/ExecutorLauncher
Caused by: java.lang.ClassNotFoundException: 
org.apache.spark.deploy.yarn.ExecutorLauncher
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.spark.deploy.yarn.ExecutorLauncher.  
Program will exit.





Can anyone help?

Spark-sql(yarn-client) java.lang.NoClassDefFoundError: org/apache/spark/deploy/yarn/ExecutorLauncher

2015-06-18 Thread Sea
Hi, all:

I want to run Spark SQL on YARN (yarn-client), but... I already set 
"spark.yarn.jar" and "spark.jars" in conf/spark-defaults.conf.
./bin/spark-sql -f game.sql --executor-memory 2g --num-executors 100 > game.txt
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/spark/deploy/yarn/ExecutorLauncher
Caused by: java.lang.ClassNotFoundException: 
org.apache.spark.deploy.yarn.ExecutorLauncher
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.spark.deploy.yarn.ExecutorLauncher.  
Program will exit.





Can anyone help?

Re: About HostName display in SparkUI

2015-06-15 Thread Sea
In the conf/slaves file, I have hostnames.
Before 1.4.0, it was okay. I looked at the code in class org.apache.spark.util.Utils, 
altered the functions localHostName and localHostNameForURI, and it went back to 
hostnames again.
I just don't know why these basic functions were changed. Hostname is nice.
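
For anyone comparing the two behaviours, a small Scala sketch of the difference being described; this is illustrative only, not the actual Utils code, and the example values in the comments are made up:

import java.net.InetAddress

object HostNameVsIp {
  def main(args: Array[String]): Unit = {
    val addr = InetAddress.getLocalHost
    // What the UI shows after the change: the raw IP address, e.g. "10.0.0.12"
    println(addr.getHostAddress)
    // What it showed before: the machine's hostname, e.g. "node1"
    println(addr.getHostName)
  }
}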




------------------ Original Message ------------------
From: "Akhil Das";
Sent: June 15, 2015, 5:36
To: "Sea" <261810...@qq.com>;
Cc: "dev";
Subject: Re: About HostName display in SparkUI



In the conf/slaves file, are you having the ip addresses? or the hostnames?


Thanks
Best Regards



 
On Sat, Jun 13, 2015 at 9:51 PM, Sea <261810...@qq.com> wrote:
In Spark 1.4.0, I find that the Address is an IP address (it was a hostname in v1.3.0). 
Why? Who did it?



About HostName display in SparkUI

2015-06-13 Thread Sea
In Spark 1.4.0, I find that the Address is an IP address (it was a hostname in v1.3.0). 
Why? Who did it?

Exception in using updateStateByKey

2015-04-27 Thread Sea
Hi, all:
I use the function updateStateByKey in Spark Streaming. I need to store the states 
for one minute, so I set "spark.cleaner.ttl" to 120; the batch duration is 2 seconds, 
but it throws an exception:




Caused by: 
org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does 
not exist: spark/ck/hdfsaudit/receivedData/0/log-1430139541443-1430139601443
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:51)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1499)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1448)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1428)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1402)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:468)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:269)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59566)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)


at org.apache.hadoop.ipc.Client.call(Client.java:1347)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:188)
at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)



Why?


My code is:


ssc = StreamingContext(sc, 2)
kvs = KafkaUtils.createStream(ssc, zkQuorum, group, {topic: 1})
kvs.window(60, 2).map(lambda x: analyzeMessage(x[1])) \
    .filter(lambda x: x[1] != None).updateStateByKey(updateStateFunc) \
    .filter(lambda x: x[1]['isExisted'] != 1) \
    .foreachRDD(lambda rdd: rdd.foreachPartition(insertIntoDb))

Re: Filesystem closed Exception

2015-03-21 Thread Sea
Hi, Vinodkc,
Yes, I found another solution: https://github.com/apache/spark/pull/4771/
I will test it later.




------------------ Original Message ------------------
From: "vinodkc";
Sent: March 21, 2015 (Saturday), 4:52 PM
To: "dev";
Subject: Re: Filesystem closed Exception



Hi Sea,

I've raised a  JIRA Issue on this :
https://issues.apache.org/jira/browse/SPARK-6445 . Making a PR  now

On Sat, Mar 21, 2015 at 11:06 AM, Sea [via Apache Spark Developers List] <
ml-node+s1001551n11145...@n3.nabble.com> wrote:

> Hi, all:
>
>
>
>
> When I exit the console of spark-sql, the following exception is thrown:
>
>
> My spark version is 1.3.0, hadoop version is 2.2.0
>
>
> Exception in thread "Thread-3" java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:629)
> at
> org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1677)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
>
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
>
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
> at
> org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:196)
>
> at
> org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
>
> at
> org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
>
> at scala.Option.foreach(Option.scala:236)
> at org.apache.spark.SparkContext.stop(SparkContext.scala:1388)
> at
> org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.stop(SparkSQLEnv.scala:66)
>
> at
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$$anon$1.run(SparkSQLCLIDriver.scala:107)‍
>
>





Filesystem closed Exception

2015-03-20 Thread Sea
Hi, all:




When I exit the console of spark-sql, the following exception is thrown:


My spark version is 1.3.0, hadoop version is 2.2.0


Exception in thread "Thread-3" java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:629)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1677)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
at 
org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:196)
at 
org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
at 
org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1388)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.stop(SparkSQLEnv.scala:66)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$$anon$1.run(SparkSQLCLIDriver.scala:107)‍

Filesystem closed Exception

2015-03-20 Thread Sea
Hi, all:




When I exit the console of spark-sql, the following exception is thrown:


Exception in thread "Thread-3" java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:629)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1677)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
at 
org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:196)
at 
org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
at 
org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1388)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.stop(SparkSQLEnv.scala:66)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$$anon$1.run(SparkSQLCLIDriver.scala:107)‍

InvalidAuxServiceException in dynamicAllocation

2015-03-17 Thread Sea
Hi, all:


Spark1.3.0 hadoop2.2.0


I put the following params in the spark-defaults.conf 


spark.dynamicAllocation.enabled true
spark.dynamicAllocation.minExecutors 20
spark.dynamicAllocation.maxExecutors 300
spark.dynamicAllocation.executorIdleTimeout 300
spark.shuffle.service.enabled true‍



I started the Thrift server and ran a query. An exception happened!
I found it in JIRA: https://issues.apache.org/jira/browse/SPARK-5759
It says the fix version is 1.3.0.


Caused by: org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The 
auxService:spark_shuffle does not exist
at sun.reflect.GeneratedConstructorAccessor28.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:152)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.hadoop.yarn.client.api.impl.NMClientImpl.startContainer(NMClientImpl.java:203)
at org.apache.spark.deploy.yarn.ExecutorRunnable.startContainer(ExecutorRunnable.scala:113)
... 4 more
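
As a side note for readers hitting the same error: "The auxService:spark_shuffle does not exist" usually means the external shuffle service required by spark.shuffle.service.enabled has not been registered as a YARN auxiliary service on the NodeManagers. A sketch of the yarn-site.xml entries described in the Spark-on-YARN docs follows; the surrounding configuration is assumed, not taken from this thread.

<!-- yarn-site.xml on every NodeManager (sketch; merge with any existing aux-services value) -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>

The spark-<version>-yarn-shuffle jar also has to be on the NodeManager classpath, and the NodeManagers restarted, before the setting takes effect.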
