Do you have the full stack trace? Could you check if it's the same as
https://issues.apache.org/jira/browse/SPARK-10422
Best Regards,
Shixiong Zhu
2015-10-01 17:05 GMT+08:00 Eyad Sibai <eyad.alsi...@gmail.com>:
> Hi
>
> I am trying to call .persist() on a dataframe but once I e
Which version are you using? Could you take a look at the new Streaming UI
in 1.4.0?
Best Regards,
Shixiong Zhu
2015-09-29 7:52 GMT+08:00 Siva <sbhavan...@gmail.com>:
> Hi,
>
> Could someone recommend the monitoring tools for spark streaming?
>
> By extending Streamin
enough space.
Best Regards,
Shixiong Zhu
2015-09-29 1:04 GMT+08:00 swetha <swethakasire...@gmail.com>:
>
> Hi,
>
> I see a lot of data getting filled locally as shown below from my streaming
> job. I have my checkpoint set to hdfs. But, I still see the following data
> fi
"count" Spark jobs will run in parallel.
Moreover, "spark.streaming.concurrentJobs" is an internal configuration and
it may be changed in future.
Best Regards,
Shixiong Zhu
2015-09-26 3:34 GMT+08:00 Atul Kulkarni <atulskulka...@gmail.com>:
> Can someone please he
You can change "spark.sql.broadcastTimeout" to increase the timeout. The
default value is 300 seconds.
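A minimal sketch of raising the timeout, assuming a Spark 1.x SQLContext
named "sqlContext" is already in scope:

```scala
// Raise the broadcast join timeout from the 300-second default to 20 minutes.
sqlContext.setConf("spark.sql.broadcastTimeout", "1200")

// Equivalently, when building the context up front:
// val conf = new SparkConf().set("spark.sql.broadcastTimeout", "1200")
```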
Best Regards,
Shixiong Zhu
2015-09-24 15:16 GMT+08:00 Eyad Sibai <eyad.alsi...@gmail.com>:
> I am trying to join two tables using dataframes using python 3.4 and I am
>
Looks like you have an incompatible hbase-default.xml somewhere on your
classpath. You can use the following code to find the location of
"hbase-default.xml":
println(Thread.currentThread().getContextClassLoader().getResource("hbase-default.xml"))
Best Regards,
Shixiong Zhu
2015-09-21
. RDD.compute: this will run in the executor and the location is not
guaranteed. E.g.,
DStream.foreachRDD(rdd => rdd.foreach { v =>
println(v)
})
"println(v)" is called in the executor.
Best Regards,
Shixiong Zhu
2015-09-17 3:47 GMT+08:00 Renyi Xiong <renyixio...@gmail.com>:
&
Looks like you return a "Some(null)" in "compute". If you don't want to
create an RDD, it should return None. If you want to return an empty RDD,
it should return "Some(sc.emptyRDD)".
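The difference is visible with plain Option, no Spark needed (a hedged
sketch: callers of "compute" check whether the Option is defined, not
whether the value inside it is null):

```scala
// None means "no RDD for this batch" and is skipped cleanly by callers.
val none: Option[String] = None
assert(none.isEmpty)

// Some(null) looks like a real result, so callers will dereference the
// null inside it and hit a NullPointerException later.
val someNull: Option[String] = Some(null)
assert(someNull.isDefined)
```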
Best Regards,
Shixiong Zhu
2015-09-15 2:51 GMT+08:00 Juan Rodríguez Hortalá <
Could you send a PR to fix it? Thanks!
Best Regards,
Shixiong Zhu
2015-12-08 13:31 GMT-08:00 Richard Marscher <rmarsc...@localytics.com>:
> Alright I was able to work through the problem.
>
> So the owning thread was one from the executor task launch worker, which
> at least
Which version are you using? Could you post these thread names here?
Best Regards,
Shixiong Zhu
2015-12-07 14:30 GMT-08:00 Richard Marscher <rmarsc...@localytics.com>:
> Hi,
>
> I've been running benchmarks against Spark in local mode in a long running
> process. I'm seeing th
Best Regards,
Shixiong Zhu
2015-12-17 4:39 GMT-08:00 Bartłomiej Alberski <albers...@gmail.com>:
> I prepared simple example helping in reproducing problem:
>
> https://github.com/alberskib/spark-streaming-broadcast-issue
>
> I think that in that way it will be easier for you
It doesn't guarantee that. E.g.,
scala> sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0), 2).filter(_ >
2.0).zipWithUniqueId().collect().foreach(println)
(3.0,1)
(4.0,3)
It only guarantees that the ids are unique.
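The ids come from the scheme "index within partition * numPartitions +
partitionIndex", which a pure-Scala sketch can reproduce without a cluster
(assuming, as in the run above, that after the filter partition 0 is empty
and partition 1 holds 3.0 and 4.0):

```scala
// Sketch of zipWithUniqueId's id scheme: ids are unique but not consecutive.
val numPartitions = 2
val parts = Seq(Seq.empty[Double], Seq(3.0, 4.0)) // contents after filter(_ > 2.0)

val zipped = parts.zipWithIndex.flatMap { case (part, p) =>
  part.zipWithIndex.map { case (v, i) => (v, i.toLong * numPartitions + p) }
}

println(zipped) // List((3.0,1), (4.0,3)) - the same ids as the Spark run above
```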
Best Regards,
Shixiong Zhu
2015-12-13 10:18 GMT-08:00 Sourav Mazumder <sourav.m
Hey Rachana, could you provide the full jstack outputs? Maybe it's the same as
https://issues.apache.org/jira/browse/SPARK-11104
Best Regards,
Shixiong Zhu
2016-01-04 12:56 GMT-08:00 Rachana Srivastava <
rachana.srivast...@markmonitor.com>:
> Hello All,
>
>
>
> I am running my
Looks like you need to add a "driver" option to your code, such as
sqlContext.read.format("jdbc").options(
Map("url" -> "jdbc:oracle:thin:@:1521:xxx",
"driver" -> "oracle.jdbc.driver.OracleDriver",
"dbtable"
Just replace `localhost` with a host name that can be accessed by Yarn
containers.
Best Regards,
Shixiong Zhu
2015-12-22 0:11 GMT-08:00 prasadreddy <alle.re...@gmail.com>:
> How do we achieve this on yarn-cluster mode
>
> Please advice.
>
> Thanks
> Prasad
>
&
Looks like you have a reference to some Akka class. Could you post your code?
Best Regards,
Shixiong Zhu
2015-12-17 23:43 GMT-08:00 Pankaj Narang <pankajnaran...@gmail.com>:
> I am encountering below error. Can somebody guide ?
>
> Something similar is one this link
> https://
You are right. "checkpointInterval" is only for data checkpointing.
"metadata checkpoint" is done for each batch. Feel free to send a PR to add
the missing doc.
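A sketch of the two knobs, assuming a StreamingContext named "ssc" and a
stateful DStream named "stateStream" (both names are placeholders):

```scala
// Enables checkpointing; metadata checkpoints happen every batch regardless.
ssc.checkpoint("hdfs:///tmp/checkpoints")

// "checkpointInterval" controls how often the stream's *data* (RDD lineage)
// is checkpointed; it has no effect on metadata checkpointing.
stateStream.checkpoint(Seconds(50))
```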
Best Regards,
Shixiong Zhu
2015-12-18 8:26 GMT-08:00 Lan Jiang <ljia...@gmail.com>:
> Need some clarific
What's the Scala version of your Spark? Is it 2.10?
Best Regards,
Shixiong Zhu
2015-12-17 10:10 GMT-08:00 Christos Mantas <cman...@cslab.ece.ntua.gr>:
> Hello,
>
> I am trying to set up a simple example with Spark Streaming (Python) and
> Kafka on a single machine deployment.
Hey Eyal, I just checked the couchbase spark connector jar. The target
version of some of the classes is Java 8 (class file version 52.0). You can
create a ticket in https://issues.couchbase.com/projects/SPARKC
Best Regards,
Shixiong Zhu
2015-11-26 9:03 GMT-08:00 Ted Yu <yuzhih...@gmail.com>:
> StoreMod