Re: Handling Empty RDD

2016-05-22 Thread Yogesh Vyas
>> Hi,
>> I am reading files using textFileStream, performing some action on it and then saving it to HDFS using saveAsTextFile.
>> But whenever there is no file to read, Spark will write an empty RDD ( [] ) to HDFS.
>> So, how do I handle the empty RDD?

Re: Handling Empty RDD

2016-05-22 Thread Ted Yu
> … some action on it and then saving it to HDFS using saveAsTextFile.
> But whenever there is no file to read, Spark will write an empty RDD ( [] ) to HDFS.
> So, how do I handle the empty RDD?
>
> I checked rdd.isEmpty() and rdd.count > 0 …

Handling Empty RDD

2016-05-22 Thread Yogesh Vyas
Hi,

I am reading files using textFileStream, performing some action on it and then saving it to HDFS using saveAsTextFile. But whenever there is no file to read, Spark will write an empty RDD ( [] ) to HDFS. So, how do I handle the empty RDD?

I checked rdd.isEmpty() and rdd.count > 0, but …
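A common way to avoid writing empty output in this situation is to check each micro-batch before saving. A minimal sketch, assuming the input/output paths and the batch interval are placeholders:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SkipEmptyBatches {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SkipEmptyBatches")
    val ssc = new StreamingContext(conf, Seconds(10))

    val lines = ssc.textFileStream("hdfs:///input/dir") // placeholder path

    lines.foreachRDD { (rdd, time) =>
      // isEmpty() only inspects as many partitions as needed to find a
      // first element, so it is cheaper than count() for this check.
      if (!rdd.isEmpty()) {
        rdd.saveAsTextFile(s"hdfs:///output/dir/batch-${time.milliseconds}")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Guarding the save inside foreachRDD means nothing at all is written to HDFS for batches with no input files.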

Re: partition an empty RDD

2016-04-07 Thread Tenghuan He
… in your code, and it is required.

> On Thu, Apr 7, 2016 at 5:52 AM, Tenghuan He <tenghua...@gmail.com> wrote:
>
> Hi all,
>
> I want to create an empty RDD and partition it:
>
> val buffer: RDD[(K, (V, Int))] = base.context.emptyRDD[(K, (V, …

Re: partition an empty RDD

2016-04-07 Thread Sean Owen
It means pretty much what it says. Your code does not have runtime class info about K at this point, and it is required.

On Thu, Apr 7, 2016 at 5:52 AM, Tenghuan He <tenghua...@gmail.com> wrote:
> Hi all,
>
> I want to create an empty RDD and partition it:
>
> va…

partition an empty RDD

2016-04-06 Thread Tenghuan He
Hi all,

I want to create an empty RDD and partition it:

val buffer: RDD[(K, (V, Int))] = base.context.emptyRDD[(K, (V, Int))].partitionBy(new HashPartitioner(5))

but got Error: No ClassTag available for K. Scala needs runtime information about K, but how do I solve this?

Thanks
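The usual fix for Sean's point is to require a ClassTag at the point where the RDD is created, e.g. with a context bound. A minimal sketch, assuming the poster's code lives in a generic method (the method name is illustrative):

```scala
import scala.reflect.ClassTag

import org.apache.spark.{HashPartitioner, SparkContext}
import org.apache.spark.rdd.RDD

// The `K : ClassTag` context bound supplies the runtime class information
// that both emptyRDD and the pair-RDD conversion behind partitionBy need.
def emptyPartitionedBuffer[K : ClassTag, V : ClassTag](
    sc: SparkContext,
    numPartitions: Int): RDD[(K, (V, Int))] =
  sc.emptyRDD[(K, (V, Int))].partitionBy(new HashPartitioner(numPartitions))
```

Callers with concrete types (for which ClassTags are always available implicitly) can then invoke it as, say, emptyPartitionedBuffer[String, Long](sc, 5).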

Re: Spark Streaming - Travis CI and GitHub custom receiver - continuous data but empty RDD?

2016-03-05 Thread Ted Yu
> …TY")} else
> {rdd.collect().foreach(event => println(event.getRepo.getName + " " + event.getId))}
> })
>
> ctx.start()
> ctx.awaitTermination()
>
> Thanks in advance!

Spark Streaming - Travis CI and GitHub custom receiver - continuous data but empty RDD?

2016-03-05 Thread Dominik Safaric
…else {rdd.collect().foreach(event => println(event.getRepo.getName + " " + event.getId))} })

ctx.start()
ctx.awaitTermination()

Thanks in advance!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Travis-CI-and-GitHub-custom-recei…

Re: registering an empty RDD as a temp table in a PySpark SQL context

2015-08-18 Thread Hemant Bhanawat
> … are used to verify the schema:
> https://github.com/apache/spark/blob/branch-1.3/python/pyspark/sql/context.py#L299
>
> Before I attempt to extend the Scala code to handle an empty RDD or provide an empty DataFrame that can be registered, I was wondering what people recommend in this case. Perhaps …

registering an empty RDD as a temp table in a PySpark SQL context

2015-08-17 Thread Eric Walker
… to verify the schema:
https://github.com/apache/spark/blob/branch-1.3/python/pyspark/sql/context.py#L299

Before I attempt to extend the Scala code to handle an empty RDD or provide an empty DataFrame that can be registered, I was wondering what people recommend in this case. Perhaps there's …
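One workaround that sidesteps schema inference entirely is to build the empty DataFrame from an explicit schema, so nothing has to be sampled from the (empty) data. A sketch against the Scala 1.3-era API (the table and column names are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val sc = new SparkContext(new SparkConf().setAppName("EmptyTempTable"))
val sqlContext = new SQLContext(sc)

// With an explicit schema there is no inference step to fail on empty input.
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)))

val emptyDF = sqlContext.createDataFrame(sc.emptyRDD[Row], schema)
emptyDF.registerTempTable("my_table") // 1.x API; later versions use createOrReplaceTempView
```

Queries against "my_table" then simply return zero rows instead of failing at registration time.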

Re: How to create empty RDD

2015-07-07 Thread ๏̯͡๏
It worked, Zhou.

On Mon, Jul 6, 2015 at 10:43 PM, Wei Zhou zhweisop...@gmail.com wrote:
> I used val output: RDD[(DetailInputRecord, VISummary)] = sc.emptyRDD[(DetailInputRecord, VISummary)] to create an empty RDD before. Give it a try, it might work for you too.
>
> 2015-07-06 14:11 GMT-07:00 ÐΞ…

Re: How to create empty RDD

2015-07-06 Thread Richard Marscher
This should work:

val output: RDD[(DetailInputRecord, VISummary)] = sc.parallelize(Seq.empty[(DetailInputRecord, VISummary)])

On Mon, Jul 6, 2015 at 5:11 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
> I need to return an empty RDD of type val output: RDD[(DetailInputRecord, VISummary…

How to create empty RDD

2015-07-06 Thread ๏̯͡๏
I need to return an empty RDD of type:

val output: RDD[(DetailInputRecord, VISummary)]

This does not work:

val output: RDD[(DetailInputRecord, VISummary)] = new RDD()

as RDD is an abstract class. How do I create an empty RDD?

--
Deepak

Re: How to create empty RDD

2015-07-06 Thread Wei Zhou
I used val output: RDD[(DetailInputRecord, VISummary)] = sc.emptyRDD[(DetailInputRecord, VISummary)] to create an empty RDD before. Give it a try, it might work for you too.

2015-07-06 14:11 GMT-07:00 ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com:
> I need to return an empty RDD of type val output: RDD…
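Both suggestions from this thread can be sketched side by side; DetailInputRecord and VISummary here are placeholder case classes standing in for the poster's own types:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Placeholder types standing in for the poster's classes.
case class DetailInputRecord(id: String)
case class VISummary(total: Long)

val sc = new SparkContext(new SparkConf().setAppName("EmptyRDD"))

// Option 1: emptyRDD creates an RDD with zero partitions.
val viaEmptyRDD: RDD[(DetailInputRecord, VISummary)] =
  sc.emptyRDD[(DetailInputRecord, VISummary)]

// Option 2: parallelizing an empty collection creates an RDD whose
// partitions (the default number of them) are all empty.
val viaParallelize: RDD[(DetailInputRecord, VISummary)] =
  sc.parallelize(Seq.empty[(DetailInputRecord, VISummary)])
```

Either satisfies the type RDD[(DetailInputRecord, VISummary)]; emptyRDD is the more direct expression of intent.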

Empty RDD?

2015-04-08 Thread Vadim Bichutskiy
When I call *transform* or *foreachRDD* on a *DStream*, I keep getting an error that I have an empty RDD, which makes sense since my batch interval may be smaller than the rate at which new data are coming in. How do I guard against it?

Thanks,
Vadim

Re: Empty RDD?

2015-04-08 Thread Tathagata Das
Aah yes. The jsonRDD method needs to walk through the whole RDD to understand the schema, and does not work if there is no data in it. Checking whether there is data in it using take(1) should work.

TD
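The check TD describes can be sketched as follows; jsonRDD and registerTempTable are from the 1.x API, and the table name is illustrative:

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.dstream.DStream

// `stream` is assumed to be a DStream[String] of JSON records.
def saveNonEmptyBatches(stream: DStream[String], sqlContext: SQLContext): Unit = {
  stream.foreachRDD { rdd =>
    // take(1) touches as few partitions as possible, so it is a cheap way
    // to find out whether this micro-batch actually contains data before
    // asking jsonRDD to infer a schema from it.
    if (rdd.take(1).nonEmpty) {
      val df = sqlContext.jsonRDD(rdd) // 1.x API; read.json(rdd) in later versions
      df.registerTempTable("events")
    }
  }
}
```

Batches that arrive empty are simply skipped, so jsonRDD never sees an RDD it cannot infer a schema from.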

Re: Empty RDD?

2015-04-08 Thread Vadim Bichutskiy
Thanks TD!

On Apr 8, 2015, at 9:36 PM, Tathagata Das t...@databricks.com wrote:
> Aah yes. The jsonRDD method needs to walk through the whole RDD to understand the schema, and does not work if there is no data in it. Checking whether there is data in it using take(1) should work.
>
> TD

Re: Empty RDD?

2015-04-08 Thread Tathagata Das
…@gmail.com wrote:
> When I call *transform* or *foreachRDD* on a *DStream*, I keep getting an error that I have an empty RDD, which makes sense since my batch interval may be smaller than the rate at which new data are coming in. How do I guard against it?
>
> Thanks,
> Vadim

Re: How to create an empty RDD with a given type?

2015-01-12 Thread Xuelin Cao
… 12, 2015 at 9:50 PM, Xuelin Cao xuelincao2...@gmail.com wrote:
> Hi,
>
> I'd like to create a transform function that converts RDD[String] to RDD[Int]. Occasionally, the input RDD could be an empty RDD. I just want to directly create an empty RDD[Int] if the input RDD is empty. And, I …

Re: How to create an empty RDD with a given type?

2015-01-12 Thread Justin Yip
… a transform function that converts RDD[String] to RDD[Int]. Occasionally, the input RDD could be an empty RDD. I just want to directly create an empty RDD[Int] if the input RDD is empty. And, I don't want to return None as the result. Is there an easy way to do that?
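A sketch of the kind of function being described; the actual string-to-int mapping is a placeholder:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Converts an RDD[String] to RDD[Int]. For an empty input it returns an
// empty RDD[Int] (via emptyRDD) rather than None, so callers never need
// to special-case the result type.
def toInts(sc: SparkContext, input: RDD[String]): RDD[Int] =
  if (input.isEmpty()) sc.emptyRDD[Int]
  else input.map(_.toInt) // placeholder conversion
```

Note that map on an empty RDD already yields an empty RDD, so the explicit branch is only needed when one wants a zero-partition emptyRDD specifically.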

RE: Spark Streaming empty RDD issue

2014-12-04 Thread Shao, Saisai
… Streaming empty RDD issue

Hi experts,

I am using Spark Streaming to integrate Kafka for real-time data processing. I am facing some issues related to Spark Streaming, so I want to know how we can detect:
1) our connection has been lost
2) our receiver is down
3) Spark Streaming has no new messages …

Spark Streaming empty RDD issue

2014-12-03 Thread Hafiz Mujadid
… these issues? I will be glad to hear from you and will be thankful to you.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-empty-RDD-issue-tp20329.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
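For the receiver-health part of the question, Spark Streaming exposes a StreamingListener that can be registered on the context. A minimal sketch (the log messages and class name are illustrative):

```scala
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.scheduler.{
  StreamingListener, StreamingListenerReceiverError, StreamingListenerReceiverStopped}

class ReceiverHealthListener extends StreamingListener {
  // Called when a receiver reports an error.
  override def onReceiverError(err: StreamingListenerReceiverError): Unit =
    println(s"Receiver error: ${err.receiverInfo.lastErrorMessage}")

  // Called when a receiver has stopped, e.g. after losing its connection.
  override def onReceiverStopped(stopped: StreamingListenerReceiverStopped): Unit =
    println(s"Receiver stopped: ${stopped.receiverInfo.name}")
}

// Registration, given an existing StreamingContext `ssc`:
def register(ssc: StreamingContext): Unit =
  ssc.addStreamingListener(new ReceiverHealthListener)
```

Detecting "no new messages" is separate: an rdd.isEmpty() check inside foreachRDD covers that case.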

Re: Spark Streaming empty RDD issue

2014-12-03 Thread Akhil Das
> … and will be thankful to you.

Re: How to not write empty RDD partitions in RDD.saveAsTextFile()

2014-10-20 Thread Yi Tian
I think you could use `repartition` to make sure there are no empty partitions. You could also try `coalesce` to combine partitions, but it can't guarantee there will be no empty partitions left.

Best Regards,

Yi Tian
tianyi.asiai...@gmail.com

On Oct 18, 2014, at 20:30, …
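The two suggestions can be sketched as follows; whether repartition actually eliminates every empty partition depends on there being enough records to go around, so both are heuristics rather than guarantees:

```scala
import org.apache.spark.rdd.RDD

def withoutEmptyPartitions[T](rdd: RDD[T], numPartitions: Int): RDD[T] =
  // Full shuffle that spreads records evenly across numPartitions;
  // given enough records, no partition stays empty.
  rdd.repartition(numPartitions)

def cheaperButApproximate[T](rdd: RDD[T], numPartitions: Int): RDD[T] =
  // Merges existing partitions without a shuffle; cheaper, but it may
  // still leave empty partitions behind.
  rdd.coalesce(numPartitions)
```

Applying either before saveAsTextFile reduces or removes the empty part-files, at the cost of an extra stage (or shuffle) per save.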