Hi,
I have a partitioned table in Hive (Avro) that I can query fine from the Hive
CLI.
When using SparkSQL, I'm able to query some of the partitions, but I get an
exception on others.
The query is:
sqlContext.sql("select * from myTable where source='http' and date = 20150812").take(
(notFilter).checkpoint(interval)
toNotUpdate.foreachRDD(rdd =>
  pending = rdd
)
Thanks
On 3 August 2015 at 13:09, Tathagata Das wrote:
> Can you tell us more about your streaming app? Which DStream operations are
> you using?
>
> On Sun, Aug 2, 2015 at 9:14 PM, Anand Nalya wrote:
>
Hi,
I'm writing a Streaming application in Spark 1.3. After running for some
time, I'm getting the following exception. I'm sure that no other process is
modifying the HDFS file. Any idea what might be the cause of this?
15/08/02 21:24:13 ERROR scheduler.DAGSchedulerEventProcessLoop:
DAGSchedulerEv
I also ran into a similar use case. Is this possible?
On 15 July 2015 at 18:12, Michel Hubert wrote:
> Hi,
>
> I want to implement a time-out mechanism in the updateStateByKey(…)
> routine.
>
> But is there a way to retrieve the time of the start of the batch
> corresponding to the
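As far as I know, Spark 1.x's updateStateByKey does not expose the batch time directly (dstream.transform((rdd, time) => ...) is one way to capture it), but a common way to expire idle keys is to carry a last-seen timestamp in the state and return None once it ages past a timeout. This is only a sketch: the names (updateWithTimeout, timeoutMs) and the (count, lastSeen) state shape are made up for illustration.

```scala
// Hypothetical state per key: (runningCount, lastSeenMillis).
// Returning None from the update function removes the key from the state,
// which is the usual way to time out idle entries in updateStateByKey.
val timeoutMs = 60000L  // illustrative timeout

def updateWithTimeout(now: Long)(
    newValues: Seq[Long],
    state: Option[(Long, Long)]): Option[(Long, Long)] = {
  if (newValues.nonEmpty) {
    // Fresh data for this key: fold it into the count, refresh the timestamp.
    val prev = state.map(_._1).getOrElse(0L)
    Some((prev + newValues.sum, now))
  } else state match {
    // No new data: keep the entry only while it is younger than the timeout.
    case Some((_, lastSeen)) if now - lastSeen < timeoutMs => state
    case _ => None  // expired (or never existed): drop the key
  }
}

// Wiring into a stream of (key, count) pairs would look roughly like:
// val stateDStream = pairs.updateStateByKey(
//   updateWithTimeout(System.currentTimeMillis()) _)
```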
e before using foreach(println). If the RDD is huge, you
>> definitely don't want to do that.
>>
>> Hope this helps.
>>
>> dean
>>
>> Dean Wampler, Ph.D.
>> Author: Programming Scala, 2nd Edition
>> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> Typesafe <http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com
>
> On Thu, Jul 9, 2015 at 5:56 AM, Anand Nalya wrote:
>
>> The data coming from dstream have the same key
it?
>
> On Thu, Jul 9, 2015 at 3:17 AM, Anand Nalya wrote:
>
Thats from the Streaming tab for Spark 1.4 WebUI.
On 9 July 2015 at 15:35, Michel Hubert wrote:
> Hi,
>
> I was just wondering how you generated the second image with the charts.
>
> What product?
>
> *From:* Anand Nalya [mailto:anand.na...@gmail.com]
Hi,
I have an application in which an RDD is being updated with tuples coming
from RDDs in a DStream with the following pattern.
dstream.foreachRDD(rdd => {
  myRDD = myRDD.union(rdd.filter(myfilter)).reduceByKey(_ + _)
})
I'm using cache() and checkpointing to cache results. Over time, the
lineage o
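The usual fix when the lineage of an accumulated RDD grows without bound in this pattern is to checkpoint it periodically, which truncates its dependency chain. A minimal sketch, with an illustrative checkpointEvery counter that is not from the original message:

```scala
// Checkpoint the accumulated RDD every few batches so its lineage does not
// grow one union + reduceByKey stage per batch forever.
val checkpointEvery = 10  // hypothetical interval; tune to your batch rate
var batchCount = 0

def shouldCheckpoint(batch: Int, every: Int): Boolean =
  batch > 0 && batch % every == 0

// Wiring into the pattern from the message (requires sc.setCheckpointDir(...)
// to have been called first):
//
// dstream.foreachRDD { rdd =>
//   myRDD = myRDD.union(rdd.filter(myfilter)).reduceByKey(_ + _)
//   myRDD.cache()
//   batchCount += 1
//   if (shouldCheckpoint(batchCount, checkpointEvery)) {
//     myRDD.checkpoint()   // cuts the lineage at this RDD
//     myRDD.count()        // an action, so the checkpoint actually gets written
//   }
// }
```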
Hi,
Suppose I have an RDD that is loaded from some file, and I also have a
DStream with data coming from some stream. I want to keep unioning some of
the tuples from the DStream into my RDD. For this I can use something like
this:
var myRDD: RDD[(String, Long)] = sc.fromText...
dstream.f
Hi,
I have an RDD that I want to split into two disjoint RDDs using a boolean
function. I can do this with the following:
val rdd1 = rdd.filter(f)
val rdd2 = rdd.filter(fnot)
I'm assuming that each of the above statements will traverse the RDD once,
thus resulting in 2 passes.
Is there a way of do
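As far as I know, core Spark has no single-pass primitive that yields two RDDs, since a transformation always produces exactly one RDD; the usual advice is to cache() the parent so the second filter at least avoids recomputing the input. A sketch, with the Spark wiring in comments and the disjointness shown on plain Scala collections:

```scala
// Spark wiring (two filter passes, but only one materialization of the input):
//
//   rdd.cache()
//   val rdd1 = rdd.filter(f)
//   val rdd2 = rdd.filter(x => !f(x))   // fnot == the negation of f
//
// The two results are disjoint and together cover the input, as the local
// analogue with Scala collections shows:
val data = Seq(1, 2, 3, 4, 5)
val f = (x: Int) => x % 2 == 0
val (matching, rest) = data.partition(f)
```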
Hi,
I'm using Spark 1.4. I have an array field in my DataFrame, and when I try
to write this DataFrame to Postgres, I get the following
exception:
Exception in thread "main" java.lang.IllegalArgumentException: Can't
translate null value for field
StructField(filter,ArrayType(StringType,fa
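The exception suggests some rows carry a null in the filter array column, which the JDBC writer cannot translate. One hedged workaround is to replace null arrays with empty arrays before writing; the helper name below is made up, and the exact wiring depends on your schema:

```scala
// Hypothetical helper: map a possibly-null array value to an empty array so
// the JDBC translator never sees a null for an ArrayType field.
def fillNullArray(v: Seq[String]): Seq[String] =
  if (v == null) Seq.empty[String] else v

// Wiring sketch: rebuild the rows from the DataFrame's underlying RDD,
// applying fillNullArray to the "filter" column, or simply drop the rows
// where that column is null before writing to Postgres.
```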