).
>>
>> The best solution I found so far (performance wise) was to write a custom
>> UDAF which does the window internally. This was still 8 times lower
>> throughput than batch and required a lot of coding and is not a general
>> solution.
>>
>> I am looking for an approach to improve the performance even more
>> (preferably to either be on par with batch or a relatively low factor
problem is that any attempt to do streaming like this results in
performance that is hundreds of times slower than batch.
Is there a correct way to do such an aggregation on streaming data (using
dataframes rather than RDD operations)?
Assaf.
From: Liang-Chi Hsieh [via Apache Spark Developers List]
[mailto:ml-node+s1001551n20361...@n3.nabble.com]
Sent: Monday, December 26, 2016 5:42 PM
To: Mendelson, Assaf
Subject: Re: Shuffle intermidiate results not being cached
Hi,
Let me quote your example code:
var totalTime: Long = 0
var allDF
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Shuffle-intermidiate-results-not-being-cached-tp20358p20361.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
e with the aggregations
> from there. Instead it seems it reads each dataframe from file all over
> again.
>
> Is this a bug? Am I doing something wrong?
>
>
>
> Thanks.
>
> Assaf.
>
> ----------
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Shuffle-intermidiate-results-not-being-cached-tp20358.html