Hi,
Spark does that out of the box for you :)
Transformations like filter are lazy and pipelined, so the filter is applied to each line as it is read — the file is never fully materialized in memory first. Spark also collapses the pipelined execution steps as much as possible.
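To make the pipelining concrete, here is a minimal plain-Scala sketch of the same idea, using an Iterator in place of an RDD (the variable names and sample lines are invented for illustration; real Spark reads HDFS blocks, but the lazy, one-element-at-a-time behavior is analogous):

```scala
// Minimal sketch of lazy, pipelined filtering -- analogous to how Spark
// pipelines textFile + filter without materializing the whole dataset.
var linesRead = 0

// Stand-in for a line-by-line source (think: an HDFS block reader).
val lines: Iterator[String] =
  Iterator("INFO ok", "ERROR boom", "INFO fine", "ERROR again")
    .map { l => linesRead += 1; l }

// filter on an Iterator is lazy: nothing has been read yet.
val errors = lines.filter(_.contains("ERROR"))
assert(linesRead == 0)      // no lines consumed until something pulls on `errors`

val first = errors.next()   // pulls lines one at a time until the first match
assert(first == "ERROR boom")
assert(linesRead == 2)      // only read up to the first ERROR, not all 4 lines

println(s"read $linesRead lines to find the first ERROR")
```

The key point is that filter wraps the source lazily, so each element flows through the whole chain as it is produced instead of the stages running one after another over the full dataset.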
Regards
Mayur

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Wed, Jul 9, 2014 at 3:15 PM, Konstantin Kudryavtsev <
[email protected]> wrote:

> Hi all,
>
> I wondered if you could help me to clarify the next situation:
> in the classic example
>
> val file = spark.textFile("hdfs://...")
> val errors = file.filter(line => line.contains("ERROR"))
>
> As I understand it, the data is first read into memory, and only after
> that is the filter applied. Is there any way to apply the filter during
> the read step, so that not all objects are put into memory?
>
> Thank you,
> Konstantin Kudryavtsev
>
