Hi all,

I wondered if you could help me clarify the following situation:
in the classic example

val file = spark.textFile("hdfs://...")
val errors = file.filter(line => line.contains("ERROR"))

As I understand it, the data is first read into memory, and the filter
is applied after that. Is there any way to apply the filter during the
read step, so that not all objects are held in memory at once?
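To make it concrete, what I mean by "filtering during the read step" is
something like the following plain-Scala sketch (using the standard
library's scala.io.Source, not the Spark API; the file path and object
name are just placeholders):

```scala
import scala.io.Source

object FilterDuringRead {
  // Returns a lazy Iterator over matching lines: each line is read
  // from disk, tested, and handed onward one at a time, so the whole
  // file is never materialized in memory.
  def errorLines(path: String): Iterator[String] =
    Source.fromFile(path).getLines().filter(_.contains("ERROR"))
}
```

Because Iterator is lazy, a line that fails the predicate is discarded
immediately rather than being kept alongside the rest of the file.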

Thank you,
Konstantin Kudryavtsev
