on-for-large-lzo-files
>
>
> Mohammed
>
>
> -----Original Message-----
> From: Matt Narrell [mailto:matt.narr...@gmail.com]
> Sent: Tuesday, October 6, 2015 4:08 PM
> To: Mohammed Guller
> Cc: davidkl; user@spark.apache.org
> Subject: Re: laziness in textFile
spark-hadoop-throws-exception-for-large-lzo-files
Mohammed
-----Original Message-----
From: Matt Narrell [mailto:matt.narr...@gmail.com]
Sent: Tuesday, October 6, 2015 4:08 PM
To: Mohammed Guller
Cc: davidkl; user@spark.apache.org
Subject: Re: laziness in textFile reading from HDFS?
Agreed. This is
> save operation, I don't see how caching would help.
>
> Mohammed
>
>
> -----Original Message-----
> From: Matt Narrell [mailto:matt.narr...@gmail.com]
> Sent: Tuesday, October 6, 2015 3:32 PM
> To: Mohammed Guller
> Cc: davidkl; user@spark.apache.org
> Subject: Re: laziness in textFile reading from HDFS?
>
> Is there any more information or best practices here? I have the exact same
> issues when reading large data sets from HDFS (larger than available RAM) and
> I cannot run without setting the RDD persi
Mohammed
>>
>> -----Original Message-----
>> From: davidkl [mailto:davidkl...@hotmail.com]
>> Sent: Monday, September 28, 2015 1:40 AM
>> To: user@spark.apache.org
>> Subject: laziness in textFile reading from HDFS?
>>
>> Hello,
>>
>> I nee
: laziness in textFile reading from HDFS?
Is there any more information or best practices here? I have the exact same
issues when reading large data sets from HDFS (larger than available RAM) and I
cannot run without setting the RDD persistence level to MEMORY_AND_DISK_SER,
and using nearly all the
ad operation is lazy
> 4) It is okay to have more partitions than cores.
>
> Mohammed
>
> -----Original Message-----
> From: davidkl [mailto:davidkl...@hotmail.com]
> Sent: Monday, September 28, 2015 1:40 AM
> To: user@spark.apache.org
> Subje
[mailto:davidkl...@hotmail.com]
Sent: Monday, September 28, 2015 1:40 AM
To: user@spark.apache.org
Subject: laziness in textFile reading from HDFS?
Hello,
I need to process a significant amount of data every day, about 4TB. This will
be processed in batches of about 140GB. The cluster this will
spark-user-list.1001560.n3.nabble.com/laziness-in-textFile-reading-from-HDFS-tp24837.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.or