I just did a test: even on a single node (local deployment), Spark can
handle data whose size is much larger than the total memory.
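For anyone who wants to reproduce this, the shell can be started with an
explicitly small driver heap, e.g. (flags here are illustrative, not the
exact command used for the run below):

$ spark-shell --master "local[2]" --driver-memory 1g

Spark spills shuffle and aggregation data to local disk (spark.local.dir)
when it does not fit in memory, which is what lets a file this size go
through a 2 GB machine.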
My test VM (2 GB RAM, 2 cores):
$ free -m
               total        used        free      shared  buff/cache   available
Mem:            1992        1845          92          19          54          36
Swap:           1023         285         738
The data size:
$ du -h rate.csv
3.2G rate.csv
Loading this file into Spark and running an aggregation on it completes without error:
scala> val df = spark.read.format("csv").option("inferSchema", true).load("skydrive/rate.csv")
val df: org.apache.spark.sql.DataFrame = [_c0: string, _c1: string ... 2 more fields]
scala> df.printSchema
warning: 1 deprecation (since 2.13.3); for details, enable `:setting -deprecation` or `:replay -deprecation`
root
|-- _c0: string (nullable = true)
|-- _c1: string (nullable = true)
|-- _c2: double (nullable = true)
|-- _c3: integer (nullable = true)
scala> df.groupBy("_c1").agg(avg("_c2").alias("avg_rating")).orderBy(desc("avg_rating")).show
warning: 1 deprecation (since 2.13.3); for details, enable `:setting -deprecation` or `:replay -deprecation`
+----------+----------+
| _c1|avg_rating|
+----------+----------+
|0001360000| 5.0|
|0001711474| 5.0|
|0001360779| 5.0|
|0001006657| 5.0|
|0001361155| 5.0|
|0001018043| 5.0|
|000136118X| 5.0|
|0000202010| 5.0|
|0001371037| 5.0|
|0000401048| 5.0|
|0001371045| 5.0|
|0001203010| 5.0|
|0001381245| 5.0|
|0001048236| 5.0|
|0001436163| 5.0|
|000104897X| 5.0|
|0001437879| 5.0|
|0001056107| 5.0|
|0001468685| 5.0|
|0001061240| 5.0|
+----------+----------+
only showing top 20 rows
So as you can see, Spark handles a file larger than its memory just fine. :)
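As a side note, option("inferSchema", true) makes Spark scan the whole file
an extra time just to guess the column types. If the layout is known, passing
an explicit schema avoids that extra pass; a minimal sketch, keeping the
generated _c0.._c3 names and the types from printSchema above:

import org.apache.spark.sql.types._

// Declare the four columns up front instead of inferring them.
val schema = StructType(Seq(
  StructField("_c0", StringType),
  StructField("_c1", StringType),
  StructField("_c2", DoubleType),
  StructField("_c3", IntegerType)))

val df = spark.read.schema(schema).csv("skydrive/rate.csv")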
Thanks
rajat kumar wrote:
With autoscaling you can have any number of executors.