[jira] [Commented] (SPARK-2650) Wrong initial sizes for in-memory column buffers

2014-08-04 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084397#comment-14084397
 ] 

Cheng Lian commented on SPARK-2650:
---

Did some experiments and came to some conclusions:

# Needless to say, the {{10 * 1024 * 104}} is definitely a typo (see the sketch 
after this list), but it's not related to the OOMs. More reasonable initial 
buffer sizes don't help solve these OOMs.
# The OOMs are also not related to whether the table is larger than the 
available memory. The cause is that building in-memory columnar buffers is 
itself memory consuming, and multiple tasks building buffers in parallel 
together consume too much memory.
# According to 2, reducing parallelism or increasing executor memory can work 
around this issue. For example, a {{HiveThriftServer2}} started with the 
default executor memory (512MB) and {{--total-executor-cores=1}} could cache a 
1.7GB table.
# Shark performs better than Spark SQL in this case, but still OOMs when the 
table gets larger: caching a 1.8GB table with default Shark configurations 
makes Shark OOM too.
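For reference, a minimal sketch of the suspected typo against the apparently 
intended value (the constant names here are illustrative, not the exact Spark 
source):

{code:scala}
// Illustrative constants only; names do not necessarily match the Spark source.
val suspectedTypoSize = 10 * 1024 * 104   // = 1,064,960 bytes, roughly 1MB
val intendedSize      = 10 * 1024 * 1024  // = 10,485,760 bytes, i.e. 10MB
// The typo makes the default initial buffer about 10x smaller than intended,
// which by itself does not explain the OOMs discussed above.
{code}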

I'm investigating why Spark SQL consumes more memory than Shark when building 
in-memory columnar buffers.

 Wrong initial sizes for in-memory column buffers
 

 Key: SPARK-2650
 URL: https://issues.apache.org/jira/browse/SPARK-2650
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.0.0, 1.0.1
Reporter: Michael Armbrust
Assignee: Cheng Lian
Priority: Critical

 The logic for setting up the initial column buffers is different for Spark 
 SQL compared to Shark and I'm seeing OOMs when caching tables that are larger 
 than available memory (where Shark was okay).
 Two suspicious things: the initialSize is always set to 0 so we always go with 
 the default. The default looks like it was copied from code like 10 * 1024 * 
 1024... but in Spark SQL it's 10 * 102 * 1024.






[jira] [Commented] (SPARK-2650) Wrong initial sizes for in-memory column buffers

2014-08-04 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085104#comment-14085104
 ] 

Cheng Lian commented on SPARK-2650:
---

Some additional comments after more experiments and some improvements:

# How exactly the OOMs occur when caching a large table (assume N cores and M 
memory are available within a single executor):
#- Say the table is so large that the underlying RDD is divided into X 
partitions (usually X > N; let's assume that here)
#- When caching the table, N tasks are executed in parallel, each building 
column buffers, which is memory consuming. Say each task consumes Y memory on 
average.
#- At some point the total memory consumption of all N parallel tasks, namely 
N * Y, exceeds the available executor memory, and an OOM is thrown
#- All tasks fail and retry, but fail again, until the driver stops retrying 
and the job fails
#- I guess the reason this issue hasn't been reported before is that M > N * Y 
usually holds in production (see the worked example after this list)
# Initial buffer sizes do affect the OOMs in a subtle way:
#- Too large an initial buffer size obviously implies larger memory consumption
#- Too small an initial buffer size causes the {{ColumnBuilder}} to keep 
allocating larger buffers to ensure enough free space for more elements (12.5% 
larger each time). Growing a buffer thus transiently requires 212.5% of its 
current size (112.5% for the new buffer plus 100% for the original one while 
copying).
#- A well estimated initial buffer size should be 1) large enough to avoid 
buffer growing, and 2) small enough to avoid wasting memory. For example, by 
hand tuning, 5MB proved to be a good initial size for an executor with 512MB 
memory and 1 core.
# An obvious approach to mitigating this issue is to reduce memory consumption 
during the column building process.
#- [PR #1769|https://github.com/apache/spark/pull/1769] has been submitted to 
reduce the memory consumption of the column building process.
# Another approach is to estimate the initial buffer size. To do this, Shark 
estimates the table partition size by leveraging the HDFS block size and the 
default sizes of column elements. We can use a similar approach in Spark SQL 
for Hive tables, and a configurable initial size for non-Hive tables (see the 
sketch after this list).
#- Currently {{InMemoryRelation}} resides in package 
{{org.apache.spark.sql.columnar}} and doesn't know anything about Hive tables. 
We can add an {{estimatedPartitionSize}} method and override it in a new 
{{InMemoryMetastoreRelation}} to estimate RDD partition sizes of a Hive table. 
This will be done in a separate PR.
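
To make the arithmetic in items 1 and 2 concrete, here is a minimal worked 
example. All numbers (cores, per-task consumption, buffer sizes) are 
illustrative assumptions, not measurements:

{code:scala}
object ColumnBufferMemoryMath {
  def main(args: Array[String]): Unit = {
    // Item 1: N parallel column-building tasks can exceed executor memory M.
    // Illustrative numbers only.
    val m = 512L * 1024 * 1024  // M: executor memory, 512MB
    val n = 8                   // N: cores, i.e. tasks building buffers in parallel
    val y = 80L * 1024 * 1024   // Y: assumed average memory consumed per task
    val oomLikely = n * y > m   // 640MB > 512MB, so an OOM is likely
    println(s"N * Y = ${n * y} bytes vs. M = $m bytes, OOM likely: $oomLikely")

    // Item 2: growing a too-small buffer by 12.5% transiently needs 212.5% of
    // its current size (112.5% for the new buffer + 100% for the original).
    val current = 4L * 1024 * 1024         // a 4MB buffer that ran out of room
    val peak    = (current * 2.125).toLong // transient footprint while growing
    println(s"Growing a $current byte buffer transiently needs ~$peak bytes")
  }
}
{code}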
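And a rough sketch of the size-estimation hook from item 4. The type and method 
names follow the comment, but the bodies, the fallback constant, and the sizing 
heuristic are assumptions for illustration, not the eventual implementation:

{code:scala}
// Hypothetical sketch only; not the actual Spark SQL code.
trait InMemoryRelationSketch {
  // Non-Hive relations fall back to some configurable initial size.
  def estimatedPartitionSize: Long = 10L * 1024 * 1024
}

class InMemoryMetastoreRelationSketch(hdfsBlockSize: Long)
  extends InMemoryRelationSketch {
  // Hive tables: one RDD partition roughly corresponds to one HDFS block.
  override def estimatedPartitionSize: Long = hdfsBlockSize
}

object InitialBufferSizing {
  // Derive an initial buffer size for one column: the column's share of the
  // estimated partition size, proportional to its default element size.
  def initialColumnBufferSize(
      relation: InMemoryRelationSketch,
      columnDefaultSize: Long,
      rowDefaultSize: Long): Long =
    relation.estimatedPartitionSize * columnDefaultSize / (rowDefaultSize max 1L)
}
{code}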



