Tom - sorry for the delay. If you try OpenJDK (on a smaller heap), do you
see the same problem? Would be great to isolate whether the problem is
related to J9 or not. In either case we should fix it though.
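
A quick way to test that (just a sketch; the OpenJDK path, heap sizes, and
application jar below are placeholders for your setup) is to point JAVA_HOME
at an OpenJDK install and pass a smaller heap explicitly to spark-submit:

  export JAVA_HOME=/path/to/openjdk
  $SPARK_HOME/bin/spark-submit \
    --driver-memory 64g \
    --executor-memory 64g \
    --class your.benchmark.TeraSort your-app.jar

If the excessive spilling goes away under OpenJDK with the same workload,
that would point at something J9-specific.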

On Fri, Mar 13, 2015 at 9:33 AM, Tom Hubregtsen <thubregt...@gmail.com>
wrote:

> I use the spark-submit script and the config files in a conf directory. I
> see the memory settings reflected in the stdout, as well as in the web UI
> (it prints all variables from spark-defaults.conf, and mentions I have 540GB
> of free memory available when trying to store a broadcast variable or RDD). I
> also ran "ps -aux | grep java | grep th", which shows me that I called java
> with "-Xms1000g -Xmx1000g".
>
> I also tested whether these numbers are realistic for the J9 JVM. Outside of
> Spark, setting just the initial heap size (-Xms) gives an error, but if I
> also set the maximum heap size (-Xmx) along with it, the JVM appears to
> accept it. Also, in IBM's J9 Health Center, I can see it reserving the
> 900g and using up to 68g.
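>
> (A minimal way to sanity-check those flags outside Spark, assuming IBM's
> java is the one on the PATH, is simply:
>
>   java -Xms900g -Xmx900g -version
>
> which should start the VM with the requested heap and just print the
> version.)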
>
> Thanks,
>
> Tom
>
> On 13 March 2015 at 02:05, Reynold Xin <r...@databricks.com> wrote:
>
>> How did you run the Spark command? Maybe the memory settings didn't
>> actually apply? How much memory does the web UI say is available?
>>
>> BTW - I don't think any JVM can actually handle a 700G heap ... (maybe
>> Zing).
>>
>> On Thu, Mar 12, 2015 at 4:09 PM, Tom Hubregtsen <thubregt...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I'm running the teraSort benchmark with a relatively small input set: 5GB.
>>> During profiling, I can see that I am using a total of 68GB. I've got a
>>> terabyte of memory in my system, and have set:
>>>   spark.executor.memory 900g
>>>   spark.driver.memory 900g
>>> I use the defaults for:
>>>   spark.shuffle.memoryFraction
>>>   spark.storage.memoryFraction
>>> I believe that I now have 0.2*900 = 180GB for shuffle and 0.6*900 = 540GB
>>> for storage.
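>>>
>>> For reference, written out as explicit entries in spark-defaults.conf
>>> (these are the defaults, so the lines below should be a no-op):
>>>
>>>   # shuffle: 0.2 * 900g = 180g
>>>   spark.shuffle.memoryFraction 0.2
>>>   # storage: 0.6 * 900g = 540g
>>>   spark.storage.memoryFraction 0.6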
>>>
>>> I noticed a lot of variation in runtime (under the same load), and tracked
>>> this down to this function in
>>> core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala:
>>>
>>>   private def spillToPartitionFiles(
>>>       collection: SizeTrackingPairCollection[(Int, K), C]): Unit = {
>>>     spillToPartitionFiles(collection.iterator)
>>>   }
>>>
>>> In a slow run it would loop through this function 12000 times; in a fast
>>> run only 700 times, even though the settings in both runs are the same and
>>> there are no other users on the system. When I look at the function calling
>>> this (insertAll, also in ExternalSorter), I see that spillToPartitionFiles
>>> is only called 700 times in both fast and slow runs, meaning that the
>>> function recursively calls itself very often. Because of the function name,
>>> I assume the system is spilling to disk. As I have sufficient memory, I
>>> assume that I forgot to set a certain memory setting. Does anybody have any
>>> idea which other setting I need to set in order to not spill data in this
>>> scenario?
>>>
>>> Thanks,
>>>
>>> Tom
>>>
>>>
>>>
>>>
>>
>
