[jira] [Commented] (SPARK-21595) introduction of spark.sql.windowExec.buffer.spill.threshold in spark 2.2 breaks existing workflow
[ https://issues.apache.org/jira/browse/SPARK-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819249#comment-17819249 ]

Ramakrishna commented on SPARK-21595:
-------------------------------------

[~Rakesh_Shah] How did you manage to solve this?

I am getting this in my streaming query, which does aggregations similar to the other streaming queries in the same job. However, it fails and I get:

{"timestamp":"21/02/2024 07:11:35","logLevel":"ERROR","class":"MapOutputTracker","thread":"Executor task launch worker for task 25.0 in stage 2.1 (TID 75)","message":"Missing an output location for shuffle 5 partition 35"}

Can you please help?

[~tejasp] Can you please help?

> introduction of spark.sql.windowExec.buffer.spill.threshold in spark 2.2 breaks existing workflow
> -------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21595
>                 URL: https://issues.apache.org/jira/browse/SPARK-21595
>             Project: Spark
>          Issue Type: Bug
>          Components: Documentation, PySpark
>    Affects Versions: 2.2.0
>         Environment: pyspark on linux
>            Reporter: Stephan Reiling
>            Assignee: Tejas Patil
>            Priority: Minor
>              Labels: documentation, regression
>             Fix For: 2.2.1, 2.3.0
>
>
> My pyspark code has the following statement:
> {code:java}
> # assign row key for tracking
> df = df.withColumn(
>     'association_idx',
>     sqlf.row_number().over(
>         Window.orderBy('uid1', 'uid2')
>     )
> )
> {code}
> where df is a long, skinny (450M rows, 10 columns) dataframe. So this creates one large window for the whole dataframe to sort over.
> In spark 2.1 this works without problem, in spark 2.2 this fails either with out of memory exception or too many open files exception, depending on memory settings (which is what I tried first to fix this).
> Monitoring the blockmgr, I see that spark 2.1 creates 152 files, spark 2.2 creates >110,000 files.
> In the log I see the following messages (110,000 of these):
> {noformat}
> 17/08/01 08:55:37 INFO UnsafeExternalSorter: Spilling data because number of spilledRecords crossed the threshold 4096
> 17/08/01 08:55:37 INFO UnsafeExternalSorter: Thread 156 spilling sort data of 64.1 MB to disk (0 time so far)
> 17/08/01 08:55:37 INFO UnsafeExternalSorter: Spilling data because number of spilledRecords crossed the threshold 4096
> 17/08/01 08:55:37 INFO UnsafeExternalSorter: Thread 156 spilling sort data of 64.1 MB to disk (1 time so far)
> {noformat}
> So I started hunting for clues in UnsafeExternalSorter, without luck. What I had missed was this one message:
> {noformat}
> 17/08/01 08:55:37 INFO ExternalAppendOnlyUnsafeRowArray: Reached spill threshold of 4096 rows, switching to org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter
> {noformat}
> Which allowed me to track down the issue.
> By changing the configuration to include:
> {code:java}
> spark.sql.windowExec.buffer.spill.threshold 2097152
> {code}
> I got it to work again and with the same performance as spark 2.1.
> I have workflows where I use windowing functions that do not fail, but took a performance hit due to the excessive spilling when using the default of 4096.
> I think to make it easier to track down these issues this config variable should be included in the configuration documentation.
> Maybe 4096 is too small of a default value?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
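
The file counts in the report are roughly what a purely row-count-based spill would produce. As a back-of-the-envelope sketch, assuming one spill file each time the threshold number of rows accumulates in a single 450M-row window (the helper name `estimated_spill_files` is made up for illustration, and the estimate ignores everything else Spark writes to the blockmgr directories):

```python
def estimated_spill_files(total_rows: int, threshold_rows: int) -> int:
    """Rough estimate: one spill file per `threshold_rows` rows (ceiling division)."""
    if threshold_rows <= 0:
        raise ValueError("threshold_rows must be positive")
    return (total_rows + threshold_rows - 1) // threshold_rows


TOTAL_ROWS = 450_000_000  # "450M rows" from the report; an approximate figure

# Compare the 2.2.0 default with the value the reporter switched to.
for threshold in (4096, 2_097_152):
    print(threshold, estimated_spill_files(TOTAL_ROWS, threshold))
```

Under these assumptions the estimate is 109,864 files for a threshold of 4096, in line with the ">110,000 files" observed, and 215 files for 2097152, the same order of magnitude as the 152 files seen on spark 2.1.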
[jira] [Commented] (SPARK-21595) introduction of spark.sql.windowExec.buffer.spill.threshold in spark 2.2 breaks existing workflow
[ https://issues.apache.org/jira/browse/SPARK-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088302#comment-17088302 ]

Rakesh Shah commented on SPARK-21595:
-------------------------------------

Hi [~sreiling]

I am also facing the same issue, where my shuffle is taking a long time. Here I am joining two tables in spark, but before this point a window function is being used.

When I debugged, I saw the info below. When I tried to change the property, I was not able to change it; it still shows the same value.

20/04/21 04:15:18 INFO Executor: Finished task 935.0 in stage 43.0 (TID 28873). 26714 bytes result sent to driver
20/04/21 04:18:20 INFO ExternalAppendOnlyUnsafeRowArray: Reached spill threshold of 4096 rows, switching to org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter
20/04/21 04:18:20 INFO ExternalAppendOnlyUnsafeRowArray: Reached spill threshold of 4096 rows, switching to org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter
20/04/21 04:18:20 INFO ExternalAppendOnlyUnsafeRowArray: Reached spill threshold of 4096 rows, switching to org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter
20/04/21 04:20:49 INFO ExternalAppendOnlyUnsafeRowArray: Reached spill threshold of 4096 rows, switching to org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter
20/04/21 04:20:49 INFO ExternalAppendOnlyUnsafeRowArray: Reached spill threshold of 4096 rows, switching to org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter
20/04/21 04:20:49 INFO ExternalAppendOnlyUnsafeRowArray: Reached spill threshold of 4096 rows, switching to org.apache.spark.util.collection.uns

Can you please help me with this?
[jira] [Commented] (SPARK-21595) introduction of spark.sql.windowExec.buffer.spill.threshold in spark 2.2 breaks existing workflow
[ https://issues.apache.org/jira/browse/SPARK-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114694#comment-16114694 ]

Tejas Patil commented on SPARK-21595:
-------------------------------------

[~sreiling] Spilling will happen only when _both_ of these are met:
- The number of records exceeds the limit of the in-memory array. Earlier this was hardcoded to 4096; with the change it will be configurable (default value = 4096).
- There is less memory on the executors due to many consumers OR a spill threshold based on the number of records has been reached (this was always configurable and defaulted to `UnsafeExternalSorter.DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD`).

This is what used to happen in v2.1 and what I am proposing. The second criterion takes care of looking at the actual memory utilization (the spill threshold used to default to a very large value, so typically the memory pressure situation kicks in before that).

Here is a PR for that: https://github.com/apache/spark/pull/18843
Please take a look and share your feedback.
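
The two conditions in the comment above can be written down as a small predicate. This is only a sketch of the rule as described, not Spark's code: the function name, the parameter names and `VERY_LARGE_THRESHOLD` are invented for illustration, and "memory pressure" is reduced to a boolean.

```python
# Stand-in for the "super large" default mentioned in the comment; the real
# constant lives in UnsafeExternalSorter and its value is not assumed here.
VERY_LARGE_THRESHOLD = 10**12


def should_spill(num_rows: int,
                 in_memory_threshold: int,
                 sorter_spill_threshold: int,
                 memory_pressure: bool) -> bool:
    """Return True only when BOTH conditions from the comment hold."""
    # Condition 1: the window buffer has outgrown the in-memory array.
    exceeds_in_memory_array = num_rows > in_memory_threshold
    # Condition 2: executors are short on memory OR the record-count
    # spill threshold of the sorter has been reached.
    sorter_wants_spill = memory_pressure or num_rows >= sorter_spill_threshold
    return exceeds_in_memory_array and sorter_wants_spill


# 5000 rows buffered, no memory pressure, huge sorter threshold: no spill yet.
print(should_spill(5000, 4096, VERY_LARGE_THRESHOLD, memory_pressure=False))
# Same buffer, but the memory manager is under pressure: spill.
print(should_spill(5000, 4096, VERY_LARGE_THRESHOLD, memory_pressure=True))
```

Setting `sorter_spill_threshold` equal to `in_memory_threshold` in this sketch reproduces the eager behavior from the report, where crossing 4096 rows is enough to trigger a spill on its own.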
[jira] [Commented] (SPARK-21595) introduction of spark.sql.windowExec.buffer.spill.threshold in spark 2.2 breaks existing workflow
[ https://issues.apache.org/jira/browse/SPARK-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114255#comment-16114255 ]

Stephan Reiling commented on SPARK-21595:
-----------------------------------------

I have tried out a couple of settings for spark.sql.windowExec.buffer.spill.threshold and have now settled on 4M as the default for it in my workflows. This gives about the same behavior as spark 2.1, but it depends on the amount of spark memory and the size of the rows in the dataframe.

I am not in favor of introducing another threshold for this. If the spilling is delayed but then happens with the low threshold of 4096 rows, in my case this would still spill 110k files to disk and potentially cause a "too many open files" exception (right?).

Just looking at the spilling behavior, it would be better if the value specified the amount of memory rather than the number of rows. So instead of 4096 rows, it would specify 500MB of memory, and then spill chunks of 500MB to disk. How many rows this is would change case by case.
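
The memory-based threshold suggested above can be illustrated with some arithmetic. This is a sketch only: it assumes a fixed average row size (the 100 and 1000 bytes below are invented figures, not from the report), takes 1 MB as 2^20 bytes, and uses hypothetical helper names.

```python
MIB = 1024 * 1024


def rows_per_spill(spill_bytes: int, avg_row_bytes: int) -> int:
    """Number of rows that fit in one spill chunk of `spill_bytes` (at least 1)."""
    if spill_bytes <= 0 or avg_row_bytes <= 0:
        raise ValueError("sizes must be positive")
    return max(1, spill_bytes // avg_row_bytes)


def spill_files(total_rows: int, spill_bytes: int, avg_row_bytes: int) -> int:
    """Estimated spill files when spilling in fixed-size byte chunks."""
    per_spill = rows_per_spill(spill_bytes, avg_row_bytes)
    return (total_rows + per_spill - 1) // per_spill


# 450M rows spilled in 500 MB chunks, for two assumed average row sizes.
for row_bytes in (100, 1000):
    print(row_bytes,
          rows_per_spill(500 * MIB, row_bytes),
          spill_files(450_000_000, 500 * MIB, row_bytes))
```

The point of the sketch is that the row count per spill adapts to the row size while the chunk size on disk stays fixed, which is the "case by case" behavior the comment asks for.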
[jira] [Commented] (SPARK-21595) introduction of spark.sql.windowExec.buffer.spill.threshold in spark 2.2 breaks existing workflow
[ https://issues.apache.org/jira/browse/SPARK-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113622#comment-16113622 ]

Tejas Patil commented on SPARK-21595:
-------------------------------------

[~hvanhovell]: I am fine with either of the options you mentioned.

One more option: right now, both the switch from in-memory to `UnsafeExternalSorter` and `UnsafeExternalSorter` spilling to disk are controlled by a single threshold. If we decouple the two using separate thresholds, the "spill on memory pressure" behavior will be achieved. The in-memory threshold can be kept small, and keeping the spill-to-disk threshold higher will avoid excessive disk spills. This is a fairly simple change to make. What do you think?
[jira] [Commented] (SPARK-21595) introduction of spark.sql.windowExec.buffer.spill.threshold in spark 2.2 breaks existing workflow
[ https://issues.apache.org/jira/browse/SPARK-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113388#comment-16113388 ]

Herman van Hovell commented on SPARK-21595:
-------------------------------------------

The old and the new code are not exactly the same. The old code path would start using a disk-spilling buffer when a window became larger than 4096 rows. The key difference is that the old code path would not start to spill at that point; that would only happen when Spark got pressed for memory and the memory manager started to force spills. The current version is overly active and starts spilling at a much earlier stage. We have seen similar problems with customer workloads on our end.

We either need to set this to a more sensible default, or return to the old behavior.
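
The difference between the two code paths described above can be made concrete with a toy simulation. It is a sketch under strong assumptions, not a model of Spark's memory manager: memory pressure is reduced to "more than `memory_budget_rows` rows are buffered", and all names and numbers are invented for illustration.

```python
def simulate_spills(total_rows: int,
                    row_threshold: int,
                    memory_budget_rows: int,
                    spill_on_row_count: bool) -> int:
    """Count spills while appending `total_rows` rows to a buffer.

    spill_on_row_count=True  -> spill whenever `row_threshold` rows are buffered
                                (the overly eager behavior described above).
    spill_on_row_count=False -> spill only when the assumed memory budget is hit
                                (the older, pressure-driven behavior).
    """
    buffered = 0
    spills = 0
    for _ in range(total_rows):
        buffered += 1
        over_threshold = buffered >= row_threshold
        under_pressure = buffered >= memory_budget_rows
        if (spill_on_row_count and over_threshold) or under_pressure:
            spills += 1
            buffered = 0  # a spill empties the in-memory buffer
    return spills


ROWS = 1_000_000      # scaled-down stand-in for the 450M-row window
BUDGET = 250_000      # assumed number of rows that fit in memory

print("eager:", simulate_spills(ROWS, 4096, BUDGET, spill_on_row_count=True))
print("on pressure:", simulate_spills(ROWS, 4096, BUDGET, spill_on_row_count=False))
```

With these made-up numbers the eager policy spills 244 times and the pressure-driven policy 4 times, which mirrors the gap between the file counts reported for spark 2.2 and spark 2.1.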
[jira] [Commented] (SPARK-21595) introduction of spark.sql.windowExec.buffer.spill.threshold in spark 2.2 breaks existing workflow
[ https://issues.apache.org/jira/browse/SPARK-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110053#comment-16110053 ]

Tejas Patil commented on SPARK-21595:
-------------------------------------

This config was introduced by me in SPARK-13450. The reason 4096 was used is that, before the change, 4096 was the threshold for switching to `UnsafeExternalSorter` (see WindowExec.scala in https://github.com/apache/spark/pull/16909/files). I don't have real workloads which use the WINDOW operator, so I would refrain from proposing a value, but I am open to changing the default to something that works well for everyone.

After you bumped up the config, how many files were generated? I want to know what value would effectively create the same number of files as spark 2.1 did.