bersprockets commented on PR #42206: URL: https://github.com/apache/spark/pull/42206#issuecomment-1661030654
While I have not yet looked at the solution provided here, I can confirm that the bug does indeed exist on the master branch. I was able to reproduce it, but I had to add some debug statements for the timing to be right. The debug statements don't change any logic or access any data structures (other than the thread name), so they change only the timing, not the behavior. In more complex jobs with a lot going on in the JVM, the timing might be right without debug statements.

To get the SIGSEGV, the WindowExec iterator needs a `nextRow` value (called X in the PR description) whose internal fields point into off-heap memory held (indirectly) by `SortExec`. When the generated SMJ code calls `cleanUpResources` because the right side is empty, the off-heap memory held (indirectly) by `SortExec` gets released. Later, when the `WindowExec` iterator calls `buffer.add(nextRow)`, the buffer tries to copy data from the cleaned-up off-heap page pointed at by `nextRow`. If you're lucky (which you usually are), the memory is still part of the JVM's process, so there's no crash. However, if that memory has been removed from the JVM's process, you get the SIGSEGV:
```
bash-3.2$ bin/spark-sql --driver-memory 450m --conf spark.memory.offHeap.enabled=true --conf spark.memory.offHeap.size=3m --master "local[2]" --conf spark.sql.shuffle.partitions=2
...
spark-sql (default)> create or replace temp view leftside as select a, b, c, sum(c) over(partition by a order by b range between 2 preceding and current row) as sumc from t1;
Time taken: 1.692 seconds
spark-sql (default)> create or replace temp view rightside as select a, b, c, sum(c) over(partition by a order by b range between 2 preceding and current row) as sumc from t2;
Time taken: 0.082 seconds
spark-sql (default)> set spark.sql.adaptive.enabled=false;
spark.sql.adaptive.enabled	false
Time taken: 0.11 seconds, Fetched 1 row(s)
spark-sql (default)> set spark.sql.autoBroadcastJoinThreshold=-1;
spark.sql.autoBroadcastJoinThreshold	-1
Time taken: 0.018 seconds, Fetched 1 row(s)
spark-sql (default)> create or replace temp view joined as select l.a, l.sumc, r.a as ra, r.sumc as rsumc from leftside l join rightside r on l.a = r.a;
Time taken: 0.178 seconds
spark-sql (default)> drop table if exists myoutput;
create table myoutput stored as parquet as select * from joined;
23/08/01 12:50:00 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Time taken: 0.571 seconds
23/08/01 12:50:00 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
23/08/01 12:50:01 WARN HiveConf: HiveConf of name hive.internal.ss.authz.settings.applied.marker does not exist
23/08/01 12:50:01 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
23/08/01 12:50:01 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
23/08/01 12:50:01 WARN HiveMetaStore: Location: file:/Users/bruce/github/spark_fork_smj_issue/spark-warehouse/myoutput specified for non-external table:myoutput
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) &&& sorter cleanupResources
Executor task launch worker for task 0.0 in stage 2.0 (TID 2) &&& sorter cleanupResources
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) >>> In processRows
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) ^^^ WindowExec fetching next partition
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) %%% Adding row to buffer
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) ^^^ WindowExec getting next from buffer
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) +++ Getting next streamed row
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) +++ Returning false
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) >>> Performing eager cleanup
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) &&& sorter cleanupResources
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) &&& sorter cleanupResources
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) >>> In processRows
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) ^^^ WindowExec fetching next partition
Executor task launch worker for task 1.0 in stage 2.0 (TID 3) %%% Adding row to buffer
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x000000010f4c19a0, pid=96472, tid=52259
#
# JRE version: Java(TM) SE Runtime Environment 18.9 (11.0.12+8) (build 11.0.12+8-LTS-237)
# Java VM: Java HotSpot(TM) 64-Bit Server VM 18.9 (11.0.12+8-LTS-237, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# V  [libjvm.dylib+0x1329a0]  acl_CopyRight+0x29
# ...
```
The error report stack trace looks slightly different from the one in the PR description because I am running this on the master branch (which uses `WindowEvaluatorFactory`), but it dies in the same place (`ExternalAppendOnlyUnsafeRowArray.add`):
```
Stack: [0x0000700010666000,0x0000700010766000],  sp=0x0000700010764308,  free space=1016k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x1329a0]  acl_CopyRight+0x29
J 4228 jdk.internal.misc.Unsafe.copyMemory0(Ljava/lang/Object;JLjava/lang/Object;JJ)V java.base@11.0.12 (0 bytes) @ 0x00000001249c88b4 [0x00000001249c87c0+0x00000000000000f4]
J 4226 c1 jdk.internal.misc.Unsafe.copyMemory(Ljava/lang/Object;JLjava/lang/Object;JJ)V java.base@11.0.12 (33 bytes) @ 0x000000011da6e92c [0x000000011da6e0e0+0x000000000000084c]
j  sun.misc.Unsafe.copyMemory(Ljava/lang/Object;JLjava/lang/Object;JJ)V+11 jdk.unsupported@11.0.12
j  org.apache.spark.unsafe.Platform.copyMemory(Ljava/lang/Object;JLjava/lang/Object;JJ)V+34
j  org.apache.spark.sql.catalyst.expressions.UnsafeRow.copy()Lorg/apache/spark/sql/catalyst/expressions/UnsafeRow;+37
j  org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray.add(Lorg/apache/spark/sql/catalyst/expressions/UnsafeRow;)V+48
j  org.apache.spark.sql.execution.window.WindowEvaluatorFactory$WindowPartitionEvaluator$$anon$1.fetchNextPartition()V+55
j  org.apache.spark.sql.execution.window.WindowEvaluatorFactory$WindowPartitionEvaluator$$anon$1.next()Lorg/apache/spark/sql/catalyst/InternalRow;+61
j  org.apache.spark.sql.execution.window.WindowEvaluatorFactory$WindowPartitionEvaluator$$anon$1.next()Ljava/lang/Object;+1
j  org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext()V+7
j  org.apache.spark.sql.execution.BufferedRowIterator.hasNext()Z+11
j  org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext()Z+4
j  org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage7.smj_findNextJoinRows_0$(Lorg/apache/spark/sql/catalyst/expressions/GeneratedClass$GeneratedIteratorForCodegenStage7;Lscala/collection/Iterator;Lscala/collection/Iterator;)Z+11
j  org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage7.processNext()V+412
j  org.apache.spark.sql.execution.BufferedRowIterator.hasNext()Z+11
j  org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext()Z+4
j  org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithIterator(Lscala/collection/Iterator;)V+3
j  org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(Lorg/apache/spark/sql/execution/datasources/FileFormatDataWriter;Lscala/collection/Iterator;)Lorg/apache/spark/sql/execution/datasources/WriteTaskResult;+2
...
```
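To make the lifecycle hazard concrete, here is a self-contained Scala sketch (hypothetical code, not Spark's actual classes): a cached row view that does not own its backing memory is valid only while the owner keeps that memory alive, so copying through the view after `cleanupResources` reads whatever now occupies the page. `RowView`, `Sorter`, and `DanglingRowSketch` are all made-up names for illustration.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for an UnsafeRow: a view into memory owned elsewhere.
final class RowView(backing: ByteBuffer, offset: Int, len: Int) {
  // Analogous to UnsafeRow.copy(): materializes the bytes. Only safe while
  // the backing page is still valid.
  def copyBytes(): List[Int] =
    (0 until len).map(i => backing.get(offset + i).toInt).toList
}

// Hypothetical stand-in for the sorter that owns the page.
final class Sorter {
  private val page = ByteBuffer.allocate(8)
  (0 until 8).foreach(i => page.put(i, (i + 1).toByte))

  def currentRow(): RowView = new RowView(page, 0, 4)

  // Simulates cleanupResources(): the page goes back to the memory manager,
  // which may reuse it (modelled here by zeroing) or unmap it entirely.
  def cleanupResources(): Unit = (0 until 8).foreach(i => page.put(i, 0.toByte))
}

object DanglingRowSketch {
  def run(): (List[Int], List[Int]) = {
    val sorter  = new Sorter
    val nextRow = sorter.currentRow() // the iterator caches a view ("X")
    val before  = nextRow.copyBytes() // copy while the page is alive: fine
    sorter.cleanupResources()         // eager SMJ cleanup releases the page
    val after   = nextRow.copyBytes() // stale view: reads reused memory
    (before, after)
  }

  def main(args: Array[String]): Unit = {
    val (before, after) = run()
    println(s"before=$before after=$after")
    // before=List(1, 2, 3, 4) after=List(0, 0, 0, 0)
  }
}
```

In the real code the page is off-heap, so the stale read is not a deterministic zero: if the OS still maps the address, `buffer.add(nextRow)` silently copies whatever bytes are there; if it doesn't, you get the SIGSEGV shown above.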
