[jira] [Commented] (SPARK-35635) concurrent insert statements from multiple beeline fail with job aborted exception
[ https://issues.apache.org/jira/browse/SPARK-35635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741309#comment-17741309 ]

jeanlyn commented on SPARK-35635:
---------------------------------

When tasks run concurrently, deletion of the "_temporary" directory is attempted multiple times, which may result in job failure. Would it be appropriate to reopen this issue? [~gurwls223]

> concurrent insert statements from multiple beeline fail with job aborted exception
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-35635
>                 URL: https://issues.apache.org/jira/browse/SPARK-35635
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>         Environment: Spark 3.1.1
>            Reporter: Chetan Bhat
>            Priority: Minor
>
> Create tables -
> CREATE TABLE J1_TBL (
>  i integer,
>  j integer,
>  t string
> ) USING parquet;
> CREATE TABLE J2_TBL (
>  i integer,
>  k integer
> ) USING parquet;
> From 4 concurrent beeline sessions execute the insert into select queries -
> INSERT INTO J1_TBL VALUES (1, 4, 'one');
> INSERT INTO J1_TBL VALUES (2, 3, 'two');
> INSERT INTO J1_TBL VALUES (3, 2, 'three');
> INSERT INTO J1_TBL VALUES (4, 1, 'four');
> INSERT INTO J1_TBL VALUES (5, 0, 'five');
> INSERT INTO J1_TBL VALUES (6, 6, 'six');
> INSERT INTO J1_TBL VALUES (7, 7, 'seven');
> INSERT INTO J1_TBL VALUES (8, 8, 'eight');
> INSERT INTO J1_TBL VALUES (0, NULL, 'zero');
> INSERT INTO J1_TBL VALUES (NULL, NULL, 'null');
> INSERT INTO J1_TBL VALUES (NULL, 0, 'zero');
> INSERT INTO J2_TBL VALUES (1, -1);
> INSERT INTO J2_TBL VALUES (2, 2);
> INSERT INTO J2_TBL VALUES (3, -3);
> INSERT INTO J2_TBL VALUES (2, 4);
> INSERT INTO J2_TBL VALUES (5, -5);
> INSERT INTO J2_TBL VALUES (5, -5);
> INSERT INTO J2_TBL VALUES (0, NULL);
> INSERT INTO J2_TBL VALUES (NULL, NULL);
> INSERT INTO J2_TBL VALUES (NULL, 0);
>
> Issue : concurrent insert statements from multiple beeline fail with job aborted exception.
> 0: jdbc:hive2://10.19.89.222:23040/> INSERT INTO J1_TBL VALUES (8, 8, 'eight');
> Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.SparkException: Job aborted.
>  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:366)
>  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
>  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3$$Lambda$1781/750578465.apply$mcV$sp(Unknown Source)
>  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>  at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
>  at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
>  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:45)
>  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
>  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.spark.SparkException: Job aborted.
>  at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:231)
>  at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:188)
>  at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:109)
>  at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:107)
>  at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:121)
>  at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
>  at
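
The failure mode described in the comment above, where concurrent jobs end up deleting the same "_temporary" directory, can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not Spark's or Hadoop's committer code, it assumes both jobs stage their output under one shared "_temporary" directory inside the table directory, and every name in it (`simulate_shared_temporary_dir`, `stage`, `commit`, `job_a`, `job_b`) is hypothetical.

```python
import os
import shutil
import tempfile


def simulate_shared_temporary_dir():
    """Toy model: two jobs stage output under the same <table>/_temporary."""
    table_dir = tempfile.mkdtemp(prefix="j1_tbl_")
    staging = os.path.join(table_dir, "_temporary")  # shared by both jobs

    def stage(job_id):
        # Each job writes its task output under the shared staging directory.
        job_dir = os.path.join(staging, job_id)
        os.makedirs(job_dir, exist_ok=True)
        path = os.path.join(job_dir, "part-00000")
        with open(path, "w") as f:
            f.write(job_id)
        return path

    def commit(job_id, staged_path):
        # Move the staged file into the table directory, then remove the
        # whole staging directory, including other jobs' in-flight files.
        os.rename(staged_path, os.path.join(table_dir, f"part-{job_id}"))
        shutil.rmtree(staging)

    results = {}
    staged_a = stage("job_a")
    staged_b = stage("job_b")      # job B is still in flight
    commit("job_a", staged_a)      # job A finishes first and cleans up
    results["job_a"] = "committed"
    try:
        commit("job_b", staged_b)  # job B's staged file no longer exists
        results["job_b"] = "committed"
    except FileNotFoundError:
        results["job_b"] = "aborted"

    shutil.rmtree(table_dir, ignore_errors=True)
    return results


if __name__ == "__main__":
    print(simulate_shared_temporary_dir())
```

In this model the first job to commit removes the staging area that the second job still depends on, so the second commit fails. Whether this matches the exact code path behind the "Job aborted" trace quoted above is not established in this thread.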
[jira] [Commented] (SPARK-35635) concurrent insert statements from multiple beeline fail with job aborted exception
[ https://issues.apache.org/jira/browse/SPARK-35635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741296#comment-17741296 ]

jeanlyn commented on SPARK-35635:
---------------------------------

We encounter the same issue when writing concurrently to different partitions of the same table.
[jira] [Commented] (SPARK-35635) concurrent insert statements from multiple beeline fail with job aborted exception
[ https://issues.apache.org/jira/browse/SPARK-35635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358356#comment-17358356 ]

Chetan Bhat commented on SPARK-35635:
-------------------------------------

Yes, that's the issue. It has to be handled by the system during concurrent query execution.