[ https://issues.apache.org/jira/browse/SPARK-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218900#comment-15218900 ]
Paul Zaczkieiwcz commented on SPARK-13286:
------------------------------------------
I'm seeing this in production code that used to work in Spark 1.5.1. As I
understand it, Spark 1.6.1 switched its JDBC output from issuing individual
INSERTs to batching 1000 INSERTs at a time. I can't even reproduce the error by
running the same INSERT statement directly in psql (I'm outputting to
Postgres). I'm at a loss as to what could be causing this, particularly with
SaveMode.Overwrite. There isn't any possibility of a schema mismatch, unless
a race condition is causing the workers to try to insert into the table
before it exists.
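
For anyone who wants to see the hidden cause in the meantime, here is a minimal sketch of walking the chain that the BatchUpdateException message points to. It is not Spark's code and not a proposed patch: the class name BatchErrorChain and the method chainedMessages are hypothetical, and the failure in main is simulated with a made-up message so that the example runs without a database. Only the java.sql calls (getNextException, setNextException) are the real JDBC API.

```java
import java.sql.BatchUpdateException;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class BatchErrorChain {

    /**
     * Collects the message of the given exception followed by the messages
     * of every exception linked to it through getNextException().
     */
    public static List<String> chainedMessages(SQLException first) {
        List<String> messages = new ArrayList<>();
        for (SQLException e = first; e != null; e = e.getNextException()) {
            messages.add(e.getMessage());
        }
        return messages;
    }

    public static void main(String[] args) {
        // Simulated failure: both messages are invented for illustration only.
        BatchUpdateException batch = new BatchUpdateException(
                "Batch entry 0 was aborted. Call getNextException to see the cause.",
                new int[0]);
        batch.setNextException(new SQLException("simulated underlying SQL error"));

        // Print the outer batch error first, then the chained cause(s).
        for (String message : chainedMessages(batch)) {
            System.out.println(message);
        }
    }
}
```

In a real job, the same loop would go in a catch block around the failing write, assuming the SQLException can be reached from the exception that Spark rethrows.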
> JDBC driver doesn't report full exception
> -----------------------------------------
>
> Key: SPARK-13286
> URL: https://issues.apache.org/jira/browse/SPARK-13286
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.6.0
> Reporter: Adrian Bridgett
> Priority: Minor
>
> When testing some failure scenarios (inserting data into PostgreSQL where there is
> a schema mismatch), an exception is thrown (fine so far), but it
> doesn't report the actual SQL error. It refers to a getNextException call,
> but this is beyond my non-existent Java skills to deal with correctly.
> Supporting this would help users see the SQL error quickly and resolve the
> underlying problem.
> {noformat}
> Caused by: java.sql.BatchUpdateException: Batch entry 0 INSERT INTO core VALUES('5fdf5...',....) was aborted. Call getNextException to see the cause.
>   at org.postgresql.jdbc2.AbstractJdbc2Statement$BatchResultHandler.handleError(AbstractJdbc2Statement.java:2746)
>   at org.postgresql.core.v3.QueryExecutorImpl$1.handleError(QueryExecutorImpl.java:457)
>   at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1887)
>   at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:405)
>   at org.postgresql.jdbc2.AbstractJdbc2Statement.executeBatch(AbstractJdbc2Statement.java:2893)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:185)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:248)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:247)
>   at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$33.apply(RDD.scala:920)
>   at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$33.apply(RDD.scala:920)
>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}