sunriseXDM opened a new issue, #30222:
URL: https://github.com/apache/doris/issues/30222
Environment:
Spark 2.3
Doris 2.0.3
connector: spark-doris-connector-2.3_2.11-1.2.0
Error output:
24/01/19 14:52:20 INFO BackendClient: Success connect to Doris BE{host='192.168.1.11', port=9060}.
24/01/19 14:52:20 INFO BackendClient: Success connect to Doris BE{host='192.168.1.11', port=9060}.
24/01/19 14:52:20 ERROR RowBatch: Schema size '1' is not equal to arrow field size '2'.
24/01/19 14:52:20 ERROR RowBatch: Schema size '1' is not equal to arrow field size '2'.
24/01/19 14:52:20 ERROR RowBatch: Read Doris Data failed because:
org.apache.doris.spark.exception.DorisException: Load Doris data failed, schema size of fetch data is wrong.
at org.apache.doris.spark.serialization.RowBatch.<init>(RowBatch.java:99)
at org.apache.doris.spark.rdd.ScalaValueReader.hasNext(ScalaValueReader.scala:210)
at org.apache.doris.spark.rdd.AbstractDorisRDDIterator.hasNext(AbstractDorisRDDIterator.scala:56)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage10.agg_doAggregateWithKeys_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage10.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
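For readers unfamiliar with the message, the log suggests the reader compares the number of columns in the expected schema with the number of fields in the returned Arrow batch and fails when they differ. The following is only an illustrative sketch of that kind of check, written in Python; it is not the connector's actual code, and `SchemaMismatchError` and `check_schema_size` are hypothetical names introduced here.

```python
class SchemaMismatchError(Exception):
    """Hypothetical stand-in for the connector's DorisException."""


def check_schema_size(schema_columns, arrow_fields):
    """Raise if the expected schema and the Arrow batch disagree on column count.

    Both arguments are plain lists of column names (an assumption made for
    this sketch; the real connector works with its own schema objects).
    """
    if len(schema_columns) != len(arrow_fields):
        raise SchemaMismatchError(
            "Schema size '%d' is not equal to arrow field size '%d'."
            % (len(schema_columns), len(arrow_fields))
        )


if __name__ == "__main__":
    # One expected column but two Arrow fields, as in the log above.
    try:
        check_schema_size(["c1"], ["c1", "c2"])
    except SchemaMismatchError as exc:
        print(exc)
```

With one expected column and two Arrow fields, the sketch raises an error with the same wording as the `RowBatch` log line.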
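While investigating, I found on GitHub that doris-flink-connector 1.5.1 fixed this problem. Link: https://github.com/apache/doris-flink-connector/pull/261
Does spark-doris-connector have the same problem? Our production environment, which uses Spark 2 to load data into Doris 2.0.2, does not have this problem.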
_Originally posted by @sunriseXDM in
https://github.com/apache/doris/discussions/30134_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]