[ https://issues.apache.org/jira/browse/IGNITE-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718610#comment-16718610 ]
Ray commented on IGNITE-10314:
------------------------------

[~NIzhikov] I have implemented refreshFields using the internal API, after Vladimir confirmed the approach on the dev list. But when the tests in IgniteDataFrameSchemaSpec run, they fail with an odd exception:

Exception in thread "main" java.lang.AssertionError: assertion failed: each serializer expression should contain at least one `BoundReference`
	at scala.Predef$.assert(Predef.scala:170)
	at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:238)
	at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:236)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
	at scala.collection.immutable.List.flatMap(List.scala:355)
	at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.<init>(ExpressionEncoder.scala:236)
	at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:63)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
	at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:428)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:233)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
	at org.apache.ignite.spark.IgniteDataFrameSchemaSpec.beforeAll(IgniteDataFrameSchemaSpec.scala:122)
	at org.scalatest.BeforeAndAfterAll$class.beforeAll(BeforeAndAfterAll.scala:187)
	at org.apache.ignite.spark.AbstractDataFrameSpec.beforeAll(AbstractDataFrameSpec.scala:39)
	at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:253)
	at org.apache.ignite.spark.AbstractDataFrameSpec.org$scalatest$BeforeAndAfter$$super$run(AbstractDataFrameSpec.scala:39)
	at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241)
	at org.apache.ignite.spark.AbstractDataFrameSpec.run(AbstractDataFrameSpec.scala:39)
	at org.scalatest.junit.JUnitRunner.run(JUnitRunner.scala:99)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)

Can you take a look, please? I added a breakpoint at the refreshFields method, and it works fine; the latest fields are present in the map.

> Spark dataframe will get wrong schema if user executes add/drop column DDL
> --------------------------------------------------------------------------
>
>                 Key: IGNITE-10314
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10314
>             Project: Ignite
>          Issue Type: Bug
>          Components: spark
>    Affects Versions: 2.3, 2.4, 2.5, 2.6, 2.7
>            Reporter: Ray
>            Assignee: Ray
>            Priority: Critical
>             Fix For: 2.8
>
> When a user adds or removes a column via DDL, Spark will get the old (wrong) schema.
>
> Analysis
> The Spark DataFrame API currently builds the schema from QueryEntity, but the QueryEntity held by QuerySchema is a local copy of the original, so it is never updated when the schema changes.
>
> Solution
> Fetch the latest schema using the JDBC thin driver's column-metadata call, then update the fields in the QueryEntity.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
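The stale-copy analysis and the proposed fix can be illustrated in miniature. The sketch below uses Python's built-in sqlite3 driver purely as a stand-in for the Ignite JDBC thin driver (no Ignite cluster is assumed): a schema snapshot cached before an ALTER TABLE goes stale, exactly like the copied QueryEntity inside QuerySchema, while re-querying column metadata through the driver returns the fresh column list, which is what refreshFields would copy back into the QueryEntity's fields.

```python
import sqlite3

# In-memory database standing in for an Ignite cluster; the table plays
# the role of a cache-backed SQL table defined via a QueryEntity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, name TEXT)")

def column_names(conn, table):
    # Analogous to the JDBC thin driver's column-metadata call
    # (DatabaseMetaData.getColumns): ask the driver for the table's
    # current columns instead of trusting a cached copy.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# A snapshot taken now is a local copy, like the QueryEntity
# held inside QuerySchema.
stale_copy = column_names(conn, "person")

# User executes add-column DDL.
conn.execute("ALTER TABLE person ADD COLUMN age INTEGER")

# The local copy was never updated...
print(stale_copy)                    # ['id', 'name']
# ...but re-querying driver metadata returns the real schema,
# which is what the proposed fix writes back into the QueryEntity.
print(column_names(conn, "person"))  # ['id', 'name', 'age']
```

The names here (person, column_names) are illustrative only; the actual fix lives in the ignite-spark module and uses Ignite's own driver and internal API.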