szehon-ho commented on code in PR #56255:
URL: https://github.com/apache/spark/pull/56255#discussion_r3351038918
##########
sql/pipelines/src/test/scala/org/apache/spark/sql/pipelines/graph/AutoCdcScd1AuxiliaryTableDurabilitySuite.scala:
##########
@@ -52,18 +52,11 @@ class AutoCdcScd1AuxiliaryTableDurabilitySuite
// resume cleanly.
val changeDataFeedStream = MemoryStream[(Int, String, Long)]
def buildGraphRegistrationContext(): TestGraphRegistrationContext =
- new TestGraphRegistrationContext(spark) {
- registerTable("target", catalog = Some(catalog), database = Some(namespace))
- registerFlow(autoCdcFlow(
- name = "auto_cdc_flow",
- target = "target",
- query = dfFlowFunc(
- changeDataFeedStream.toDF().toDF("id", "name", "version")
- ),
- keys = Seq("id"),
- sequencing = functions.col("version")
- ))
- }
+ singleAutoCdcFlowPipeline(
+ "auto_cdc_flow",
+ "target",
+ changeDataFeedStream.toDF().toDF("id", "name", "version"),
+ Seq("id"))
Review Comment:
Good call, done. I removed the default `sequencing` argument from
`singleAutoCdcFlowPipeline` so each test declares its sequencing column
explicitly, and switched all call sites across the AutoCDC SCD1 E2E suites to
named arguments so each literal is labeled at the call site.
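As a rough illustration of the call-site shape this produces, here is a
minimal sketch. `AutoCdcFlowSpec` and the `String`-typed parameters are
stand-ins assumed for this example so it runs without Spark; the real helper's
signature is not shown in this thread and may differ.

```scala
object NamedArgsSketch {
  // Hypothetical stand-in for whatever the real helper builds.
  final case class AutoCdcFlowSpec(
      name: String,
      target: String,
      keys: Seq[String],
      sequencing: String)

  // Assumed simplified signature. `sequencing` has no default value, so a
  // call that omits it fails to compile instead of silently picking one.
  def singleAutoCdcFlowPipeline(
      name: String,
      target: String,
      keys: Seq[String],
      sequencing: String): AutoCdcFlowSpec =
    AutoCdcFlowSpec(name, target, keys, sequencing)

  // Named arguments label every literal at the call site.
  val spec: AutoCdcFlowSpec = singleAutoCdcFlowPipeline(
    name = "auto_cdc_flow",
    target = "target",
    keys = Seq("id"),
    sequencing = "version")

  def main(args: Array[String]): Unit =
    println(spec)
}
```

With two adjacent `String` parameters (`name` and `target`), named arguments
also guard against accidentally swapping them.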
##########
sql/pipelines/src/main/scala/org/apache/spark/sql/pipelines/graph/FlowExecution.scala:
##########
@@ -461,6 +464,12 @@ trait AutoCdcMergeWriteBase {
}
}
+ /**
+ * Returns the resolved AutoCDC key column names as they appear in the auxiliary schema, in
+ * `changeArgs.keys` declaration order.
+ */
+ private def auxiliaryKeyColumnNames: Seq[String] = expectedAuxiliaryKeyFields.map(_.name)
Review Comment:
Done. I made both `auxiliaryKeyColumnNames` and the underlying
`expectedAuxiliaryKeyFields` lazy vals so the resolution is computed once per
instance.
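For readers less familiar with the difference, a minimal sketch of the
`def` to `lazy val` change follows. `MergeWriter`, `Field`, and the
`resolutions` counter are hypothetical names assumed for this example, not the
actual `AutoCdcMergeWriteBase` code; the counter exists only to make the
caching observable.

```scala
object LazyValSketch {
  // Hypothetical stand-in for a resolved schema field.
  final case class Field(name: String)

  class MergeWriter(keys: Seq[String]) {
    // Counts how many times key resolution actually runs.
    var resolutions: Int = 0

    // Evaluated on first access, then cached for this instance.
    lazy val expectedAuxiliaryKeyFields: Seq[Field] = {
      resolutions += 1
      keys.map(Field(_))
    }

    // Also cached, and preserves the declaration order of `keys`.
    lazy val auxiliaryKeyColumnNames: Seq[String] =
      expectedAuxiliaryKeyFields.map(_.name)
  }

  def main(args: Array[String]): Unit = {
    val writer = new MergeWriter(Seq("id", "region"))
    writer.auxiliaryKeyColumnNames
    writer.auxiliaryKeyColumnNames
    writer.expectedAuxiliaryKeyFields
    println(writer.resolutions) // prints 1
  }
}
```

Had both members been `def`s, every access would re-run the resolution, so the
counter would read 3 instead of 1.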
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]