AnishMahto commented on code in PR #56255:
URL: https://github.com/apache/spark/pull/56255#discussion_r3345189596
##########
sql/pipelines/src/main/scala/org/apache/spark/sql/pipelines/graph/FlowExecution.scala:
##########
@@ -461,6 +464,12 @@ trait AutoCdcMergeWriteBase {
}
}
+ /**
+ * Returns the resolved AutoCDC key column names as they appear in the
auxiliary schema, in
+ * `changeArgs.keys` declaration order.
+ */
+ private def auxiliaryKeyColumnNames: Seq[String] =
expectedAuxiliaryKeyFields.map(_.name)
Review Comment:
optional: We can make this a lazy val instead, I could've actually done that
in the original PR.
##########
sql/pipelines/src/test/scala/org/apache/spark/sql/pipelines/graph/AutoCdcScd1AuxiliaryTableDurabilitySuite.scala:
##########
@@ -52,18 +52,11 @@ class AutoCdcScd1AuxiliaryTableDurabilitySuite
// resume cleanly.
val changeDataFeedStream = MemoryStream[(Int, String, Long)]
def buildGraphRegistrationContext(): TestGraphRegistrationContext =
- new TestGraphRegistrationContext(spark) {
- registerTable("target", catalog = Some(catalog), database =
Some(namespace))
- registerFlow(autoCdcFlow(
- name = "auto_cdc_flow",
- target = "target",
- query = dfFlowFunc(
- changeDataFeedStream.toDF().toDF("id", "name", "version")
- ),
- keys = Seq("id"),
- sequencing = functions.col("version")
- ))
- }
+ singleAutoCdcFlowPipeline(
+ "auto_cdc_flow",
+ "target",
+ changeDataFeedStream.toDF().toDF("id", "name", "version"),
+ Seq("id"))
Review Comment:
optional: I like the `singleAutoCdcFlowPipeline` abstraction but I would
lean towards using named arguments when invoking here and below, and also avoid
supplying default argument for `sequencing` in `singleAutoCdcFlowPipeline`.
It's not obvious to readers what the string literals mean otherwise, and I
actually prefer being explicit about the the sequence per-test (at the cost of
redundancy). I'm of the opinion that tests should read declaratively.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]