This is an automated email from the ASF dual-hosted git repository.
github-merge-queue[bot] pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/texera.git
The following commit(s) were added to refs/heads/main by this push:
new 357fed0076 test(workflow-core): add unit test coverage for
VirtualCollection + ReadonlyVirtualDocument (#5740)
357fed0076 is described below
commit 357fed00765169b280136803861fb1f59ec6589d
Author: Xinyuan Lin <[email protected]>
AuthorDate: Sat Jun 20 16:14:48 2026 -0700
test(workflow-core): add unit test coverage for VirtualCollection +
ReadonlyVirtualDocument (#5740)
### What changes were proposed in this PR?
Pin the abstract-base / trait contracts for two storage-model classes in
`common/workflow-core/core/storage/model/`. Tested via minimal in-memory
test-only concrete subclasses (no temp-file harness needed). No
production-code changes.
| Spec | Source class | Tests |
| --- | --- | --- |
| `VirtualCollectionSpec` | `VirtualCollection` (abstract class) | 7 |
| `ReadonlyVirtualDocumentSpec` | `ReadonlyVirtualDocument[T]` (trait) |
11 |
Both spec files follow the `<srcClassName>Spec.scala` one-to-one
convention.
**Behavior pinned — `VirtualCollection`**
| Surface | Contract |
| --- | --- |
| Abstract members (`getURI` / `getDocuments` / `getDocument` /
`remove`) | pinned via a minimal `LinkedHashMap`-backed test-only
subclass |
| `getDocuments` after two `addChild` calls | yields both documents in
insertion order (URI sequence pinned to catch a `HashMap`-backed
regression) |
| `getDocument(name)` for a registered name | returns the same
`VirtualDocument[_]` reference (no copy) |
| `getDocument(name)` for an unregistered name | impl-defined throw
(stub raises `NoSuchElementException`) |
| `remove()` | clears the child set (impl-defined side effect; mirrors
production "delete folder" semantics) |
| Pattern matching | `case _: VirtualCollection` matches a concrete
impl; does NOT match an unrelated type (`String`) |
**Behavior pinned — `ReadonlyVirtualDocument[T]`**
| Surface | Contract |
| --- | --- |
| Eight abstract methods (`getURI` / `getItem` / `get` / `getRange` /
`getAfter` / `getCount` / `asInputStream` / `asFile`) | each is
observable through a stub concrete subclass returning sentinel values |
| `getRange(from, until)` | half-open interval — `until` is exclusive |
| `getRange` call-site | `columns: Option[Seq[String]] = None` default
lets the third argument be omitted |
| `getAfter(offset)` | skips the item AT `offset` and yields every
strictly-later item |
| `asInputStream()` | non-null `InputStream` reading the impl-supplied
bytes (read-1 / read-2 / EOF round-trip) |
| `asFile()` | non-null `File` |
| `getCount` | typed as `Long` (compile-time enforced) |
| Type parameter `T` | preserved across `getItem` / `get` / `getRange` /
`getAfter` (verified with `T = Int`) |
| Pattern matching | `case _: ReadonlyVirtualDocument[_]` matches a
concrete impl |
### Notes
`ReadonlyLocalFileDocument` (the concrete `private[storage]` impl of
`ReadonlyVirtualDocument[Nothing]`) is **deliberately out of scope** for
this PR — an existing spec at
`amber/src/test/scala/org/apache/texera/workflow/common/storage/ReadonlyLocalFileDocumentSpec.scala`
already pins its URI / `asInputStream` round-trip / `asFile` / five
`NotImplementedError` cases via the
`DocumentFactory.openReadonlyDocument(URI)` factory. Adding a parallel
spec here would be the duplicate-spec failure mode the spec-filename
convention is designed to prevent.
### Any related issues, documentation, discussions?
Closes #5736.
### How was this PR tested?
Pure unit-test additions; verified locally with:
- `sbt "WorkflowCore/testOnly
org.apache.texera.amber.core.storage.model.VirtualCollectionSpec
org.apache.texera.amber.core.storage.model.ReadonlyVirtualDocumentSpec"`
— 18 tests, all green
- `sbt scalafmtCheckAll` — clean
- CI to confirm
### Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.7 [1M context])
---
.../model/ReadonlyVirtualDocumentSpec.scala | 167 +++++++++++++++++++++
.../core/storage/model/VirtualCollectionSpec.scala | 151 +++++++++++++++++++
2 files changed, 318 insertions(+)
diff --git
a/common/workflow-core/src/test/scala/org/apache/texera/amber/core/storage/model/ReadonlyVirtualDocumentSpec.scala
b/common/workflow-core/src/test/scala/org/apache/texera/amber/core/storage/model/ReadonlyVirtualDocumentSpec.scala
new file mode 100644
index 0000000000..ee0beaa758
--- /dev/null
+++
b/common/workflow-core/src/test/scala/org/apache/texera/amber/core/storage/model/ReadonlyVirtualDocumentSpec.scala
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.texera.amber.core.storage.model
+
+import org.scalatest.flatspec.AnyFlatSpec
+
+import java.io.{ByteArrayInputStream, File, InputStream}
+import java.net.URI
+
+class ReadonlyVirtualDocumentSpec extends AnyFlatSpec {
+
+ //
---------------------------------------------------------------------------
+ // Stub impl — provides one sentinel value per accessor so each abstract
+ // method can be exercised in isolation, plus an int-typed variant so
+ // the type parameter `T` is observed end-to-end.
+ //
---------------------------------------------------------------------------
+
+ private class StubReadonlyIntDoc(items: IndexedSeq[Int]) extends
ReadonlyVirtualDocument[Int] {
+ override def getURI: URI = new URI("file:///stub/int")
+ override def getItem(i: Int): Int = items(i)
+ override def get(): Iterator[Int] = items.iterator
+ override def getRange(from: Int, until: Int, columns:
Option[Seq[String]]): Iterator[Int] =
+ items.slice(from, until).iterator
+ override def getAfter(offset: Int): Iterator[Int] = items.drop(offset +
1).iterator
+ override def getCount: Long = items.size.toLong
+ override def asInputStream(): InputStream =
+ new ByteArrayInputStream(items.map(_.toByte).toArray)
+ // Use the OS-portable temp directory rather than a hardcoded
+ // `/tmp/...` path so the spec also runs on Windows.
+ override def asFile(): File =
+ new File(System.getProperty("java.io.tmpdir"), "stub-int")
+ }
+
+ //
---------------------------------------------------------------------------
+ // Accessor surface — return verbatim
+ //
---------------------------------------------------------------------------
+
+ "ReadonlyVirtualDocument.getURI" should "return the impl-supplied URI" in {
+ val d = new StubReadonlyIntDoc(IndexedSeq(1, 2, 3))
+ assert(d.getURI == new URI("file:///stub/int"))
+ }
+
+ "ReadonlyVirtualDocument.getItem" should "delegate to the impl's index
lookup" in {
+ val d = new StubReadonlyIntDoc(IndexedSeq(10, 20, 30))
+ assert(d.getItem(0) == 10)
+ assert(d.getItem(2) == 30)
+ }
+
+ "ReadonlyVirtualDocument.get" should "iterate every item from the impl in
order" in {
+ val d = new StubReadonlyIntDoc(IndexedSeq(7, 8, 9))
+ assert(d.get().toList == List(7, 8, 9))
+ }
+
+ "ReadonlyVirtualDocument.getRange" should
+ "yield items in `[from, until)` (half-open interval — `until` is
exclusive)" in {
+ // Type via the trait so the default-arg contract (`columns: Option[…] =
None`)
+ // is resolved at the call site through the trait's signature, not the
+ // concrete subclass. Scala resolves default parameters from the
+ // STATIC type, so a `StubReadonlyIntDoc`-typed value without its own
+ // default would not get a default at the call site.
+ val d: ReadonlyVirtualDocument[Int] =
+ new StubReadonlyIntDoc(IndexedSeq(0, 1, 2, 3, 4))
+ assert(d.getRange(1, 4).toList == List(1, 2, 3))
+ }
+
+ it should
+ "accept the optional `columns` argument and resolve its default (None)
when omitted at the call site" in {
+ // The `columns` parameter exists so impls can project a subset of
+ // columns — `Some(...)` can LEGITIMATELY change the result on
+ // column-aware impls (e.g. iceberg-backed documents). What this
+ // case pins is the *call-site* contract: the third argument can
+ // be supplied or omitted, and the default is `None`.
+ //
+ // The stub deliberately ignores `columns` so the [from, until)
+ // slice is independent of the column selection — that lets us
+ // assert the call-site shape without taking a position on how
+ // impls should interpret `columns`.
+ val d: ReadonlyVirtualDocument[Int] = new StubReadonlyIntDoc(IndexedSeq(0,
1, 2))
+ // Call-site with all three positional args.
+ assert(d.getRange(0, 2, columns = Some(Seq("c"))).toList == List(0, 1))
+ // Call-site that omits the third arg — resolved via the trait's default.
+ assert(d.getRange(0, 2).toList == List(0, 1))
+ }
+
+ "ReadonlyVirtualDocument.getAfter" should
+ "skip the item at index `offset` and yield every item strictly after it"
in {
+ val d = new StubReadonlyIntDoc(IndexedSeq(0, 1, 2, 3, 4))
+ assert(d.getAfter(1).toList == List(2, 3, 4))
+ }
+
+ "ReadonlyVirtualDocument.getCount" should "return the item count as a Long"
in {
+ val d = new StubReadonlyIntDoc(IndexedSeq(0, 1, 2, 3, 4))
+ val count: Long = d.getCount
+ assert(count == 5L)
+ }
+
+ "ReadonlyVirtualDocument.asInputStream" should
+ "return a non-null InputStream that reads the impl-supplied bytes" in {
+ val d = new StubReadonlyIntDoc(IndexedSeq(7, 8))
+ val stream = d.asInputStream()
+ try {
+ assert(stream.read() == 7)
+ assert(stream.read() == 8)
+ assert(stream.read() == -1) // EOF
+ } finally {
+ stream.close()
+ }
+ }
+
+ "ReadonlyVirtualDocument.asFile" should "return a non-null File from the
impl" in {
+ val d = new StubReadonlyIntDoc(IndexedSeq.empty)
+ val file = d.asFile()
+ assert(file != null)
+ assert(file.getPath != "")
+ }
+
+ //
---------------------------------------------------------------------------
+ // Type parameter — T is preserved across accessors
+ //
---------------------------------------------------------------------------
+
+ "ReadonlyVirtualDocument[Int]" should
+ "preserve the type parameter on every accessor (compile-time enforced)" in
{
+ // Use a two-item fixture so `getAfter(0)` (a documented, non-negative
+ // offset) yields the second item — the trait's docs describe
+ // `offset` as a 0-based index, so the test should stay within that
+ // documented range.
+ val d: ReadonlyVirtualDocument[Int] = new StubReadonlyIntDoc(IndexedSeq(1,
2))
+ val item: Int = d.getItem(0)
+ val iter: Iterator[Int] = d.get()
+ val range: Iterator[Int] = d.getRange(0, 1)
+ val after: Iterator[Int] = d.getAfter(0)
+ assert(item == 1)
+ assert(iter.toList == List(1, 2))
+ assert(range.toList == List(1))
+ assert(after.toList == List(2))
+ }
+
+ //
---------------------------------------------------------------------------
+ // Pattern-matching contract
+ //
---------------------------------------------------------------------------
+
+ "A ReadonlyVirtualDocument value" should "match the trait via type-pattern"
in {
+ val d: AnyRef = new StubReadonlyIntDoc(IndexedSeq.empty)
+ val matched = d match {
+ case _: ReadonlyVirtualDocument[_] => true
+ case _ => false
+ }
+ assert(matched)
+ }
+}
diff --git
a/common/workflow-core/src/test/scala/org/apache/texera/amber/core/storage/model/VirtualCollectionSpec.scala
b/common/workflow-core/src/test/scala/org/apache/texera/amber/core/storage/model/VirtualCollectionSpec.scala
new file mode 100644
index 0000000000..b93cac63f6
--- /dev/null
+++
b/common/workflow-core/src/test/scala/org/apache/texera/amber/core/storage/model/VirtualCollectionSpec.scala
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.texera.amber.core.storage.model
+
+import org.scalatest.flatspec.AnyFlatSpec
+
+import java.net.URI
+import scala.collection.mutable
+
+class VirtualCollectionSpec extends AnyFlatSpec {
+
+ //
---------------------------------------------------------------------------
+ // Test harness — a minimal in-memory concrete impl exercises every
+ // abstract method (getURI / getDocuments / getDocument / remove).
+ //
+ // The contained `VirtualDocument`s are stubbed with the smallest
+ // concrete impl: `clear()` and `getURI` are the only abstract members
+ // not given a default by the base class.
+ //
---------------------------------------------------------------------------
+
+ private class StubDocument(uriValue: URI) extends VirtualDocument[Nothing] {
+ override def getURI: URI = uriValue
+ override def clear(): Unit = ()
+ }
+
+ private class StubCollection(uriValue: URI) extends VirtualCollection {
+ private val children = mutable.LinkedHashMap.empty[String,
VirtualDocument[_]]
+ private var removed = false
+
+ def addChild(name: String, doc: VirtualDocument[_]): Unit = children(name)
= doc
+ def wasRemoved: Boolean = removed
+
+ override def getURI: URI = uriValue
+ override def getDocuments: List[VirtualDocument[_]] =
children.values.toList
+ override def getDocument(name: String): VirtualDocument[_] =
+ children.getOrElse(name, throw new NoSuchElementException(name))
+ override def remove(): Unit = {
+ children.clear()
+ removed = true
+ }
+ }
+
+ private def uri(s: String): URI = new URI(s)
+
+ //
---------------------------------------------------------------------------
+ // Abstract class declares four abstract methods — pinned via a
+ // concrete subclass. (VirtualCollection is an `abstract class`, not a
+ // trait — see `VirtualCollection.scala`.)
+ //
---------------------------------------------------------------------------
+
+ "VirtualCollection (concrete subclass)" should "delegate getURI to the
implementation" in {
+ val c = new StubCollection(uri("file:///tmp/coll"))
+ assert(c.getURI == uri("file:///tmp/coll"))
+ }
+
+ it should "expose getDocuments as a list whose membership matches every
addChild call" in {
+ // The `VirtualCollection` API does NOT document an ordering
+ // guarantee on `getDocuments`, so assert on membership (the set
+ // of URIs) rather than exact sequence — over-constraining future
+ // impls is more brittle than under-constraining.
+ val c = new StubCollection(uri("file:///coll"))
+ assert(c.getDocuments.isEmpty)
+ val docA = new StubDocument(uri("file:///coll/a"))
+ val docB = new StubDocument(uri("file:///coll/b"))
+ c.addChild("a", docA)
+ c.addChild("b", docB)
+ val docs = c.getDocuments
+ assert(docs.size == 2)
+ assert(docs.map(_.getURI).toSet == Set(docA.getURI, docB.getURI))
+ }
+
+ it should "look up a child by name via getDocument" in {
+ val c = new StubCollection(uri("file:///coll"))
+ val doc = new StubDocument(uri("file:///coll/only"))
+ c.addChild("only", doc)
+ // Pin that the same reference is returned (no copy).
+ assert(c.getDocument("only") eq doc)
+ }
+
+ it should "let an impl decide how to signal a missing child (not pinned by
the abstract class)" in {
+ // The abstract class declares `getDocument(name): VirtualDocument[_]`
+ // with no exception specification — impls choose how to signal a
+ // missing child. Avoid pinning a specific exception type here so
+ // the case does not become an implicit contract on every future
+ // impl; just pin that the call does NOT silently return a
+ // (legitimate) document for an unregistered name.
+ val c = new StubCollection(uri("file:///coll"))
+ val outcome = scala.util.Try(c.getDocument("does-not-exist"))
+ assert(
+ outcome.isFailure,
+ s"a missing-child lookup must signal failure (it MUST NOT silently
return a document); got: $outcome"
+ )
+ }
+
+ //
---------------------------------------------------------------------------
+ // remove — irreversible side effect
+ //
---------------------------------------------------------------------------
+
+ "VirtualCollection.remove" should
+ "clear the collection of children (impl-defined side effect)" in {
+ val c = new StubCollection(uri("file:///coll"))
+ c.addChild("d", new StubDocument(uri("file:///coll/d")))
+ assert(c.getDocuments.size == 1)
+ c.remove()
+ assert(c.getDocuments.isEmpty)
+ assert(c.wasRemoved)
+ }
+
+ //
---------------------------------------------------------------------------
+ // Type-pattern matching — `case _: VirtualCollection`
+ //
---------------------------------------------------------------------------
+
+ "A VirtualCollection value" should "match the VirtualCollection type via
type-pattern" in {
+ val c: AnyRef = new StubCollection(uri("file:///coll"))
+ val matched = c match {
+ case _: VirtualCollection => true
+ case _ => false
+ }
+ assert(matched)
+ }
+
+ it should
+ "NOT match an unrelated type via type-pattern (sanity check)" in {
+ // Asymmetry sanity: a String is not a VirtualCollection. Catches a
+ // refactor that widened the abstract class to a structural / type-alias
+ // declaration.
+ val notCol: AnyRef = "hello"
+ val matched = notCol match {
+ case _: VirtualCollection => true
+ case _ => false
+ }
+ assert(!matched)
+ }
+}