Dennis-Mircea commented on PR #28083:
URL: https://github.com/apache/flink/pull/28083#issuecomment-4702300159

   > my AI raised these issues:
   > 
   > #### 1. already fixed
   > #### 2. **Potential Null Safety Issue**
   > **Location:** `DataGenTableSourceFactoryTest.stubReaderContext()`
   > 
   > ```java
   > private static SourceReaderContext stubReaderContext() {
   >     return null;
   > }
   > ```
   > 
   > **Concern:** Returning `null` for `SourceReaderContext` is risky. While 
the comment states generators don't invoke methods on the context, this creates 
a fragile contract:
   > 
   > * Future generator implementations might call context methods
   > * NPE would occur at runtime, not compile time
   > * No compile-time safety
   > 
   > **Recommendation:** Use a proper stub/mock implementation:
   > 
   > ```java
   > private static SourceReaderContext stubReaderContext() {
   >     return new SourceReaderContext() {
   >         @Override public int currentParallelism() { return 1; }
   >         @Override public int getIndexOfSubtask() { return 0; }
   >         // ... implement other methods with sensible defaults
   >     };
   > }
   > ```
   > 
   > #### 3. **Missing Javadoc on Public Test Methods**
   > **Location:** `DataGenTableSourceFactoryTest`
   > 
   > Several new test methods lack Javadoc:
   > 
   > * `testSequenceProducesEachValueExactlyOnce()`
   > * `testEffectiveCountIsCappedByRowCountWhenSmaller()`
   > * `testEffectiveCountIsCappedBySequenceWhenRowCountLarger()`
   > 
   > **Recommendation:** Add brief Javadoc explaining what behavior each test 
verifies.
   > 
   > #### 4. **Serialization Not Explicitly Tested**
   > **Observation:** The old test had explicit serialization testing:
   > 
   > ```java
   > gen = InstantiationUtil.clone(gen);
   > ```
   > 
   > The new implementation removes this. While the `serialVersionUID` fields 
are present, there's no explicit verification that the new generator functions 
are properly serializable.
   > 
   > **Recommendation:** Add a test that clones a generator function and 
verifies it still works:
   > 
   > ```java
   > @Test
   > void testGeneratorFunctionSerialization() throws Exception {
   >     GeneratorFunction<Long, RowData> original = buildRowGenerator(...);
   >     GeneratorFunction<Long, RowData> cloned = 
   >         InstantiationUtil.clone(original);
   >     cloned.open(stubReaderContext());
   >     // Verify cloned generator produces expected output
   > }
   > ```
   > 
   > #### 5. **Magic Number in Test**
   > **Location:** `DataGenTableSourceFactoryTest.runGenerator()`
   > 
   > ```java
   > long limit = count == Long.MAX_VALUE ? 1_000L : count;
   > ```
   > 
   > The `1_000L` is a magic number. Should be a named constant:
   > 
   > ```java
   > private static final long DEFAULT_UNBOUNDED_TEST_LIMIT = 1_000L;
   > ```
   
   Thanks @davidradl, appreciate the careful pass. Addressed below:
   
   1. I don't understand this point. What is already fixed?
   
   2. **(null `SourceReaderContext`)** - I've left this as-is. The stub is 
documented: the generator functions only use `open()` for one-time 
initialization and never call any method on the context, so a `null`-returning 
stub is sufficient and the contract is spelled out in the Javadoc. If a future 
generator starts using the context, the right move would be to switch to the 
existing `TestingReaderContext` from `flink-connector-test-utils` rather than 
hand-roll an anonymous class, but pulling that test dependency into 
`flink-table-api-java-bridge` isn't justified today.
   
   3. **(Javadoc on test methods)** - I'd prefer to skip this one. Flink test 
methods are package-private (not public) and the codebase convention is 
generally no Javadoc on tests. The names 
(`testSequenceProducesEachValueExactlyOnce`, etc.) are intended to be 
self-documenting.
   
   4. **(serialization not explicitly tested)** - good catch, this was the one 
with real value. The generator functions are shipped to the TaskManagers, so 
serializability matters. Added `testRowGeneratorIsSerializable`, which clones 
the freshly built `GeneratorFunction<Long, RowData>` via 
`InstantiationUtil.clone(...)` and verifies the round-tripped copy still 
produces the full sequence. (Note the existing `datagen` ITCases already 
exercise the serialization path end-to-end through a MiniCluster. This just 
adds cheap unit-level coverage)
   
   5. **(magic number)** - done. Extracted `1_000L` into a named constant 
`DEFAULT_UNBOUNDED_TEST_LIMIT` with a comment explaining when the cap applies.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to