morningman commented on PR #61963:
URL: https://github.com/apache/doris/pull/61963#issuecomment-4230251034

   ### High Priority Issues
   
   #### 1. Missing end-to-end regression tests (Blocker)
   
   There are no integration tests under `regression-test/`. The existing FE 
unit tests (`PaimonTransactionTest`, `PaimonWriteBufferSizeTest`) only cover 
serialization and buffer size helpers. We need at minimum:
   - Non-partitioned table INSERT INTO + read-back verification
   - Partitioned table INSERT INTO
   - Fixed-bucket table INSERT INTO
   - Type coverage (STRING, INT, DECIMAL, TIMESTAMP, ARRAY, MAP, STRUCT)
   - Error scenarios (non-existent table, schema mismatch)
   
   Also missing: unit tests for `BindSink.bindPaimonTableSink()`, `PaimonTableSink.bindDataSink()`, `PhysicalPaimonTableSink.getRequirePhysicalProperties()`, and BE-side `VPaimonTableWriter::write()`.
   
   #### 2. `_extract_write_key()` per-row performance bottleneck
   
   `_extract_write_key()` calls `type->to_string(*col, row, options)` for **every row** in the write path. For multi-million-row writes, the repeated string conversion and map lookups introduce non-trivial overhead; by comparison, the Iceberg writer computes partition values in batch at the column level.
   
   **Suggestion**: Compute partition keys at the block level — extract unique 
partition value sets per block, then batch-dispatch rows by group.
   
   #### 3. `VPaimonTableWriter` copies the entire `TDataSink` by value
   
   `_t_sink` is declared as `const TDataSink` (value type). Since `TDataSink` 
contains `serialized_table` (a full Base64-encoded Paimon Table object), this 
copy can be tens of KB to several MB.
   
   ```diff
   // be/src/vec/sink/vpaimon_table_writer.h
   -    const TDataSink _t_sink;
   +    const TDataSink& _t_sink;
   ```
   
   Note: if switching to a reference, the owning `TDataSink` must be guaranteed to outlive the writer. Alternatively, use `std::shared_ptr<const TDataSink>`.
   
   #### 4. `VPaimonTableWriter::close()` — `Defer` timing issue with commit messages
   
   ```cpp
   Defer do_close {[&]() { close_status = _file_store_write->Close(); }};
   // PrepareCommit() runs here, adds messages to _state
   // ... then do_close fires on scope exit
   ```
   
   If `PrepareCommit()` succeeds and commit messages are added to `_state`, but 
the subsequent `Close()` fails (inside `Defer`), the method returns an error 
status even though commit messages have already been submitted. This creates an 
inconsistent state — FE sees commit messages but BE reports failure.
   
   **Suggestion**: Move `Close()` out of `Defer` and call it explicitly after 
`PrepareCommit`, or clear committed messages if `Close()` fails.
   
   ---
   
   ### Medium Priority Issues
   
   #### 5. `deserializeCommitMessagePayload()` log output can be unbounded
   
   When the DPCM header fails to parse, this line dumps the entire payload byte 
array at WARN level:
   
   ```java
   LOG.warn("paimon: payload header mismatch or too short: length={}, payload={}",
           payload == null ? 0 : payload.length,
           java.util.Arrays.toString(payload));
   ```
   
   In production, payloads can be megabytes. The brute-force fallback (trying 
versions 11→3, up to 9 exception catches) also has performance implications.
   
   ```diff
   - LOG.warn("paimon: payload header mismatch or too short: length={}, payload={}",
   -         payload == null ? 0 : payload.length,
   -         java.util.Arrays.toString(payload));
   + LOG.warn("paimon: payload header mismatch or too short: length={}",
   +         payload == null ? 0 : payload.length);
   + if (LOG.isDebugEnabled()) {
   +     LOG.debug("paimon: problematic payload bytes: {}",
   +             java.util.Arrays.toString(payload));
   + }
   ```
   
   The `isDebugEnabled()` guard matters because `Arrays.toString(payload)` is evaluated eagerly at the call site even when DEBUG is disabled; the null ternary can be dropped since `Arrays.toString` already returns `"null"` for a null array.
   
   #### 6. `PaimonDorisMemoryPool::_ensure_thread_context()` — thread_local lifetime concern
   
   `tls_attach` is `static thread_local` and never released until the thread is 
destroyed. In thread-pool reuse scenarios, the previously attached query 
tracker may have expired. Consider checking tracker validity on each invocation 
or using scoped management.
   
   #### 7. INSERT OVERWRITE may be incomplete
   
   `UnboundTableSinkCreator.createUnboundTableSinkMaybeOverwrite()` adds a 
Paimon branch but `UnboundPaimonTableSink` does not propagate 
`staticPartitionKeyValues` or `isAutoDetectPartition`, unlike the Iceberg 
implementation. The INSERT OVERWRITE semantics may be incomplete.
   
   ---
   
   ### Low Priority Issues
   
   #### 8. `PaimonTableCache` constants should be configurable
   
   `maxExternalSchemaCacheNum = 50` and `expireAfterAccess = 100 minutes` are 
hardcoded. For environments managing hundreds of Paimon tables with periodic 
writes, 50 cache entries may be insufficient.
   
   #### 9. Dynamic bucket retry in `PaimonJniWriter.writeBatch()` is fragile
   
   ```java
   if (t instanceof IllegalArgumentException
           && t.getMessage() != null
           && t.getMessage().contains("dynamic bucket mode")) {
       writer.write(reusedRow, 0);  // retry with bucket=0
   }
   ```
   
   Error handling based on exception message string matching is brittle. 
Consider detecting dynamic bucket mode during `open()` and storing it as a 
flag, rather than relying on runtime exception matching.
   
   #### 10. Logging level issues
   
   | File | Issue |
   |---|---|
   | `PaimonTableCache.java:47` | `LOG.warn("load table:{}", key)` — a cache miss is not a warning condition; should be `LOG.info` |
   | `PaimonInsertExecutor.java` | No log statements in `beforeExec()` or `doBeforeCommit()`; consider adding key operation logging |
   
   #### 11. `rawlog_compat.cpp` and `rawlog_compat_paimon.cpp` have duplicated formatting logic
   
   Both files implement nearly identical log formatting. Consider extracting a shared internal function.
   
   The remaining items are medium/low priority and can be addressed in follow-up iterations.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
