yunlou11 commented on issue #5724:
URL: https://github.com/apache/paimon/issues/5724#issuecomment-2969525337

   Debugging at "UniversalCompaction.pickForSizeRatio", the "List<LevelSortedRun> runs" contains:
   
   
![Image](https://github.com/user-attachments/assets/b53be605-a534-4a14-8bcc-2792791d567c)
   
   The content of "data-c5ad2524-733a-4405-a54e-78838925f501-2.parquet" is:
   
   | | _KEY_sno | _SEQUENCE_NUMBER | _VALUE_KIND | sno | name | address | email |
   | --- | --- | --- | --- | --- | --- | --- | --- |
   | 0 | 6 | 194 | 0 | 6 | dyl5 | hefei | 1...@qq.com |
   | 1 | 8 | 198 | 0 | 8 | dyl5 | hefei | 1...@qq.com |
   
   **The "List<LevelSortedRun> runs" does not include the Parquet data file that holds the "sno: 8 (-D)" record:**
   
   
![Image](https://github.com/user-attachments/assets/275f1bc5-841f-43b5-9acc-e8b7b2bf91e5)
   
   data-52794550-f681-4fb3-93ea-742b418c9a37-0.parquet content is:
   
   _KEY_sno | _SEQUENCE_NUMBER | _VALUE_KIND | sno | name | address | email
   -- | -- | -- | -- | -- | -- | --
   8 | 199 | 3 | 8 | dyl5 | hefei | 1...@qq.com
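
To make the "_VALUE_KIND" column easier to read, here is a minimal sketch that maps the numeric values in the two tables to the usual row-kind short strings. It assumes the conventional byte mapping (0 = +I, 1 = -U, 2 = +U, 3 = -D), which matches the "-D" label used above for value 3. `ValueKindSketch` and `shortString` are hypothetical names for illustration, not Paimon APIs.

```java
/** Illustrative helper only; a hand-written mirror of the assumed row-kind mapping. */
public class ValueKindSketch {

    /** Translate a numeric _VALUE_KIND into its short string form. */
    public static String shortString(int valueKind) {
        switch (valueKind) {
            case 0: return "+I"; // insert
            case 1: return "-U"; // update before
            case 2: return "+U"; // update after
            case 3: return "-D"; // delete
            default:
                throw new IllegalArgumentException("Unknown _VALUE_KIND: " + valueKind);
        }
    }

    public static void main(String[] args) {
        // Values taken from the two tables above for sno 8.
        System.out.println("sno 8, seq 198: " + shortString(0));
        System.out.println("sno 8, seq 199: " + shortString(3));
    }
}
```

Read this way, sequence number 198 is an insert of "sno: 8" and sequence number 199 is its delete.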
   
   As a result, whether the following data produces a changelog entry for "sno: 8 (+I)" depends on the "outputLevel" of the "CompactUnit unit" parameter.
   - If "outputLevel" is 4, the entry is not emitted, because the "data-c5ad2524-733a-4405-a54e-78838925f501-2.parquet" file indicates that the "sno: 8" record already exists.
   - If "outputLevel" is 5, the content of that file is ignored, so "highLevel" in "LookupChangelogMergeFunctionWrapper.getResult()" is null and "sno: 8 (+I)" is emitted.
   - In practice, however, "outputLevel" can be either 4 or 5:
   
   ```json
   {
       "before": null,
       "after": { "sno": 8, "name": "dyl5", "address": "hefei", "email": "1...@qq.com" },
       "op":"c"
   }
   ```
   So a changelog entry for "sno: 8 (+I)" is emitted in some runs and not in others.
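
The dependence on "outputLevel" can be illustrated with a toy model. This is only a sketch under stated assumptions, not Paimon's implementation: it assumes the older file containing "sno: 8" sits at level 5 (the comment does not state the level), and that the lookup only consults levels strictly above "outputLevel". `ChangelogSketch`, `lookupHighLevel`, and `emitsInsert` are hypothetical names.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Toy model only; none of these names are Paimon APIs. */
public class ChangelogSketch {

    /**
     * Hypothetical lookup: find the lowest level holding an older record for the key,
     * scanning only levels strictly above outputLevel (an assumption of this sketch).
     */
    public static Optional<Integer> lookupHighLevel(
            Map<Integer, Set<Integer>> keysByLevel, int key, int outputLevel) {
        return keysByLevel.entrySet().stream()
                .filter(e -> e.getKey() > outputLevel && e.getValue().contains(key))
                .map(Map.Entry::getKey)
                .min(Integer::compare);
    }

    /** In this model, +I is emitted only when no older record is visible ("highLevel" is null). */
    public static boolean emitsInsert(
            Map<Integer, Set<Integer>> keysByLevel, int key, int outputLevel) {
        return !lookupHighLevel(keysByLevel, key, outputLevel).isPresent();
    }

    public static void main(String[] args) {
        // Assumed layout: the older file with sno 6 and sno 8 sits at level 5.
        Map<Integer, Set<Integer>> keysByLevel = Map.of(5, Set.of(6, 8));

        System.out.println("outputLevel=4 emits +I: " + emitsInsert(keysByLevel, 8, 4));
        System.out.println("outputLevel=5 emits +I: " + emitsInsert(keysByLevel, 8, 5));
    }
}
```

Under these assumptions the same input yields no "+I" for "outputLevel" 4 and a "+I" for "outputLevel" 5, which is the inconsistency described above.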
   
   


