[I] [Bug] DataBlobWriter._split_data raise KeyError when a partial write included blob columns [paimon]

via GitHub Wed, 13 May 2026 21:02:16 -0700


SteNicholas opened a new issue, #7849:
URL: https://github.com/apache/paimon/issues/7849


   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/paimon/issues) 
and found nothing similar.
   
   
   ### Paimon version
   
   1.4.1
   
   ### Compute Engine
   
   PyPaimon
   
   ### Minimal reproduce step
   
   DataBlobWriter._split_data always selects the full normal and blob column 
lists from the table schema. When a partial write is configured through 
TableWrite.with_write_type, batches contain only the narrowed columns, so 
pa.RecordBatch.select(...) can reference names that are missing from the batch 
and raise KeyError when the partial write includes blob columns.
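
The failure mode described above can be modeled without PyPaimon or PyArrow. The sketch below is only an illustration: `select_columns` is a hypothetical stand-in for `pa.RecordBatch.select`, the batch is a plain dict, and the column names (`id`, `name`, `image`, `audio`) are invented for the example rather than taken from any real table.

```python
# Hypothetical stand-in for pa.RecordBatch.select: a batch is modeled as a
# dict of column name -> values, and selecting a missing name raises KeyError.
def select_columns(batch, names):
    missing = [n for n in names if n not in batch]
    if missing:
        raise KeyError(f"columns not in batch: {missing}")
    return {n: batch[n] for n in names}


# Illustrative full table schema: two normal columns and two blob columns.
NORMAL_COLUMNS = ["id", "name"]
BLOB_COLUMNS = ["image", "audio"]

# A partial write narrowed to ["id", "image"] yields batches with only those columns.
partial_batch = {"id": [1, 2], "image": [b"\x89PNG", b"\xff\xd8"]}


def split_with_full_schema(batch):
    # Models the reported behavior: always select the full schema lists,
    # regardless of which columns the batch actually carries.
    return select_columns(batch, NORMAL_COLUMNS), select_columns(batch, BLOB_COLUMNS)


try:
    split_with_full_schema(partial_batch)
    failed = False
except KeyError:
    # "name" is not in the narrowed batch, so the first select already fails.
    failed = True
```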
   
   ### What doesn't meet your expectations?
   
   Proposed fix: pass write_cols from FileStoreWrite into DataBlobWriter and 
narrow normal_column_names and blob_file_column_names to that subset, opening 
blob writers only for blob columns in the subset, so that splits select only 
columns present in the batch.
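
The narrowing step proposed above might look roughly like the following. This is a minimal sketch, not the actual PyPaimon change: `narrow_columns` is a hypothetical helper, the column names are illustrative, and treating `write_cols=None` as "full write" is an assumption.

```python
def narrow_columns(schema_columns, write_cols):
    """Keep only the schema columns named in write_cols, in schema order.

    Assumption: write_cols=None means a full (non-partial) write.
    """
    if write_cols is None:
        return list(schema_columns)
    wanted = set(write_cols)
    return [c for c in schema_columns if c in wanted]


# Illustrative schema lists and a partial write of one normal and one blob column.
write_cols = ["image", "id"]
normal_column_names = narrow_columns(["id", "name"], write_cols)
blob_file_column_names = narrow_columns(["image", "audio"], write_cols)

# In this sketch a blob writer would be opened only for names in
# blob_file_column_names, and splits would select only these narrowed lists.
```

Filtering the schema lists, rather than using write_cols directly, keeps the normal/blob classification from the schema and ignores the order in which the caller listed the columns.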
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [x] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
