Yaohua628 commented on PR #38777:
URL: https://github.com/apache/spark/pull/38777#issuecomment-1325752236

   @cloud-fan @dongjoon-hyun @HeartSaVioR Sorry for the back and forth. 
   
   In [the previous PR](https://github.com/apache/spark/pull/38683), we changed 
`_metadata` to be not null. I just realized we should probably also make all 
fields inside `_metadata` (`file_path`, `file_name`, 
`file_modification_time`, `file_size`, `row_index`) not null, for 
consistency.
   
   Please let me know what you think. As @cloud-fan mentioned, it should be 
fine to write not-null data into a nullable column. My only concern is whether 
this change might break the existing stateful streaming schema compatibility 
check.
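   
   To make the concern concrete, here is a rough sketch in plain Python. It is 
not Spark code: `Field`, `can_write`, and `strictly_equal` are hypothetical 
names used only for illustration, and the sketch models nullability alone 
(field types are left out). It contrasts a lenient rule, under which not-null 
data may go into a nullable column, with a strict equality comparison. I am not 
claiming the streaming check behaves like either one.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field; only nullability is modeled."""
    name: str
    nullable: bool


# Field names taken from the list above.
METADATA_FIELD_NAMES = [
    "file_path",
    "file_name",
    "file_modification_time",
    "file_size",
    "row_index",
]

# Before the proposed change: every inner field is nullable.
old_schema = [Field(n, nullable=True) for n in METADATA_FIELD_NAMES]
# After the proposed change: every inner field is not null.
new_schema = [Field(n, nullable=False) for n in METADATA_FIELD_NAMES]


def can_write(source: List[Field], target: List[Field]) -> bool:
    """Lenient rule: names must match in order, and a nullable source
    field may not feed a not-null target field."""
    if len(source) != len(target):
        return False
    for s, t in zip(source, target):
        if s.name != t.name:
            return False
        if s.nullable and not t.nullable:
            return False
    return True


def strictly_equal(a: List[Field], b: List[Field]) -> bool:
    """Strict rule: schemas must match exactly, nullability included."""
    return a == b


print(can_write(new_schema, old_schema))       # not-null data into nullable columns
print(can_write(old_schema, new_schema))       # nullable data into not-null columns
print(strictly_equal(old_schema, new_schema))  # exact comparison across the change
```

   Under the lenient rule only the first direction is accepted, while a strict 
comparison treats the old and new schemas as different. Which of these the 
existing check resembles is the part I would like confirmed.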
   
   Also, cc @ala to confirm that `row_index` will always be not null for 
supported file formats (e.g. Parquet).
   
   Thanks for all your help!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

