MurrayData commented on issue #47314: URL: https://github.com/apache/arrow/issues/47314#issuecomment-3287959166
> If I understand correctly you are suggesting empty columns that would be created as Null Type to be stored as empty strings?

Not necessarily; I'm suggesting the type be user-defined. We regularly receive government statistical datasets with variable schemas (which we can handle) where fields are inconsistently populated, meaning a field wasn't captured on some occasions. I'd like to be able to map a null data type to int in that case, because the fields are counts, or to string in another case, so that the schema is consistent. We have a workaround where we read the data as a dataset, get the schema, and then fix the null-typed fields ourselves, but an option that avoids this step would be useful.