Pear0 commented on code in PR #47860:
URL: https://github.com/apache/arrow/pull/47860#discussion_r2446780136
##########
python/pyarrow/src/arrow/python/arrow_to_pandas.h:
##########
@@ -112,11 +128,11 @@ struct PandasOptions {
bool decode_dictionaries = false;
// Columns that should be casted to categorical
- std::unordered_set<std::string> categorical_columns;
+ std::shared_ptr<const std::unordered_set<std::string>> categorical_columns;
Review Comment:
I originally wanted to put the `PandasOptions` in a `shared_ptr`, but
several code paths take a `PandasOptions` and modify their own copy of it
(for example `MakeInnerOptions()` and `ConvertChunkedArrayToPandas()`). So it
seems like you would still end up copying the sets fairly often in some
circumstances (e.g. I think once for each struct column?) unless you add
indirection for the sets themselves.
If you'd still prefer not having the set indirection, I can take a stab at
implementing it.
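To illustrate why the per-set indirection keeps those copies cheap, here is a
minimal sketch. `OptionsSketch`, `MakeOptionsSketch()`, `MakeInnerSketch()` and
`DemoSharing()` are hypothetical stand-ins written for this comment, not the
actual `PandasOptions` code; only the `categorical_columns` member type is
taken from the diff above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for PandasOptions with only the fields needed here.
struct OptionsSketch {
  bool decode_dictionaries = false;
  // Shared, immutable set: copying the struct copies the pointer, not the set.
  std::shared_ptr<const std::unordered_set<std::string>> categorical_columns;
};

// Hypothetical factory: moves the column names into a shared, const set.
OptionsSketch MakeOptionsSketch(std::unordered_set<std::string> columns) {
  OptionsSketch options;
  options.categorical_columns =
      std::make_shared<const std::unordered_set<std::string>>(
          std::move(columns));
  return options;
}

// Hypothetical helper in the spirit of MakeInnerOptions(): takes the options
// by value, changes one field, and returns the modified copy.
OptionsSketch MakeInnerSketch(OptionsSketch options) {
  options.decode_dictionaries = true;
  return options;
}

// Checks that the modified copy still points at the same set as the original.
void DemoSharing() {
  OptionsSketch outer = MakeOptionsSketch({"a", "b"});
  OptionsSketch inner = MakeInnerSketch(outer);
  assert(inner.decode_dictionaries && !outer.decode_dictionaries);
  assert(inner.categorical_columns.get() == outer.categorical_columns.get());
}
```

With a plain `std::unordered_set<std::string>` member, each such copy would
instead duplicate every string in the set.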