Pear0 commented on code in PR #47860:
URL: https://github.com/apache/arrow/pull/47860#discussion_r2446780136


##########
python/pyarrow/src/arrow/python/arrow_to_pandas.h:
##########
@@ -112,11 +128,11 @@ struct PandasOptions {
   bool decode_dictionaries = false;
 
   // Columns that should be casted to categorical
-  std::unordered_set<std::string> categorical_columns;
+  std::shared_ptr<const std::unordered_set<std::string>> categorical_columns;

Review Comment:
   I originally wanted to put the `PandasOptions` in a `shared_ptr`, but several 
code paths take a `PandasOptions` and evolve it in some way (for example 
`MakeInnerOptions()` and `ConvertChunkedArrayToPandas()`). So it seems you would 
still end up copying the sets fairly often in some circumstances (e.g. I think 
once for each struct column?) unless you add some indirection for the sets.
   
   If you'd still prefer not to have the set indirection, I can take a stab at 
implementing it.
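   
   To illustrate the trade-off, here is a minimal standalone sketch, not the 
actual Arrow code. `SketchOptions`, `MakeColumnSet()`, `EvolveOptions()` and 
`IsCategorical()` are hypothetical names used only for this example, and 
treating a null pointer as "no categorical columns" is an assumption. It shows 
that when the set sits behind a `shared_ptr<const ...>`, copying and evolving 
the options struct copies only the pointer, not the set contents.

```cpp
#include <initializer_list>
#include <memory>
#include <string>
#include <unordered_set>

using StringSet = std::unordered_set<std::string>;

// Hypothetical stand-in for PandasOptions; only the fields shown in the diff
// hunk are modeled here.
struct SketchOptions {
  bool decode_dictionaries = false;
  // Shared, immutable set: copying SketchOptions copies the pointer only.
  std::shared_ptr<const StringSet> categorical_columns;
};

// Hypothetical helper: build the set once, up front.
std::shared_ptr<const StringSet> MakeColumnSet(
    std::initializer_list<std::string> names) {
  return std::make_shared<StringSet>(names);
}

// Hypothetical helper that takes options by value and tweaks them, loosely
// analogous to a code path that "evolves" the options.
SketchOptions EvolveOptions(SketchOptions options) {
  options.decode_dictionaries = true;
  return options;
}

// Assumption for this sketch: a null pointer means "no categorical columns".
bool IsCategorical(const SketchOptions& options, const std::string& name) {
  return options.categorical_columns != nullptr &&
         options.categorical_columns->count(name) > 0;
}
```

   With this layout, `EvolveOptions(outer)` returns a struct whose 
`categorical_columns` points at the same set object as `outer`, so the cost 
per copy is a reference-count update rather than a rehash of every column name.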



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
