[GitHub] [arrow] westonpace commented on a diff in pull request #33623: GH-33212: [C++][Python] Add use_threads to pyarrow.substrait.run_query

GitBox Thu, 12 Jan 2023 09:31:35 -0800


westonpace commented on code in PR #33623:
URL: https://github.com/apache/arrow/pull/33623#discussion_r1068405976



##########
cpp/src/arrow/engine/substrait/util.h:
##########
@@ -38,10 +38,24 @@ namespace engine {
 using PythonTableProvider =
     std::function<Result<std::shared_ptr<Table>>(const std::vector<std::string>&)>;
 
+/// \brief Utility method to run a Substrait plan
+/// \param substrait_buffer The plan to run, must be in binary protobuf format
+/// \param registry A registry of extension functions to make available to the plan
+///                 If null then the default registry will be used.
+/// \param memory_pool The memory pool the plan should use to make allocations.
+/// \param func_registry A registry of functions used for execution expressions.
+///                      `registry` maps from Substrait function IDs to "names"  These
+///                      names will be provided to `func_registry` to get the actual
+///                      kernel.
+/// \param conversion_options Options to control plan deserialization
+/// \param use_threads If True then the CPU thread pool will be used for CPU work.  If
+///                    False then all work will be done on the calling thread.
+/// \return A record batch reader that will read out the results
 ARROW_ENGINE_EXPORT Result<std::shared_ptr<RecordBatchReader>> ExecuteSerializedPlan(
     const Buffer& substrait_buffer, const ExtensionIdRegistry* registry = NULLPTR,
     compute::FunctionRegistry* func_registry = NULLPTR,
-    const ConversionOptions& conversion_options = {});
+    const ConversionOptions& conversion_options = {}, bool use_threads = true,

Review Comment:
   Our defaults generally aim to maximize performance, whatever the cost in CPU
   and RAM, so I think defaulting `use_threads` to true is OK.  Users usually
   want things to run as quickly as possible.
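
   To make the trade-off concrete, here is a minimal Python sketch of a
   `use_threads`-style flag that defaults to the faster, more resource-hungry
   path. It uses only the standard library and is not how Arrow implements
   this; `run_tasks` and `max_workers` are hypothetical names chosen for
   illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_tasks(tasks, use_threads=True, max_workers=4):
    """Run zero-argument callables and report which threads executed them.

    Hypothetical helper: with use_threads=True the work goes to a thread
    pool; with use_threads=False everything runs on the calling thread.
    """
    def traced(task):
        # Pair each result with the id of the thread that produced it.
        return task(), threading.get_ident()

    if use_threads:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # pool.map returns results in input order.
            pairs = list(pool.map(traced, tasks))
    else:
        pairs = [traced(task) for task in tasks]

    results = [result for result, _ in pairs]
    thread_ids = {thread_id for _, thread_id in pairs}
    return results, thread_ids


tasks = [lambda n=n: n * n for n in range(8)]

# Opt out: all work stays on the caller's thread.
serial_results, serial_threads = run_tasks(tasks, use_threads=False)

# Default: work is handed to pool threads.
pooled_results, pooled_threads = run_tasks(tasks)
```

   Both calls return the same results in the same order. Only the threads
   doing the work differ, which is why a caller that cares more about
   resource usage than speed can pass `use_threads=False`.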


