rtpsw commented on code in PR #13375:
URL: https://github.com/apache/arrow/pull/13375#discussion_r901085767


##########
cpp/src/arrow/engine/substrait/util.h:
##########
@@ -30,17 +31,45 @@ namespace substrait {
 
 /// \brief Retrieve a RecordBatchReader from a Substrait plan.
 ARROW_ENGINE_EXPORT Result<std::shared_ptr<RecordBatchReader>> ExecuteSerializedPlan(
-    const Buffer& substrait_buffer);
+    const Buffer& substrait_buffer, const ExtensionIdRegistry* registry = NULLPTR,
+    compute::FunctionRegistry* func_registry = NULLPTR);
 
 /// \brief Get a Serialized Plan from a Substrait JSON plan.
 /// This is a helper method for Python tests.
 ARROW_ENGINE_EXPORT Result<std::shared_ptr<Buffer>> SerializeJsonPlan(
     const std::string& substrait_json);
 
+/// \brief Deserializes a Substrait Plan message to a list of ExecNode declarations
+/// including a no-op consumer of the sink output
+///
+/// \param[in] buf a buffer containing the protobuf serialization of a Substrait Plan
+/// message
+/// \param[in] registry an extension-id-registry to use, or null for the default one.
+/// \return a vector of ExecNode declarations, one for each toplevel relation in the
+/// Substrait Plan
+ARROW_ENGINE_EXPORT Result<std::vector<compute::Declaration>> DeserializePlans(
+    const Buffer& buf, const ExtensionIdRegistry* registry);
+
 /// \brief Make a nested registry with the default registry as parent.
 /// See arrow::engine::nested_extension_id_registry for details.
 ARROW_ENGINE_EXPORT std::shared_ptr<ExtensionIdRegistry> MakeExtensionIdRegistry();
 
+/// \brief Register a function manually.
+///
+/// Register an arrow function name by an ID, defined by a URI and a name, on a given
+/// extension-id-registry.
+///
+/// \param[in] registry an extension-id-registry to use
+/// \param[in] id_uri a URI of the ID to register by
+/// \param[in] id_name a name of the ID to register by
+/// \param[in] arrow_function_name name of arrow function to register
+ARROW_ENGINE_EXPORT Status RegisterFunction(ExtensionIdRegistry& registry,
+                                            const std::string& id_uri,
+                                            const std::string& id_name,
+                                            const std::string& arrow_function_name);

Review Comment:
   Generally, I added the functions in
[`cpp/src/arrow/engine/substrait/util.h`](https://github.com/apache/arrow/pull/13375/files/cbf9c10c80b03e929b0880814aa03f960ab3517b#diff-2da5e8c8a425f3caf753bd2508931448d2b61ec4e3f3b26670c84ac03bf63e14)
so that they can be used from PyArrow code in an upcoming PR. That PyArrow code
will completely encapsulate the registration of functions embedded within a
Substrait plan. Granted, non-embedded functions will still require manual
registration, which could be improved as you suggested; let's do that in a
later PR.
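
For illustration only, here is a rough, self-contained sketch of the idea behind manual registration on a nested registry: an ID made of a URI and a name is mapped to an Arrow function name, and lookups fall back to the parent. `SketchRegistry`, its methods, the `bool` return value, the fallback and duplicate-rejection behavior, and the `urn:example` URI are all hypothetical stand-ins chosen for this sketch; this is not the `ExtensionIdRegistry` or `RegisterFunction` from the PR.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an extension-id-registry (NOT Arrow's class).
// It maps an ID, given as a (URI, name) pair, to an Arrow function name.
class SketchRegistry {
 public:
  // A null parent models a root registry; a non-null parent models nesting.
  explicit SketchRegistry(std::shared_ptr<const SketchRegistry> parent = nullptr)
      : parent_(std::move(parent)) {}

  // Registers arrow_function_name under (id_uri, id_name).
  // Assumption of this sketch: an ID already visible here or in a parent is
  // rejected, and the function returns false.
  bool RegisterFunction(const std::string& id_uri, const std::string& id_name,
                        const std::string& arrow_function_name) {
    if (Lookup(id_uri, id_name) != nullptr) return false;
    functions_.emplace(std::make_pair(id_uri, id_name), arrow_function_name);
    return true;
  }

  // Returns the registered Arrow function name, consulting the parent when the
  // ID is not found locally; returns nullptr when the ID is unknown.
  const std::string* Lookup(const std::string& id_uri,
                            const std::string& id_name) const {
    auto it = functions_.find(std::make_pair(id_uri, id_name));
    if (it != functions_.end()) return &it->second;
    return parent_ ? parent_->Lookup(id_uri, id_name) : nullptr;
  }

 private:
  std::shared_ptr<const SketchRegistry> parent_;
  std::map<std::pair<std::string, std::string>, std::string> functions_;
};
```

In this sketch, a function registered on the nested registry stays invisible to the parent, which is roughly the isolation that per-plan registration would want; whether the real registry behaves the same way should be checked against the PR itself.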



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
