bkietz commented on PR #13401:
URL: https://github.com/apache/arrow/pull/13401#issuecomment-1209385088
I think this PR provides useful functionality for round trip testing. I also
agree that we want to support custom relations (which will in all likelihood
require a registry). I think trying to address both goals with the same API
will compromise its efficacy for either.
<hr/>
Re extension relations: I think `google.protobuf.Any` is simple enough that
we don't really need to expose it since it is just a buffer and a type_url. I'd
propose the following structure:
```c++
class ExtensionRelationRegistry {
public:
using Factory = std::function<Result<Declaration>(const Buffer& bytes,
std::vector<Declaration> inputs)>;
enum RelationType {
/// substrait.ExtensionTable: the relation is a literal and requires no
processing to emit data
kTable,
/// substrait.ExtensionLeafRel: the relation has no inputs, for example
may stream data from disk
kLeaf,
/// substrait.ExtensionSingleRel: the relation has a single input, for
example may filter out rows
kSingle,
/// substrait.ExtensionMultiRel: the relation has multiple inputs, for
example may be a HighFolutinJoin
kMulti,
};
virtual Status AddFactory(std::string type_url, RelationType, Factory) = 0;
virtual Result<Factory> GetFactory(const std::string& type_url,
RelationType) = 0;
};
```
When adding the above registry, we can extend ToProto with a lambda fallback
like `struct { RelationType type; std::string type_url; std::shared_ptr<Buffer>
bytes; } custom_to_proto(const Declaration&)` which should be sufficient for
round trip testing purposes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]