WinkerDu opened a new pull request, #27235:
URL: https://github.com/apache/doris/pull/27235
## Proposed changes
Issue Number: close #xxx
I want to use Doris Multi-Catalog to accelerate HMS queries. My organization
has a custom distributed file system, and we think wrapping the file system
access differences into the broker (listLocatedFiles, openReader, ...) would
be an elegant approach.
This PR introduces the HMS catalog property `broker.name`. If this property is
set, file split and query scan operations are sent to the broker.
Usage: create an HMS catalog that uses a broker:
```
CREATE CATALOG hive_catalog_broker PROPERTIES (
'type'='hms',
'hive.metastore.uris' = 'thrift://xxx',
'broker.name' = 'hdfs_broker'
);
```
When we query this catalog, file split and query scan requests are sent to
the broker `hdfs_broker`.
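As a minimal sketch of what querying the catalog might look like: the database and table names below (`example_db`, `example_table`) are hypothetical placeholders, not names from this PR. Only the catalog name `hive_catalog_broker` comes from the example above.

```sql
-- Switch to the broker-backed HMS catalog created above.
SWITCH hive_catalog_broker;

-- example_db.example_table is a hypothetical Hive table; with `broker.name`
-- set on the catalog, file listing and scanning for this query are expected
-- to go through the broker `hdfs_broker`.
SELECT * FROM example_db.example_table LIMIT 10;
```

The same table can also be addressed without switching catalogs by using a fully qualified name such as `hive_catalog_broker.example_db.example_table`.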
More details about this PR:
1. Introduce the HMS catalog property `broker.name` to specify the name of the
broker that handles remote path operations. When `broker.name` is set,
`enable.self.splitter` must be `true` to ensure that file splitting is
executed in FE.
2. Add two interfaces to the broker service:
- `TBrokerIsSplittableResponse isSplittable(1: TBrokerIsSplittableRequest request)`,
which invokes the input format's `isSplitable` method.
- `TBrokerListResponse listLocatedFiles(1: TBrokerListPathRequest request)`,
which performs `listFiles` or `listLocatedStatus` on the remote file system.
3. Three parts of the overall process are executed in the broker:
- Checking whether the path is splittable for the specified input format name (`isSplittable`).
- `listLocatedFiles` for table / partition locations.
- `OpenReader` for the specified file splits.
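The constraint in item 1 can be made explicit in the catalog definition. The following is a sketch that assumes `enable.self.splitter` is accepted as a catalog property alongside `broker.name`; whether it already defaults to `true` is not stated in this description, so it is set explicitly here.

```sql
CREATE CATALOG hive_catalog_broker PROPERTIES (
'type'='hms',
'hive.metastore.uris' = 'thrift://xxx',
'broker.name' = 'hdfs_broker',
-- Assumption: set explicitly so that file splitting runs in FE,
-- as item 1 requires when `broker.name` is present.
'enable.self.splitter' = 'true'
);
```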
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]