github-actions[bot] commented on code in PR #64935:
URL: https://github.com/apache/doris/pull/64935#discussion_r3489919798
##########
fe/fe-core/src/main/java/org/apache/doris/httpv2/rest/LoadAction.java:
##########
@@ -800,11 +800,17 @@ private Backend selectBackendForGroupCommit(String
clusterName, HttpServletReque
* @return RedirectView configured for stream load forwarding
*/
private RedirectView redirectToStreamLoadForward(HttpServletRequest
request, TNetworkAddress addr,
- String forwardTarget) {
+ String forwardTarget) throws LoadException {
// Replace _stream_load with _stream_load_forward in the path.
String modifiedPath = request.getRequestURI().replace("/_stream_load",
"/_stream_load_forward");
String queryString = request.getQueryString();
- String redirectQuery = "forward_to=" + forwardTarget;
+ String authToken;
+ try {
+ authToken = Env.getCurrentEnv().getTokenManager().acquireToken();
+ } catch (Exception e) {
+ throw new LoadException("Failed to acquire auth token for stream
load forward", e);
+ }
+ String redirectQuery = "forward_to=" + forwardTarget + "&auth_token="
+ authToken;
Review Comment:
This exposes Doris' internal cluster auth token to the stream-load client in
the `Location` header. The token comes from `TokenManager.acquireToken()` and
is not scoped to this redirect; the same value is accepted by FE's `token`
header path (`streamLoad()` -> `executeWithClusterToken()`, which sets
`UserIdentity.ADMIN`) and by BE's auth-token path, and `TokenManager` keeps
tokens valid for days. Any user who triggers this redirect can copy the
`auth_token` from the URL and reuse it as a cluster token for other requests,
so the forwarding credential should not be the reusable internal token. Please
use a per-request/per-target token with a very short lifetime, or another
mechanism that authenticates only this single `_stream_load_forward` hop
without disclosing the cluster-wide token.
##########
regression-test/suites/load_p0/stream_load/test_group_commit_redirect.groovy:
##########
@@ -93,6 +93,7 @@ suite('test_group_commit_redirect', 'docker') {
options.beNum = 3
options.cloudMode = true
options.beConfigs.add('enable_java_support=false')
+ options.beConfigs.add('enable_group_commit_streamload_be_forward=true')
Review Comment:
This turns on the BE forward endpoint for the docker test, but the test
still sends its direct `_stream_load_forward` requests without the FE-generated
`auth_token` (`groupCommitRedicetSteamLoad` builds only `?forward_to=...`).
That means the load coverage goes through `HttpHandlerWithAuth` fallback, which
is the already-known weak path and is a no-op when `enable_all_http_auth` is
false, rather than proving the new token-authenticated forwarding path works
end to end. Please either drive the production FE redirect and follow the
returned `Location` with its token, or add explicit token and negative-token
cases for `_stream_load_forward` so this security fix is actually covered.
##########
be/src/service/http/action/stream_load_forward_handler.cpp:
##########
@@ -31,9 +31,29 @@
namespace doris {
int StreamLoadForwardHandler::on_header(HttpRequest* req) {
- std::ostringstream params_info;
+ // FE passes auth_token via query param in 307 redirect (can't add
headers).
+ // Check it here to skip FE RPC; fallback to HttpHandlerWithAuth for Basic
Auth.
const auto* params = req->params();
+ auto token_it = params->find("auth_token");
+ if (token_it != params->end()) {
+ if (!_exec_env->check_auth_token(token_it->second)) {
+ LOG(WARNING) << "StreamLoadForward failed - invalid auth token"
+ << ", remote: " << req->remote_host() << ", path: "
<< req->raw_path();
+ HttpChannel::send_reply(req, HttpStatus::FORBIDDEN, "Invalid auth
token");
+ return HttpStatus::FORBIDDEN;
Review Comment:
This branch sends the 403 reply but returns `HttpStatus::FORBIDDEN`, which
is a positive value. `EvHttpServer::on_header()` only treats `res < 0` as "the
handler already replied"; otherwise it continues and installs the progressive
chunk callback/free callback. For an invalid token this leaves the request
alive after the 403 with no `StreamLoadForwardContext` initialized, so later
chunk/handle callbacks can try to send a second error instead of cleanly
stopping. Please return `-1` after sending the reply, matching the
`HttpHandler::on_header()` contract and the other authenticated handlers.
##########
be/src/service/http_service.cpp:
##########
@@ -144,9 +144,11 @@ Status HttpService::start() {
streamload_2pc_action);
// register stream load forward handler
- auto* forward_handler = _pool.add(new StreamLoadForwardHandler());
- _ev_http_server->register_handler(HttpMethod::PUT,
"/api/{db}/{table}/_stream_load_forward",
- forward_handler);
+ if (config::enable_group_commit_streamload_be_forward) {
Review Comment:
This BE-side flag is only consulted while `HttpService::start()` registers
handlers. The matching FE flag is still `@ConfField(mutable = true)` and
`LoadAction` checks it on every request, so `ADMIN SET FRONTEND CONFIG
("enable_group_commit_streamload_be_forward"="true")` can make FE start issuing
`_stream_load_forward` redirects while the running BEs were started with the
default false value and never registered that route. That turns group-commit
stream loads into 404/handler-missing failures until every BE is restarted with
the new flag. Please make the two sides have the same activation semantics, for
example by keeping the handler registered and checking a mutable gate in the
handler, or by making the FE setting restart-only/otherwise preventing FE from
enabling redirects until the BE endpoint is known to be active.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]