This is an automated email from the ASF dual-hosted git repository. github-merge-queue[bot] pushed a commit to branch gh-readonly-queue/main/pr-5198-400f1cdaff960eedbb696a6501aee3dc84a9c453 in repository https://gitbox.apache.org/repos/asf/texera.git
commit 7b7a5c69b83c88caee17b34b9029c7575bebe143 Author: Matthew B. <[email protected]> AuthorDate: Tue Jun 9 14:13:47 2026 -0700 fix(auth): enforce @RolesAllowed in file-service by registering RolesAllowedDynamicFeature (#5198) ### What changes were proposed in this PR? Registers Jersey's `RolesAllowedDynamicFeature` in `file-service`'s Jersey environment so the `@RolesAllowed` / `@PermitAll` annotations already present on `DatasetResource` are actually enforced. `file-service` registered `JwtAuthFilter` (authentication) and the `AuthValueFactoryProvider.Binder` (so `@Auth user` injects), but never `RolesAllowedDynamicFeature` (authorization). The ~20 `@RolesAllowed(Array("REGULAR", "ADMIN"))` annotations on `DatasetResource` were therefore decorative: any valid JWT reached the resource method regardless of role. This adds the one missing registration. ```scala // Enforce @RolesAllowed annotations on resource methods environment.jersey.register(classOf[RolesAllowedDynamicFeature]) ``` Once enforcement is live, the six public-dataset endpoints (`getPublicPresignedUrl`, `getPublicPresignedUrlWithS3`, `getPublicDatasetVersionList`, `retrievePublicDatasetVersionRootFileNodes`, `getPublicDataset`, `getDatasetCover`) must stay reachable without a JWT for anonymous hub visitors. They are already `@PermitAll` on `main`; the new spec pins that so a future refactor cannot silently lock them out. ### Scope note This PR was originally "complete @RolesAllowed enforcement across microservices." Since then the rest of that work has landed or is in flight on `main`, so this PR has been narrowed to the one remaining gap: - `config-service`, `computing-unit-managing-service`, `workflow-compiling-service`: done in #5049 / #5199. - `JwtAuthFilter` priority + eager-401 + `@PermitAll` opt-out: done in #5404. - `access-control-service` (LiteLLM proxy hardening + feature registration): handled by #5421 (@Yicong-Huang). - **`file-service`: this PR.** With this and #5421, every microservice that uses `@RolesAllowed` enforces it, closing out #4904. ### Any related issues, documentation, or discussions? Closes #5433. Follow-up to #4904; companion to #5421. ### How was this PR tested? Added `DatasetResourcePermissionsSpec`, which verifies the six public-dataset endpoints carry `@PermitAll` so they remain anonymous-accessible after enforcement is enabled. `file-service` and the auth modules compile clean. Existing `DatasetResource` `@RolesAllowed` annotations are unchanged. ### Was this PR authored or co-authored using generative AI tooling? Co-authored with Claude Opus 4.8 in compliance with ASF Generative Tooling Guidance. --------- Co-authored-by: Yicong Huang <[email protected]> --- .../org/apache/texera/service/FileService.scala | 4 + .../resource/DatasetResourcePermissionsSpec.scala | 106 +++++++++++++++++++++ 2 files changed, 110 insertions(+) diff --git a/file-service/src/main/scala/org/apache/texera/service/FileService.scala b/file-service/src/main/scala/org/apache/texera/service/FileService.scala index 64a2f64eba..88f7650378 100644 --- a/file-service/src/main/scala/org/apache/texera/service/FileService.scala +++ b/file-service/src/main/scala/org/apache/texera/service/FileService.scala @@ -45,6 +45,7 @@ import org.apache.texera.service.resource.{ import org.apache.texera.service.util.S3StorageClient import org.apache.texera.service.util.LargeBinaryManager import org.eclipse.jetty.server.session.SessionHandler +import org.glassfish.jersey.server.filter.RolesAllowedDynamicFeature import java.nio.file.Path class FileService extends Application[FileServiceConfiguration] with LazyLogging { @@ -95,6 +96,9 @@ class FileService extends Application[FileServiceConfiguration] with LazyLogging new io.dropwizard.auth.AuthValueFactoryProvider.Binder(classOf[SessionUser]) ) + // Enforce @RolesAllowed annotations on resource methods + environment.jersey.register(classOf[RolesAllowedDynamicFeature]) + environment.jersey.register(classOf[DatasetResource]) environment.jersey.register(classOf[DatasetAccessResource]) diff --git a/file-service/src/test/scala/org/apache/texera/service/resource/DatasetResourcePermissionsSpec.scala b/file-service/src/test/scala/org/apache/texera/service/resource/DatasetResourcePermissionsSpec.scala new file mode 100644 index 0000000000..433f218389 --- /dev/null +++ b/file-service/src/test/scala/org/apache/texera/service/resource/DatasetResourcePermissionsSpec.scala @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.texera.service.resource + +import jakarta.annotation.security.{PermitAll, RolesAllowed} +import jakarta.ws.rs.HttpMethod +import org.scalatest.flatspec.AnyFlatSpec +import org.scalatest.matchers.should.Matchers + +import java.lang.reflect.Method + +// FileService registers RolesAllowedDynamicFeature, so every DatasetResource +// HTTP endpoint is now enforced by its own method-level annotation. A method +// with neither @PermitAll nor @RolesAllowed defaults to OPEN, so the contract +// these tests pin is: every endpoint carries exactly one of the two, the +// non-public ones require REGULAR/ADMIN, and the public hub reads stay +// anonymous-accessible. +class DatasetResourcePermissionsSpec extends AnyFlatSpec with Matchers { + + // A JAX-RS endpoint is any public method carrying an HTTP-method annotation + // (@GET/@POST/@PUT/@DELETE/...), each of which is meta-annotated @HttpMethod. + // Scanning by the meta-annotation avoids hard-coding the verb list. + private val endpointMethods: Seq[Method] = + classOf[DatasetResource].getMethods.toSeq + .filter(_.getAnnotations.exists(_.annotationType.isAnnotationPresent(classOf[HttpMethod]))) + + private def isPermitAll(m: Method): Boolean = m.getAnnotation(classOf[PermitAll]) != null + private def rolesOf(m: Method): Option[RolesAllowed] = + Option(m.getAnnotation(classOf[RolesAllowed])) + + // Public read endpoints for anonymous hub visitors browsing public datasets. + // These MUST stay @PermitAll or the public hub breaks for logged-out users. + private val publicEndpointMethods: Set[String] = Set( + "getPublicPresignedUrl", + "getPublicPresignedUrlWithS3", + "getPublicDatasetVersionList", + "retrievePublicDatasetVersionRootFileNodes", + "getPublicDataset", + "getDatasetCover", + "getDatasetCoverUrl" + ) + + "DatasetResource" should "expose HTTP endpoints (sanity check for the reflection scan)" in { + endpointMethods should not be empty + } + + it should "annotate every HTTP endpoint with exactly one of @PermitAll or @RolesAllowed (none defaults to open)" in { + endpointMethods.foreach { m => + withClue(s"${m.getName}: ") { + (isPermitAll(m), rolesOf(m).isDefined) match { + case (true, true) => fail("carries both @PermitAll and @RolesAllowed") + case (false, false) => + fail("carries neither @PermitAll nor @RolesAllowed (would default to open)") + case _ => succeed + } + } + } + } + + it should "guard every non-public endpoint with @RolesAllowed(REGULAR, ADMIN)" in { + val nonPublic = endpointMethods.filterNot(m => publicEndpointMethods.contains(m.getName)) + nonPublic should not be empty + nonPublic.foreach { m => + withClue(s"${m.getName}: ") { + isPermitAll(m) shouldBe false + val roles = rolesOf(m) + roles should not be empty + roles.get.value() should contain theSameElementsAs Array("REGULAR", "ADMIN") + } + } + } + + it should "keep the public dataset endpoints @PermitAll for unauthenticated visitors" in { + // Every documented public endpoint must still exist (guards against renames + // that would silently drop a name from the non-public check above). + val present = endpointMethods.map(_.getName).toSet.intersect(publicEndpointMethods) + withClue("public endpoints missing from DatasetResource (renamed?): ") { + present shouldBe publicEndpointMethods + } + endpointMethods + .filter(m => publicEndpointMethods.contains(m.getName)) + .foreach { m => + withClue(s"${m.getName} must be @PermitAll: ") { + isPermitAll(m) shouldBe true + rolesOf(m) shouldBe empty + } + } + } +}
