eugenegujing commented on issue #5242: URL: https://github.com/apache/texera/issues/5242#issuecomment-4686609746
Following up on the discussion in #4240: as @xuang7 summarized, the import direction is tracked here as a separate effort. I've created two sub-issues to cover it:

- #5634: Google Drive import (introduces a small provider abstraction, a backend streaming import endpoint, and a frontend Picker flow)
- #5635: Dropbox import (second provider, validating the extensible connection layer)

Both follow the design decisions @aicam laid out in #4240: no token is ever persisted (one-time OAuth token per import, discarded after the transfer), and the backend streams data directly from the provider into dataset storage (LakeFS/S3), reusing the existing multipart pipeline.

**Scope choice for Google Drive:** the import flow uses the Google Picker with the `drive.file` scope only. It is a non-sensitive scope (no Google restricted-scope verification for any deployment), Google enforces at the permission layer that the app can only access files the user explicitly picked, and the consent screen reduces to a single grant. The alternative, an in-app Drive file browser, would require the restricted `drive.readonly` scope and a weeks-long security review per public deployment, for an effectively identical single-file-import UX. The provider interface keeps a `listFiles` capability so in-app browsing can be added later if wanted (the Dropbox provider can use it freely, as Dropbox has no comparable review process).

cc @Sentiaus: I'll be working on the import direction here. It doesn't depend on the export PRs (the import side obtains a short-lived token at import time and stores nothing), so the two efforts can proceed in parallel.
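
To make the shape of the proposal concrete, here is a rough sketch of the provider abstraction and the streaming import, written in Python purely for illustration. Every name in it (`CloudProvider`, `open_stream`, `MultipartSink`, `import_file`) is hypothetical and not taken from the Texera codebase, and the in-memory provider only stands in for a real Google Drive or Dropbox client. The point it shows is that the token is passed in per call and never stored, and that data moves chunk by chunk from the provider into a multipart-style sink.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, Iterator, List

CHUNK_SIZE = 4  # deliberately tiny so the demo below produces several parts


class CloudProvider(ABC):
    """Hypothetical provider interface; not Texera's actual API."""

    @abstractmethod
    def list_files(self, token: str) -> List[str]:
        """Optional browsing capability (the `listFiles` idea from the comment)."""

    @abstractmethod
    def open_stream(self, token: str, file_id: str, chunk_size: int) -> Iterator[bytes]:
        """Yield the file's bytes in chunks, authorised by a one-time token."""


class InMemoryProvider(CloudProvider):
    """Stand-in for a Drive or Dropbox provider, backed by a dict."""

    def __init__(self, files: Dict[str, bytes], valid_token: str) -> None:
        self._files = files
        self._valid_token = valid_token

    def _check(self, token: str) -> None:
        if token != self._valid_token:
            raise PermissionError("invalid or expired token")

    def list_files(self, token: str) -> List[str]:
        self._check(token)
        return sorted(self._files)

    def open_stream(self, token: str, file_id: str, chunk_size: int) -> Iterator[bytes]:
        self._check(token)  # runs immediately, before any chunk is produced
        data = self._files[file_id]
        return (data[i:i + chunk_size] for i in range(0, len(data), chunk_size))


@dataclass
class MultipartSink:
    """Stand-in for the existing multipart upload pipeline (LakeFS/S3)."""

    parts: List[bytes] = field(default_factory=list)

    def upload_part(self, chunk: bytes) -> None:
        self.parts.append(chunk)

    def complete(self) -> bytes:
        return b"".join(self.parts)


def import_file(provider: CloudProvider, token: str, file_id: str,
                sink: MultipartSink, chunk_size: int = CHUNK_SIZE) -> int:
    """Stream one file into the sink and return the byte count.

    The token exists only as an argument to this call; nothing persists it.
    """
    total = 0
    for chunk in provider.open_stream(token, file_id, chunk_size):
        sink.upload_part(chunk)
        total += len(chunk)
    return total


# Demo with made-up data: one picked file, one one-time token.
provider = InMemoryProvider({"report.csv": b"a,b\n1,2\n"}, valid_token="one-time")
sink = MultipartSink()
imported = import_file(provider, "one-time", "report.csv", sink)
```

Under this sketch, adding Dropbox would mean writing a second `CloudProvider` subclass while `import_file` stays unchanged, which is the extensibility #5635 is meant to validate.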
