Hello!
I’m trying to implement an IO source for Google’s Firebase Auth Accounts
<https://firebase.google.com/docs/reference/admin/python/firebase_admin.auth#firebase_admin.auth.Client.list_users>.
The Firebase API exposes a simple page-based querying interface:
    class ListUsersPage:
        MAX_LIST_USERS_RESULT = 1000
        users: list[ExportedUserRecord]
        next_page_token: str | None
        def has_next_page(self) -> bool: ...
        def get_next_page(self) -> ListUsersPage: ...
        def iterate_all(self) -> Iterator[ExportedUserRecord]: ...
I was thinking about converting this API into a Splittable DoFn, but am facing
difficulties in fulfilling the RestrictionTracker interface because:
- The total number of users is not available
- Splits are "expensive" (need to fetch a page to get a new page token)
- The maximum number of records per page is small (1000)
Given this, is Splittable DoFn even the right tool for implementing this?
Should I just fetch all users synchronously and leverage beam.Create?
Is there another interface/strategy I should be considering altogether?
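
For concreteness, here is a rough sketch of the synchronous option: walk the pages one at a time and yield users as they arrive. FakePage and iter_users are hypothetical names used only for illustration; FakePage stands in for ListUsersPage so the snippet runs without Firebase credentials, and it follows the interface as I wrote it above (I have not double-checked whether has_next_page is a method or a property in the real SDK).

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class FakePage:
    """Hypothetical stand-in for ListUsersPage so the sketch runs offline."""
    users: List[str]
    next_page: Optional["FakePage"] = None

    def has_next_page(self) -> bool:
        return self.next_page is not None

    def get_next_page(self) -> Optional["FakePage"]:
        return self.next_page


def iter_users(first_page) -> Iterator[str]:
    """Yield every user, fetching one page at a time (strictly sequential)."""
    page = first_page
    while page is not None:
        yield from page.users
        # The next page is only reachable through the current one,
        # which is why this walk cannot be split ahead of time.
        page = page.get_next_page() if page.has_next_page() else None


# Three chained pages, mimicking a paginated listing.
pages = FakePage(["a", "b"], FakePage(["c"], FakePage(["d", "e"])))
all_users = list(iter_users(pages))
```

In a pipeline I imagine this would be wired up as beam.Create([None]) followed by a beam.FlatMap that calls iter_users(auth.list_users()), so that the fetch happens on a worker at run time rather than while the pipeline is being constructed. That still reads everything from a single worker, which is the part I am unsure about.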
Any and all help is greatly appreciated,
Thank you!