dabla commented on PR #60704: URL: https://github.com/apache/airflow/pull/60704#issuecomment-3764116453
> > > > This is weird, as I would expect the host url defined in your PowerBI connection to be automatically prepended to the relative url. The PowerBIHook extends the KiotaRequestAdapterHook, thus I would expect to behave in the same way. > > > > Also, why did you fully replace the code in the hook and test, this makes it very difficult to review? > > > > Could you post an example DAG how it's used and connection example? > > > > > > > > > Regarding the URL handling: I had the same expectation initially, however in practice PowerBIHook.run() ends up passing relative paths (e.g. myorg/groups) directly to the async request adapter, which then fails with: > > > `Request URL is missing an 'http://' or 'https://' protocol` > > > This suggests the host configured in the hook is not being prepended for these calls. The issue is reproducible with a standard Power BI connection and occurs consistently when calling get_workspace_list() and other Power BI hook methods. The actual logic change in this PR is intentionally minimal: it only normalizes relative Power BI REST endpoints before delegating to super().run(). Absolute URLs are left untouched. The larger diff is due to formatter and import reordering enforced by CI after rebasing; no additional behavior beyond URL normalization was changed. > > > > > > The relative url is automatically concatenated with the host by the `RequestAdapter's RequestInformation` class. > > Thanks for the clarification - that makes sense. I agree that by design the `RequestInformation` should combine the relative path with the configured host, and that was my initial assumption as well. > > What I’m observing in practice (Airflow 3.x, async execution path) is that `PowerBIHook.run()` forwards relative endpoints like myorg/groups to the async adapter, and the request fails before the URL is fully resolved, with: > > `Request URL is missing an 'http://' or 'https://' protocol` > > This happens consistently when calling methods such as `get_workspace_list()` using a standard Power BI connection. That’s what led me to believe that, at least in this async path, the URL normalization is not being applied as expected. I agree that adding normalization in `PowerBIHook` may not be the right long-term solution if the base `KiotaRequestAdapterHook` is expected to handle this. If you think this should instead be addressed (or investigated) at the KiotaRequestAdapterHook / request adapter level, I’m happy to follow up in that direction. Regarding the diff size: the functional change itself is limited to URL normalization. The larger diff is due to prek/ruff enforcing import order and formatting after rebasing; no additional behavior was intentionally changed. Below is a minimal example of how the hook is used where the issue reproduces. > > ``` > from airflow import DAG > from airflow.providers.microsoft.azure.hooks.powerbi import PowerBIHook > from airflow.operators.python import PythonOperator > from datetime import datetime > > > def list_workspaces(): > hook = PowerBIHook(conn_id="powerbi_default") > return hook.get_workspace_list() > > > with DAG( > dag_id="powerbi_workspace_list_example", > start_date=datetime(2024, 1, 1), > schedule=None, > catchup=False, > ) as dag: > PythonOperator( > task_id="list_powerbi_workspaces", > python_callable=list_workspaces, > ) > ``` Ok now I understand why it isn't working, you'are using [async code in the PythonOperator](https://github.com/apache/airflow/pull/60268), this is not possible yet until Airflow 3.2 is released. @potiuk Here also a nice example where the async PythonOperator will be helpful for people wanting to interact directly with the hooks. I would advise you to use the PowerBIOperator for this, as this will handle the async code for you in deferred mode behind the scenes, or you have to write you method like this: ``` def list_workspaces(): import asyncio hook = PowerBIHook(conn_id="powerbi_default") return asyncio.run(hook.get_workspace_list()) ``` From Airflow 3.2+ you'll be able to do this: ``` @task async def list_workspaces(): hook = PowerBIHook(conn_id="powerbi_default") return await hook.get_workspace_list() ``` This explains why you are encountering issues, this example won't work as the KiotaRequestAdapterHook and PowerBIHook are asynchronous hooks, just like the HttpAsyncHook. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
