gopidesupavan commented on code in PR #62867:
URL: https://github.com/apache/airflow/pull/62867#discussion_r3006244082


##########
providers/common/sql/src/airflow/providers/common/sql/datafusion/object_storage_provider.py:
##########
@@ -70,12 +70,78 @@ def get_scheme(self) -> str:
         return "file://"
 
 
+class GCSObjectStorageProvider(ObjectStorageProvider):
+    """GCS Object Storage Provider using DataFusion's GoogleCloud."""
+
+    @property
+    def get_storage_type(self) -> StorageType:
+        """Return the storage type."""
+        return StorageType.GCS
+
+    def create_object_store(self, path: str, connection_config: 
ConnectionConfig | None = None):
+        """Create a GCS object store using DataFusion's GoogleCloud."""
+        if connection_config is None:
+            raise ValueError(f"connection_config must be provided for 
{self.get_storage_type}")
+
+        try:
+            from airflow.providers.google.common.hooks.base_google import 
GoogleBaseHook
+
+            bucket = self.get_bucket(path)

Review Comment:
   This only supports the `key_path` / `keyfile_dict` / ADC subset of Google 
auth because it relies solely on `provide_gcp_credential_file_as_context()`. It 
drops other valid Airflow Google connection modes like `credential_config_file`.
   
   As written, `gs://` support is narrower than the provider’s existing 
connection contract. Please either map the missing modes through or 
document/reject them explicitly and add regression tests.
   
   
   ```
   credential_config_file = gcp_hook._get_field("credential_config_file", None)
   credentials["application_credentials_path"] = credential_config_file
   ```
   ref: https://datafusion.apache.org/user-guide/cli/datasources.html#gcs



##########
providers/common/sql/src/airflow/providers/common/sql/datafusion/object_storage_provider.py:
##########
@@ -70,12 +70,78 @@ def get_scheme(self) -> str:
         return "file://"
 
 
+class GCSObjectStorageProvider(ObjectStorageProvider):
+    """GCS Object Storage Provider using DataFusion's GoogleCloud."""
+
+    @property
+    def get_storage_type(self) -> StorageType:
+        """Return the storage type."""
+        return StorageType.GCS
+
+    def create_object_store(self, path: str, connection_config: 
ConnectionConfig | None = None):
+        """Create a GCS object store using DataFusion's GoogleCloud."""
+        if connection_config is None:
+            raise ValueError(f"connection_config must be provided for 
{self.get_storage_type}")
+
+        try:
+            from airflow.providers.google.common.hooks.base_google import 
GoogleBaseHook
+
+            bucket = self.get_bucket(path)

Review Comment:
   This only supports the `key_path` / `keyfile_dict` / ADC subset of Google 
auth because it relies solely on `provide_gcp_credential_file_as_context()`. It 
drops other valid Airflow Google connection modes like `credential_config_file`.
   
   `gs://` support is narrower than the provider’s existing connection 
contract. Please either map the missing modes through or document/reject them 
explicitly and add regression tests.
   
   
   ```
   credential_config_file = gcp_hook._get_field("credential_config_file", None)
   credentials["application_credentials_path"] = credential_config_file
   ```
   ref: https://datafusion.apache.org/user-guide/cli/datasources.html#gcs



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to