shunping commented on issue #37445:
URL: https://github.com/apache/beam/issues/37445#issuecomment-3937089947

   I want to add some clarification here. Hopefully it gives contributors 
some ideas about where to look.
   
   - First, the GCS path prefix in the original issue description may be 
incorrect. I think we only treat URLs starting with "gs://" as GCS paths.
     
https://github.com/apache/beam/blob/1052216416f4d77325f77e0af80fc0ee98979e14/sdks/python/apache_beam/io/gcp/gcsfilesystem.py#L54-L57
   
     Therefore, the following code runs successfully if the `gcp` extra is 
installed. (This is the proposed behavior.) 
     ```python
     >>> from apache_beam.io import filesystems
     >>> fs=filesystems.FileSystems.get_filesystem("gs://blah")
     >>> print(fs)
     <apache_beam.io.gcp.gcsfilesystem.GCSFileSystem object at 0x1195a3290>
     ```
   
     However, the following code fails even with the `gcp` extra, because 
"gcs://" is never recognized.
     ```python
     >>> from apache_beam.io import filesystems
     >>> fs=filesystems.FileSystems.get_filesystem("gcs://blah")
     ValueError: Unable to get filesystem from specified path, please use the correct path or ensure the required dependency is installed, e.g., pip install apache-beam[gcp]. Path specified: gcs://blah
     ```
   - Second, when the `gcp` extra is absent, the previously successful code 
raises an error:
     ```python
     >>> from apache_beam.io import filesystems
     >>> fs=filesystems.FileSystems.get_filesystem("gs://blah")
     Failed to import GCSFileSystem; loading of this filesystem will be skipped. Error details: cannot import name 'storage' from 'google.cloud' (unknown location)
     ...
     ValueError: Unable to get filesystem from specified path, please use the correct path or ensure the required dependency is installed, e.g., pip install apache-beam[gcp]. Path specified: gs://blah
     ```
   
     This should be the issue a contributor needs to fix. 
     
     The above error is raised at
     
https://github.com/apache/beam/blob/1052216416f4d77325f77e0af80fc0ee98979e14/sdks/python/apache_beam/io/filesystems.py#L64-L70
     We need to figure out why importing GCSFileSystem fails when the `gcp` 
extra is not installed, and go from there.
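
To make the two failure modes above concrete, here is a minimal, self-contained sketch. It does not use Beam's actual classes: `Registry`, `register_optional`, `load_gcs`, and `load_gcs_without_extra` are hypothetical names, and the scheme regex is only an assumption about how a `scheme://` prefix might be parsed. It illustrates how an unregistered scheme ("gcs") and a skipped optional import (a missing `gcp` extra) can both end in the same `ValueError`.

```python
import re

# Assumed pattern for a "scheme://" prefix; illustrative only.
_SCHEME_RE = re.compile(r"(?P<scheme>[a-zA-Z][-a-zA-Z0-9+.]*)://.*")


def get_scheme(path):
    """Return the scheme of a path such as "gs://bucket", or None if absent."""
    match = _SCHEME_RE.match(path.strip())
    return match.group("scheme") if match else None


class Registry:
    """Hypothetical scheme-to-filesystem registry (not a Beam class)."""

    def __init__(self):
        self._by_scheme = {}
        self.skipped = []  # (scheme, error message) for loaders that failed

    def register_optional(self, scheme, loader):
        # An optional backend is skipped, not fatal, when its import fails.
        try:
            self._by_scheme[scheme] = loader()
        except ImportError as exc:
            self.skipped.append((scheme, str(exc)))

    def get_filesystem(self, path):
        scheme = get_scheme(path)
        if scheme not in self._by_scheme:
            raise ValueError(
                "Unable to get filesystem from specified path. "
                f"Path specified: {path}")
        return self._by_scheme[scheme]


def load_gcs():
    # Placeholder for a successful import of the GCS backend.
    return "GCSFileSystem"


def load_gcs_without_extra():
    # Placeholder for the import failing because the extra is missing.
    raise ImportError("cannot import name 'storage' from 'google.cloud'")


# With the extra: "gs" is registered, "gcs" is not.
with_gcp = Registry()
with_gcp.register_optional("gs", load_gcs)

# Without the extra: the loader fails, so "gs" is never registered.
without_gcp = Registry()
without_gcp.register_optional("gs", load_gcs_without_extra)
```

In this sketch, `with_gcp.get_filesystem("gcs://blah")` and `without_gcp.get_filesystem("gs://blah")` both raise `ValueError`, while `without_gcp.skipped` keeps the underlying import error. Surfacing that stored error in the final message is one possible direction, not necessarily what the fix should do.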

