Copilot commented on code in PR #689:
URL: https://github.com/apache/sedona-db/pull/689#discussion_r2888760315
##########
python/sedonadb/python/sedonadb/__init__.py:
##########
@@ -15,16 +15,17 @@
# specific language governing permissions and limitations
# under the License.
from sedonadb import _lib
-from sedonadb.context import connect, configure_proj
+from sedonadb.context import connect, configure_proj, configure_gdal
__version__ = _lib.sedona_python_version()
__features__ = _lib.sedona_python_features()
__all__ = ["connect", "options"]
-# Attempt to configure PROJ on import. This will warn if PROJ
-# can't be configured but should never error. The auto-configured
+# Attempt to configure PROJ and GDAL on import. This will warn if PROJ
+# or GDAL can't be configured but should never error. The auto-configured
# value can be overridden as long as the call to configure_proj()
-# occurs before actually creating a transform.
+# and configure_gdal() occurs before actually creating a transform.
Review Comment:
The module-level comment says `configure_gdal()` must occur before "actually
creating a transform", but GDAL configuration isn’t related to creating PROJ
transforms. This looks like a copy/paste from the PROJ note and could mislead
users about when GDAL configuration needs to happen.
Recommend updating the wording to reference GDAL-backed operations (e.g.,
raster/vector IO) rather than transforms.
##########
python/sedonadb/python/sedonadb/context.py:
##########
@@ -604,3 +604,184 @@ def _configure_proj_prefix(prefix: str):
database_path=Path(prefix) / "share" / "proj" / "proj.db",
search_path=Path(prefix) / "share" / "proj",
)
+
+
+def configure_gdal(
+ preset: Optional[
+ Literal["auto", "pyogrio", "rasterio", "conda", "homebrew", "system"]
+ ] = None,
+ *,
+ shared_library: Optional[Union[str, Path]] = None,
+ verbose: bool = False,
+) -> None:
+ """Configure GDAL source
+
+ SedonaDB loads GDAL dynamically at runtime. This is normally configured
+ on package load but may need additional configuration (particularly if the
+ automatic configuration fails).
+
+ This function may be called at any time; however, once a GDAL-backed
+ operation has been performed, subsequent configuration has no effect.
+
+ Args:
+ preset: One of:
+ - None: Use a custom `shared_library` path.
+ - auto: Try all presets in the order pyogrio, rasterio, conda,
+ homebrew, system and warn if none succeeded.
+ - pyogrio: Attempt to use the GDAL shared library bundled with
+ pyogrio. This aligns the GDAL version with the one used by
+ `read_pyogrio()` / `geopandas.read_file()`.
+ - rasterio: Attempt to use the GDAL shared library bundled with
+ rasterio.
+ - conda: Attempt to load libgdal installed via
+ `conda install libgdal`.
+ - homebrew: Attempt to load libgdal installed via
+ `brew install gdal`.
+ - system: Attempt to load libgdal from a directory already on
+ LD_LIBRARY_PATH (Linux), DYLD_LIBRARY_PATH (macOS), or PATH
+ (Windows).
+
+ shared_library: Path to a GDAL shared library.
+ verbose: If True, print information about the configuration process.
+
+ Examples:
+
+ >>> sedona.db.configure_gdal("auto")
+ """
+ if preset is not None:
+ if preset == "pyogrio":
+ _configure_gdal_pyogrio()
+ return
+ elif preset == "rasterio":
+ _configure_gdal_rasterio()
+ return
+ elif preset == "conda":
+ _configure_gdal_conda()
+ return
+ elif preset == "homebrew":
+ prefix = os.environ.get("HOMEBREW_PREFIX", "/opt/homebrew")
+ shared_library = Path(prefix) / "lib" / _gdal_lib_name()
+ elif preset == "system":
+ shared_library = _gdal_lib_name()
+ elif preset == "auto":
+ tried = ["pyogrio", "rasterio", "conda", "homebrew", "system"]
Review Comment:
In the Windows "system" preset, `shared_library` is set to the bare filename
`gdal.dll`, and `ctypes.CDLL` is then called on it. On Windows this can load
from the current working directory depending on DLL search order, which is a
common DLL hijacking vector—especially since `__init__.py` calls
`configure_gdal(preset="auto")` on import.
Consider either (a) removing/skip the `system` preset on win32 from the
`auto` preset, (b) requiring an absolute path for user-provided/shared-library
presets on win32, or (c) using the safer Windows loading APIs/flags (e.g.,
`ctypes.CDLL(..., winmode=...)` on Python 3.8+ and/or `os.add_dll_directory`)
to avoid searching the CWD.
```suggestion
if sys.platform == "win32":
# On Windows, avoid trying the "system" preset from "auto" to
# prevent loading a bare DLL name like "gdal.dll" via the
# default DLL search path (which may include the CWD).
tried = ["pyogrio", "rasterio", "conda", "homebrew"]
else:
tried = ["pyogrio", "rasterio", "conda", "homebrew",
"system"]
```
##########
python/sedonadb/python/sedonadb/context.py:
##########
@@ -604,3 +604,184 @@ def _configure_proj_prefix(prefix: str):
database_path=Path(prefix) / "share" / "proj" / "proj.db",
search_path=Path(prefix) / "share" / "proj",
)
+
+
+def configure_gdal(
+ preset: Optional[
+ Literal["auto", "pyogrio", "rasterio", "conda", "homebrew", "system"]
+ ] = None,
+ *,
+ shared_library: Optional[Union[str, Path]] = None,
+ verbose: bool = False,
+) -> None:
Review Comment:
`configure_gdal()` is a new public API and is now invoked during package
import (`preset="auto"`). It would be good to add Python tests that validate
its behavior without requiring a real GDAL install (e.g., monkeypatch
`importlib.import_module`/`ctypes.CDLL` to simulate success/failure and assert
it warns rather than raising for the `auto` preset, and that invalid
`shared_library` values raise a clear `ValueError`).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]