Thanks for the pointer. I opened an issue.

-erik

> On Jul 21, 2025, at 10:17, Even Rouault <even.roua...@spatialys.com> wrote:
> 
> Erik,
> 
> I don't think it is really worth sozip'ing a zipped Zarr, given that Zarr is 
> made of many relatively small files, and sozip shines with big compressed 
> files. Generally, even when creating a zipped (sozip or not) Zarr file, you 
> need to make sure that your writing pattern matches chunk boundaries, to 
> avoid chunk files being rewritten several times and making the zip bigger 
> than needed. Please file an issue about the error not being transmitted up to 
> the caller.
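
A minimal sketch of what "matching chunk boundaries" could look like along one dimension, in plain C++ with no GDAL calls. The names `Window`, `chunk_aligned_windows` and `is_chunk_aligned` are hypothetical helpers made up for this illustration and are not part of GDAL. The idea is to split an axis into write windows that each cover whole chunks, so that each chunk file would be written only once:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: one write window along a single dimension.
struct Window {
    std::uint64_t start; // index of the first element written
    std::uint64_t count; // number of elements written
};

// Split [0, extent) into windows that each span `chunks_per_write` whole
// chunks. Only the last window may be shorter, when the extent is not a
// multiple of the chunk size.
std::vector<Window> chunk_aligned_windows(std::uint64_t extent,
                                          std::uint64_t chunk_size,
                                          std::uint64_t chunks_per_write = 1)
{
    assert(chunk_size > 0 && chunks_per_write > 0);
    std::vector<Window> windows;
    const std::uint64_t step = chunk_size * chunks_per_write;
    for (std::uint64_t start = 0; start < extent; start += step) {
        windows.push_back({start, std::min(step, extent - start)});
    }
    return windows;
}

// True when a write of `count` elements starting at `start` touches only
// whole chunks: it begins on a chunk boundary and ends either on a chunk
// boundary or at the end of the axis.
bool is_chunk_aligned(std::uint64_t start, std::uint64_t count,
                      std::uint64_t extent, std::uint64_t chunk_size)
{
    if (chunk_size == 0 || count == 0 || start + count > extent) {
        return false;
    }
    const std::uint64_t end = start + count;
    return start % chunk_size == 0 && (end % chunk_size == 0 || end == extent);
}
```

For an axis of length 10 with chunk size 4, `chunk_aligned_windows(10, 4)` yields the windows (0, 4), (4, 4) and (8, 2). The start and count of each window would then feed the corresponding arguments of a `Write` call like the one quoted below. A write such as start 2, count 4 straddles two chunks, and `is_chunk_aligned` rejects it.
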
> 
> Even
> 
> On 19/07/2025 at 17:44, Erik Schnetter via gdal-dev wrote:
>> I am using GDAL to create a multidimensional zarr file that is sozip 
>> compressed. I see this error when creating the file:
>> 
>> ERROR 1: dish_positions.00000000.zarr/zarr.json already exists in ZIP file
>> ERROR 8: Open file /vsizip/data/fengine_init_pathfinder/cx66_dish_positions.00000000.zarr.zip/dish_positions.00000000.zarr/zarr.json to write failed
>> 
>> Everything is working fine when I do not use sozip compression. I enable 
>> sozip compression by adding a "/vsizip" prefix to the file name. Although 
>> there is an error reported on screen, I do not see an error code reported by 
>> the function creating or closing the multidimensional dataset. The resulting 
>> file ("*.zarr.zip") is created fine and looks almost correct, but all 
>> attributes seem to be missing.
>> 
>> I wonder – is it actually possible to create a zarr file that is sozip 
>> compressed, given that zarr probably writes to each of its files multiple 
>> times? If not, what is the preferred way to create a sozip-compressed zarr 
>> file efficiently?
>> 
>> Some details:
>> 
>> I create the dataset (i.e. the file) via
>> 
>>     const auto driver_manager = GetGDALDriverManager();
>>     const auto driver = driver_manager->GetDriverByName("Zarr");
>>     const auto dataset = std::unique_ptr<GDALDataset>(
>>         driver->CreateMultiDimensional(full_path.c_str(),
>>                                        root_group_options_c.data(),
>>                                        options_c.data()));
>> 
>> where "full_path" is 
>> "/vsizip/data/fengine_init_pathfinder/cx66_dish_positions.00000000.zarr.zip/dish_positions.00000000.zarr".
>> 
>> I then create multiple attributes ("CreateAttribute") and then
>> 
>>     const auto mdarray = group->CreateMDArray(meta->get_name(), dimensions,
>>                                               datatype, array_options_c.data());
>>     const bool success = mdarray->Write(
>>         arrayStart.data(), count.data(), nullptr, bufferStride.data(), datatype,
>>         frame + datatypesize * meta->offset, frame, buffer->frame_size);
>> 
>> and finish with
>> 
>>                 const CPLErr err = dataset->Close();
>>                 assert(!err);
>> 
>> The full code is available at 
>> <https://github.com/kotekan/kotekan/blob/eschnett/updates-2/lib/stages/gdalFileWrite.cpp>.
>> 
>> -erik
>> 
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev@lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
> 
> -- 
> http://www.spatialys.com
> My software is free, but my time generally not.
> 
