Re: [Geoserver-users] Deleting database table through REST API

Gabriel Roldan Fri, 16 Feb 2024 06:27:29 -0800

Hi Andrea,

Thanks for this valuable breakdown.


Everything you mention makes sense, and I think we're getting to the root
of our disagreement.
Believe me, I absolutely understand the convenience of just adding a
parameter to the existing delete featuretype api. Heck, we would have saved
all the time invested in this discussion and my customers would get what
they want: very low-hanging fruit.

Yet I'm getting involved, just like you, thinking of the bigger picture and
the general good for geoserver.

So let me try to distill my reasoning about this.

The create featuretype endpoint and, for instance, all (most?) of the
CatalogInfo related endpoints in the core api are RESTful-ish, in the sense
that they deal with REpresentational State Transfer of Catalog resources. As
such, they are intended to be atomic and single purpose.

For the specific case of the create featuretype api, creating the schema on
the target store is a desired byproduct, as it's currently designed, and
there's no argument against its convenience.

One can say the operation's single purpose is to create a FeatureTypeInfo,
and as a convenience it'll create a LayerInfo and the underlying native
schema.
The main point being it's atomic: when it returns, the featuretype is
either created or it isn't. Worst case scenario, something failed and the
system left a dangling schema in the backing store, but no harm's done to
the system.

Once create featuretype has succeeded, if it happened to create a native
store featuretype, that's now a separate entity in its own right, in a
different domain than the Catalog objects. So much so that it can be used by
several other published featuretypes, even SQL-view published types.

Now, when it comes to purging the (or a, for instance) native featuretype,
we're no longer talking about the core api's delete featuretype semantics.
Those deal with removing a FeatureTypeInfo from the Catalog. As a
by-product it'd delete the associated LayerInfo. Again, it should be atomic.

What we're talking about now, instead, is a use case specific API. Why:
- because it has to deal with (destructive) operations at more than one
domain level (the catalog object and the native type)
- because it has to account for other published FeatureTypes that might be
depending on the same native FeatureType. In this case a decision has to be
made on whether to also delete those catalog objects or fail the operation.
- because it can't be atomic. At the end of the operation, both the
published and native FTs shall be removed, or none. The Catalog doesn't
handle transactions, and not all DataStores do either. Much less distributed
ones. Hence, if an error occurs at any stage, you end up with an
inconsistent state: the database table was removed but you couldn't remove
the published type, or the other way around. In the former case, you can't
recover the data. In the latter, the state is intermediate, and the
operation didn't achieve its goal.
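
To make the last bullet concrete, here's a minimal Python sketch of the two
failure orders. It's only an illustration: FakeCatalog, FakeDataStore and
purge are made-up stand-ins, not GeoServer classes or endpoints, and the
failures are injected by hand.

```python
class FakeCatalog:
    """Hypothetical stand-in for the Catalog: holds published type names."""

    def __init__(self, published, fail_on_remove=False):
        self.published = set(published)
        self.fail_on_remove = fail_on_remove

    def remove(self, name):
        if self.fail_on_remove:
            raise RuntimeError("catalog removal failed")
        self.published.discard(name)


class FakeDataStore:
    """Hypothetical stand-in for the backing store: holds native table names."""

    def __init__(self, tables, fail_on_drop=False):
        self.tables = set(tables)
        self.fail_on_drop = fail_on_drop

    def drop(self, name):
        if self.fail_on_drop:
            raise RuntimeError("drop table failed")
        self.tables.discard(name)


def purge(catalog, store, name, table_first):
    """Run both destructive steps in the given order; there is no rollback.

    Returns (still_published, table_still_exists).
    """
    steps = [lambda: store.drop(name), lambda: catalog.remove(name)]
    if not table_first:
        steps.reverse()
    try:
        for step in steps:
            step()
    except RuntimeError:
        pass  # the earlier step already happened and cannot be undone
    return (name in catalog.published, name in store.tables)


# Table dropped first, then the catalog removal fails:
# the data is gone but the published type is still there.
print(purge(FakeCatalog({"roads"}, fail_on_remove=True),
            FakeDataStore({"roads"}), "roads", table_first=True))

# Catalog object removed first, then the drop fails:
# intermediate state, the table survives and the goal wasn't achieved.
print(purge(FakeCatalog({"roads"}),
            FakeDataStore({"roads"}, fail_on_drop=True), "roads", table_first=False))
```

Whichever order is picked, one of the two mixed states remains reachable,
which is why I don't see this as an atomic operation.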

So that's IMO the distinction. Not about parameters or path variables,
which are essentially the same thing technically speaking.
I've nothing against operation parameters as long as they're input values
to the operation. What I'm against is flags that change its semantics.

It is very true that, as it stands today, the api parameters support a wide
range of use cases. But trying to be so flexible that a
single endpoint supports all possible use cases, to me, is a mistake.

From the above reasoning my conclusion is the ability to delete a published
feature type, alongside the underlying data and schema, and possibly any
other published feature type depending on the native schema, is a use case
and not a REST operation, and hence deserves its own use-case specific api
endpoint, with its own request and response bodies. That or, you oughta
deal with the use case client-side, and call separate endpoints for the
different domains.
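
As a rough sketch of what I mean by a use case with its own request and
response (plain Python, in-memory only; PurgeRequest, PurgeResponse,
purge_feature_type and the delete_dependents field are hypothetical names,
not an existing or proposed GeoServer API):

```python
from dataclasses import dataclass, field


@dataclass
class PurgeRequest:
    published_name: str
    # Input value: what to do with other published types on the same native type.
    delete_dependents: bool = False


@dataclass
class PurgeResponse:
    removed_published: list = field(default_factory=list)
    dropped_native: bool = False
    error: str = ""


def purge_feature_type(published, native_tables, request):
    """published: dict of published name -> native name (hypothetical model).
    native_tables: set of native type names in the backing store.
    """
    native = published.get(request.published_name)
    if native is None:
        return PurgeResponse(error="unknown published type")
    # Other published types that depend on the same native type.
    others = sorted(p for p, n in published.items()
                    if n == native and p != request.published_name)
    if others and not request.delete_dependents:
        # Fail before touching anything.
        return PurgeResponse(error="native type still used by: " + ", ".join(others))
    removed = [request.published_name] + others
    for name in removed:
        del published[name]
    native_tables.discard(native)
    return PurgeResponse(removed_published=removed, dropped_native=True)


published = {"roads": "roads_tbl", "roads_view": "roads_tbl", "rivers": "rivers_tbl"}
tables = {"roads_tbl", "rivers_tbl"}

# Refused: "roads_view" depends on the same native type.
print(purge_feature_type(published, tables, PurgeRequest("roads")))
# Accepted: the dependents are removed along with the native type.
print(purge_feature_type(published, tables, PurgeRequest("roads", delete_dependents=True)))
```

The response reports what was actually removed, so the caller doesn't have
to infer it from a status code on the plain delete featuretype call.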

Finally, in your current-api-breakdown reply, you mention the only existing
related endpoint is to delete also the data for a coverage:

> DELETE
> /workspaces/{workspaceName}/coveragestores/{storeName}?deleteType=none/all/metadata.
> With "metadata" the configuration files are all removed (e.g., the mosaic
> configuration files), with "all" the actual raster data is removed as well
Well, this is the perfect example of what I'm proposing. Note how it's an
operation on the CoverageStore and not on the Coverage.
I concede that's probably related to the fact that CoverageStores support a
single Coverage?

Point is, this is definitely a use case driven api end point, not the REST
endpoint for CoverageInfo. And what I'm proposing is just like that.

I think that exposes my point of view in a more civil way, and yes, we'll
probably have to go through the GSIP process.

Best regards,
*camptocamp*
INNOVATIVE SOLUTIONS
BY OPEN SOURCE EXPERTS

*Gabriel Roldán*
Geospatial Developer



On Fri, Feb 16, 2024 at 5:26 AM Andrea Aime <
andrea.a...@geosolutionsgroup.com> wrote:

> Ok,
> let's start over with more context. In particular, let's start with the
> existing API and how it works in terms of "native" resources, and what
> "dangerous" functionality is already there.
>
> *List of native resources*
> How does one get a list of resources (feature types, coverages, wms and
> wmts layers), all GET requests here:
>
>    - /workspaces/{workspaceName}/datastores/{storeName}/featuretypes?list=all/available
>    (either all of them, or the ones non configured)
>    - /workspaces/{workspace}/coveragestores/{store}/coverages?list=all
>    (returns both configured and non configured, no option implies only
>    configured)
>    - /workspaces/{workspace}/wmsstores/{wmsstore}/wmslayers/{wmslayer} ->
>    no ability to list available layers here!
>    - /workspaces/{workspace}/wmtsstores/{wmtsstore}/layers?list=available/configured
>
> When the parameters have the same name, they do the same thing, but the
> availability of them is spotty. The only type of native resource that
> cannot be enumerated is the WMS layer one.
>
> *Exploring a non configured native resource*
>
> This does not seem to be available for any type of native resource. This
> would be useful, although not part of Cecile's original request.
>
> *Creation of new native resources*
> How does one create a new native resource that was not there:
>
>    - POST /workspaces/{workspaceName}/datastores/{storeName}/{method}.{format}?target=<targetStore>.
>    This uploads a file (e.g. shapefile) and dumps it on the server. If the
>    targetStore is set, a createSchema will be run and the data dumped in the
>    target store. If the target schema is already present and "overwrite=true"
>    is part of the request, all existing data is deleted and replaced with the
>    new data.
>    - POST
>    /workspaces/{workspaceName}/datastores/{storeName}/featuretypes, posting a
>    definition that is not yet there. This creates an empty table/file.
>    - POST /workspaces/{workspaceName}/coveragestores/{storeName}/{method}.{format}
>    will upload some raster data or mosaic definition. If the mosaic is
>    configured to work with database, it will automatically create the
>    necessary index tables.
>    - POST /workspaces/{workspaceName}/coveragestores/{storeName}/{method}.{format}.
>    This allows one to configure a new entry in a mosaic, potentially with data
>    upload. While it does not create a new layer per se, it makes new data
>    available and can write data on the server side.
>
> For WMS/WMTS layers there is no way to create a layer in the remote
> server. Interestingly enough, the PostGISDataStoreFactory has a
> CREATE_DB_IF_MISSING parameter that will run a "CREATE DATABASE" if the
> target database is not there. Database creation has been available since
> 2014.
>
> *Removal of native resources*
> How does one clean a native resource that's no longer needed:
>
>    - For feature types, we already know this is not available. The same
>    goes for vector stores. The story here is asymmetric, one can create via
>    FeatureType/DataStore controllers, but not remove.
>    - For WMS/WMTS layers, we can't control the remote server.
>    - DELETE /workspaces/{workspaceName}/coveragestores/{storeName}?deleteType=none/all/metadata.
>    With "metadata" the configuration files are all removed (e.g., the mosaic
>    configuration files), with "all" the actual raster data is removed as well
>    - DELETE is also available at the single mosaic file entry, same
>    approach: one can either remove the registration in the mosaic, or both
>    the registration and the actual data.
>
> *Summary*
> The existing REST API abilities are scattered, but generally allow
> creation of databases, new tables, and new data.
> Removal is fully developed only on raster data (typical use case of having
> to maintain a time window of data online, in environments where standing up
> a separate process to perform data management is seen as overhead).
> I hope this clarifies why I see it as more natural to make the API
> symmetric in the feature type case, rather than creating a new variant that
> is for feature types only.
> Or do you want to make it consistent and adopt the same for coverages as
> well? What about the existing functionality, should it be dropped? We don't
> have a history of deprecating REST APIs but I'm open to discussion.
>
> *Dangers of using the wrong API*
> Here are some considerations on "having a new API is less dangerous than a
> parameter".
> I'm trying to put myself in the place of someone writing a REST script.
> If one needs to just remove feature types, they will use:
>
>
> DELETE /workspaces/{workspaceName}/datastores/{storeName}/featuretypes/{typeName}
>
> If they never need to drop tables, there is no chance that they'll add a
> "purgeData=true" by mistake (pick whatever parameter name you want, as long
> as it's clearly separate from "cascade" and clearly indicates data
> destruction).
> If they instead are playing with the idea, and are still undecided on what
> to do, they will likely have something like this, with the parameter
> commented out but ready to be put back in.
> I'm simulating with color a bit of syntax highlighting, and keeping
> cascade=true in all calls because we most likely want to remove the
> associated LayerInfo as well.
>
>
> DELETE /workspaces/{workspaceName}/datastores/{storeName}/featuretypes/{typeName}?cascade=true
> #?dropData=true
>
> If instead we have a separate API, they will likely have this:
>
>
> DELETE /workspaces/{workspaceName}/datastores/{storeName}/featuretypes/{typeName}?cascade=true
> # DELETE /workspaces/{workspaceName}/datastores/{storeName}/nativeFeaturetypes/{typeName}?cascade=true
>
> I have a hard time figuring out how one is more dangerous than the other.
> Going by personal experience, I would be a bit more prone to make an error
> with the second case, because the difference between one and the other lies
> in the middle of the path rather than being separate at the end... but it's
> probably just me, I tend to focus on structure and glance over the text,
> rather than actually reading it word by word.
>
> *Small data management side note*
> Dealing with data import/management/removal is inherently dangerous
> no matter what tool is used.
> The documentation for the tools allowing it should be clear, and
> indications on proper security setup are also beneficial.
>
> *Conclusion*
> Please, let's try to go towards a direction of more API consistency,
> rather than less of it.
> So I'm ok with a "native" API, but at least let's have it consistently
> available across the board, rather than having a "feature type exception".
> Let's also have a plan on how to handle the inevitable functional
> redundancy (deprecation? removal?)
>
> Andrea
>
>
> On Thu, Feb 15, 2024 at 8:19 PM Gabriel Roldan <
> gabriel.rol...@camptocamp.com> wrote:
>
>> Two wrongs don't make a right, IMHO.
>> I'd rather break convention than introduce such a dangerous parameter to
>> an existing API endpoint and change its semantics
>>
>> I'm not talking about a rewrite of the REST API, but a new verb to the
>> existing API. Don't see how that'd make it harder for the people that are
>> used to it, when it doesn't even exist. Used to the REST API having all
>> sorts of parameters, I take it. So how much harder is it when I ask "how do
>> I delete a table in my database?"
>> - oh, just need to add this param to DELETE
>> ../stores/mystore/featuretypes/xxx
>> - or, oh, just need to call DELETE ../stores/mystore/drop/xxx
>>
>> For the former, you need to update the documentation saying if you pass
>> this, then it does this, if you pass this other parameter then it does this
>> other thing. Beware to check all your scripts, because if you accidentally
>> leave this parameter on, your company database will be destroyed.
>> As opposed to not changing the current api and adding an endpoint that
>> says:
>> This will delete the database table. Fails in case there's an existing
>> FeatureType using it.
>>
>> My point is this deserves its own endpoint. Way more explicit and less
>> error prone than a new param to the current delete featuretype operation,
>> which would change its semantics so drastically.
>>
>> In any case, that's just my opinion.
>>
>> Cheers,
>> Gabe
>>
>> On Thu, Feb 15, 2024 at 1:18 PM Andrea Aime <
>> andrea.a...@geosolutionsgroup.com> wrote:
>>
>>> On Thu, Feb 15, 2024 at 3:00 PM Gabriel Roldan <gabriel.rol...@gmail.com>
>>> wrote:
>>>
>>>> I mean "increases complexity and difficults understanding", of course.
>>>>
>>>
>>> But parameters are already widely used in the GeoServer REST API, and
>>> "cascade" is used in other places as well.
>>> This would be breaking convention, making the API harder to use for
>>> those that are already used to it.
>>>
>>> Don't get me wrong, I know the GeoServer REST API is old and would need
>>> a rewrite, but that's the key point, a rewrite would be the time to break
>>> compatibility and adopt new ways of doing things.
>>>
>>>
>>>>
>>>>
>>>>> question arises of how to determine which table to delete once you
>>>>> deleted the FeatureType. I guess it should be an operation of the
>>>>> DataStore and not of the FeatureType, and use the FeatureType's
>>>>> nativeName to distinguish?
>>>>>
>>>>
>>> The REST API only returns configured feature types by default. There is
>>> (guess what?) a parameter in the "featuretypes" resource, called "list",
>>> that can take 4 different values:
>>>
>>>    - "configured" (default if not specified): only lists the configured
>>>    feature types (links to the feature type info resource)
>>>    - "available": returns the native feature types not yet configured
>>>    (mind, only their names)
>>>    - "available_with_geom": same as above, but only spatial ones
>>>    - "all": returns all of them, configured or available (again, just
>>>    names)
>>>
>>> The FeatureTypeController delete mapping already has a "recurse" flag to
>>> delete layers while the feature type is removed.
>>> Now here there is a risk of confusion between "recurse" and "cascade", a
>>> "removeData" flag would probably avoid confusion.
>>>
>>> Cheers
>>> Andrea
>>>
>>> ==
>>>
>>> GeoServer Professional Services from the experts!
>>>
>>> Visit http://bit.ly/gs-services-us for more information.
>>> ==
>>>
>>> Ing. Andrea Aime
>>> @geowolf
>>> Technical Lead
>>>
>>> GeoSolutions Group
>>> phone: +39 0584 962313
>>>
>>> fax:     +39 0584 1660272
>>>
>>> mob:   +39  339 8844549
>>>
>>> https://www.geosolutionsgroup.com/
>>>
>>> http://twitter.com/geosolutions_it
>>>
>>> -------------------------------------------------------
>>>
>>> With reference to the legislation on the processing of personal data
>>> (EU Reg. 2016/679 - General Data Protection Regulation “GDPR”), please
>>> note that every circumstance relating to this email (its content, any
>>> attachments, etc.) is data whose knowledge is reserved solely to the
>>> recipient(s) indicated by the sender. If you received this message by
>>> mistake, you are required to delete it; any other operation is unlawful.
>>> I would nonetheless be grateful if you could let me know.
>>>
>>> This email is intended only for the person or entity to which it is
>>> addressed and may contain information that is privileged, confidential or
>>> otherwise protected from disclosure. We remind that - as provided by
>>> European Regulation 2016/679 “GDPR” - copying, dissemination or use of this
>>> e-mail or the information herein by anyone other than the intended
>>> recipient is prohibited. If you have received this email by mistake, please
>>> notify us immediately by telephone or e-mail
>>> _______________________________________________
>>> Geoserver-users mailing list
>>>
>>> Please make sure you read the following two resources before posting to
>>> this list:
>>> - Earning your support instead of buying it, by Ian Turton:
>>> http://www.ianturton.com/talks/foss4g.html#/
>>> - The GeoServer user list posting guidelines:
>>> http://geoserver.org/comm/userlist-guidelines.html
>>>
>>> If you want to request a feature or an improvement, also see this:
>>> https://github.com/geoserver/geoserver/wiki/Successfully-requesting-and-integrating-new-features-and-improvements-in-GeoServer
>>>
>>>
>>> Geoserver-users@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/geoserver-users
>>>
>>
>

