Niels:

Thanks for the analysis I believe we are now working on the same problem:
clarifying api design

org.geosever.platform.resource.Files and
org.geoserver.platform.resource.Paths are not strictly a ResourceStore
utility class. These classes helped the GeoServer codebase adjust to having
ResourceStore be such a strict API.

Files and Paths utility classes work with absolute paths (because our code
base does).
ResourceStore does not, and I believe we should take steps to remove the
test case that assumes it is willing to. And take steps to provide a
warning if an linux absolute path is used (and the ResourceStore will
assume that such a path is intended to be relative to the data directory).
I have found several cases where such paths were used (notably in the rest
Resource Controller) which have been addressed.
--
Jody Garnett


On Aug 3, 2023 at 4:21:31 AM, Niels Charlier via Geoserver-devel <
geoserver-devel@lists.sourceforge.net> wrote:

> Jody,
>
> Note that the current javadoc of org.geoserver.platform.resource.Files is
> highly problematic.
>
> From the original javadoc we get:
>
> "Utility class for handling Resource paths in a consistent fashion.
>
> This utility class is primarily aimed at implementations of ResourceStore
> ...."
>
> This utility class was indeed intended for resource paths and the resource
> store, not file paths and the file system.
>
> "The base location is represented with "",  * relative paths are not
> supported.
>
>
> The sentence 'relative paths are not supported' was meant with respect to
> the resource store. The meaning here is that all paths in the resource
> store are absolute, with respect to the _REsource Store root_, not the file
> system root!
>
>
> And then comes the new sentence:
>
>
> " Absolute paths are supported, with Linux systems using a leading {@code
> /}, and windows using * {@code L:}."
>
> This sentence contradicts everything before it. If this is a resourcestore
> utility class,
>
> why are we suddenly talking about absolute paths in the operating system?
> The resource store has nothing to do with this.
>
> There never should have been anything inside this class
>
> testing on operating system. There should never have been the method
> called "isAbsolute". This should have been put inside another class that is
> used for file paths.
>
> This will incredibly confuse new developers. They will not at all
> understand the purpose of the
>
> resource store as an abstraction that should not be assumed to be part of
> the file system.
>
> Instead of clarifying things for developers, make sure they understand how
> to use it and not abuse
>
> it, it seems we are giving up on its intended purpose completely and just
> making the confusion
>
> worse by writing nonsensical javadoc about it. Unless this is rolled back,
> we may as well
>
> delete the 'jdbcstore' module and give up entirely on the resource API.
>
> If that is the chose direction, fine, it is not up to me, but then be
> clear about it.
>
> Kind Regards
>
> Niels
>
>
>
>
> On 03/08/2023 08:46, Niels Charlier via Geoserver-devel wrote:
>
> Jody,
>
> Perhaps I can offer a suggestion that may be considered a reasonable
> compromise.
>
> The "resource:" URLs are not used because they were never documented in
> user docs. (The javadoc however clearly states that it always was the
> intention to be used for access to the 'data directory' via URLs.) I can
> see that one of the issues you have been working to resolve is the usage of
> URLs without a prefix, which are just naked paths and where there is
> ambiguity in interpretation.
>
> The resourcestore is low level API only accessible to developers, not
> users. Users will always access it via some layer in between, like rest or
> something that calls Resources.fromURL. When a developer call
> resourcestore.get he knows he is not necessarily using the file system, and
> the paths should not be considered file system paths. With
> Resources.fromURL this is not so clear.
>
> My suggestion is that we bring back / leave the resource store API to how
> it was, being lenient on the leading slash because there is no possible
> confusion there, and it avoids a huge amount of work. We may however agree
> (and document this) that the Resources.fromURL, as well as REST or other
> services accessible to users use the following guidelines (and we document
> this for both developers as well as users):
>
> - file: URLS will use the file system
> - resource: URLs will use the resource store
> - naked path without URL prefix: absolute paths are interpreted as file:
> URLs, relative paths are interpreted as resource: URLs.
>
> This way the issue is addressed at the level where it actually occurs, and
> OS/filesystem stuff does not creep into the resource store API.
>
> Kind Regards
>
> Niels
> On 03/08/2023 01:49, Jody Garnett wrote:
>
> Niels:
>
> Comments inline
> --
> Jody Garnett
>
>
> On Aug 2, 2023 at 12:20:43 PM, Niels Charlier <ni...@scitus.be> wrote:
>
>> Hi Jody,
>>
>> I do support checking everything is handled correctly.
>>
>
> Thanks Niels, sorry I am a bit frustrated these days so am probably not
> explaining clearly.
>
> I just don't seem the point in breaking huge amounts of code for nothing.
>> And that is what will happen if we start throwing exceptions in the
>> resource store if the path starts with a "/". Which is actually normal,
>> because the resource store has a root.
>>
> ResourceStore no longer uses `/` as a root.
> The work done earlier this year removed the use leading /  from the
> codebase (including the REST API Controller).
>
>> Please: define "absolute path". Is this OS dependent? Should the
>> behaviour of the resource store be OS dependent? What /exactly/
>>
> I put the definitions into the document I shared and the javadocs (not
> sure why I cannot find the javadocs online?).
>
> When I say "path" I mean the strings used by the class
> Paths.isAbsolute(String)
>
>    -
>    
> https://github.com/geoserver/geoserver/blob/main/src/platform/src/main/java/org/geoserver/platform/resource/Paths.java#L550
>    - This has nothing to do with the ResourceStore API.
>
>
> This is covered in the developers guide here:
> https://docs.geoserver.org/latest/en/developer/programming-guide/config/resource.html#paths
>
> Paths are broken down into a sequence of names, as listed by
> Paths.names(path):
>
>    -
>
>    Path.names("data/tasmania/roads.shp") is represented as a list of data
>    , tasmania, roads.shp.
>    -
>
>    On linux Path.names("/src/gis/cadaster/district.geopkg") starts with a
>    marker to indicate an absolute path, resulting in /, src, gis, cadaster
>    , district.geopkg.
>    -
>
>    On windows Path.names("D:/gis/cadaster/district.geopkg") starts with a
>    marker to indicate an absolute path, resulting in D:/, gis, cadaster,
>    district.geopkg.
>
> are you suggesting should be allowed or not allowed to access a resource
>> from the resource store?
>>
> I am suggesting paths that return true for this method be not allowed to
> access resources from the resource store.
>
>> Yes, "paths" are used across the application. But "path" is a vague term,
>> 'xpaths' are paths too, they have no relationship with file paths. Resource
>> paths are no different, they live in their own world and are not part of
>> the file system. The notion of a path being 'absolute' or 'relative' has no
>> meaning when you are talking about paths in general, this is file system
>> terminology.
>>
> Earlier in this email thread I shared the developer guide documentation
> which was written to define what Paths mean for GeoServer.
>
>> When someone gets a resource directly from the resource store, they are
>> using the resource store as a black box, and the "path" is always relative
>> to the _root_ of the resource store.
>>
> Correct, relative paths are with respect to the resource store location.
> Actually when I did he QA I found that there was at one point a search path
> that could be scanned to allow "data directory" to be spread across several
> folders (terrifying).
>
>> Resource.fromURL is a different story, these are not hard-coded URL's,
>> they are user provided, and then we allow both the file system as the
>> resource store to be accessed, where file: is for the file system and
>> resource: is for the resource store.
>>
> I have not seen users use "resource:", to my knowledge that was only used
> internally in the logic between in geoserver and geotools ( applied by a
> style visitor to help geotools resolve icons in the data directory.)
>
> See the examples Resource.fromURL(base, url)
> <https://docs.geoserver.org/latest/en/developer/programming-guide/config/resource.html#resources-fromurl>
> :
>
>    -
>
>    Resources.fromURL( null, "/src/gis/cadaster/district.geopgk") -
>    absolute file path (linux)
>    -
>
>    Resources.fromURL( baseDirectory, "D:\\gis\\cadaster\\district.geopkg") -
>    absolute file path (windows)
>    -
>
>    Resources.fromURL( baseDirectory,
>    "file:///D:/gis/cadaster/district.geopkg"
>    <file:///D:/gis/cadaster/district.geopkg>) - absolute file url
>    (windows)
>    -
>
>    Resources.fromURL( baseDirectory, "
>    ftp://veftp.gsfc.nasa.gov/bluemarble/";) - null (external reference)
>
> See the examples Paths.convert(base, filename)
> <https://docs.geoserver.org/latest/en/developer/programming-guide/config/resource.html#paths-convert>
>
>    -
>
>    Paths.convert(base,file) - uses URI relativize method to determine
>    relative path (between file and base)
>    -
>
>    Paths.convert(base,folder, fileLocation) - can resolve relative
>    location, limited to content within the base directory
>    -
>
>    Paths.convert(base, filename)
>
>
> See examples Files.fromURL( base, url)
> <https://docs.geoserver.org/latest/en/developer/programming-guide/config/resource.html#files-url>
>
>
>    -
>
>    Files.fromURL( null, "resource:styles/logo.svg"
>    <resource:styles/logo.svg>) - internal url format restricted to data
>    directory content
>    -
>
>    Files.fromURL( null, "/src/gis/cadaster/district.geopgk") - absolute
>    file path (linux)
>    -
>
>    Files.fromURL( baseDirectory, "D:\\gis\\cadaster\\district.geopkg") -
>    absolute file path (windows)
>    -
>
>    Files.fromURL( baseDirectory, "file:///D:/gis/cadaster/district.geopkg"
>    <file:///D:/gis/cadaster/district.geopkg>) - absolute file url
>    (windows)
>    -
>
>    Files.fromURL( baseDirectory, "ftp://veftp.gsfc.nasa.gov/bluemarble/";) -
>    null (external reference ignored as we cannot determine a file)
>    -
>
>    Files.fromURL( baseDirectory, "sde://user:pass@server:port") - null
>    (custom strings are ignored as we cannot determine a file)
>
>
> Folks use Paths (in the sense defined above) here also; the is the
> motivation for fixing absolute paths on windows (as the definition had
> changed in Java 11) and where the inconsistency of use was found.
>
>> I just haven't seen any valid argument why the original design is flawed.
>>
> I do not believe the origional design was flawed, it was not consistently
> applied in the codebase. Where a path is used
>
>> Kind Regards
>>
>> Niels
>>
> Does it help that path and absolute path is defined strictly above?
>
> Thanks,
> Jody
>
>> On 02/08/2023 16:40, Jody Garnett wrote:
>>
>> Niels:
>>
>> We may be on the same page, you indicate ResoueceStore should never use
>> absolute paths for the file system.
>>
>> That is the decision I wish to make, and then we should go do the work to
>> enforce it.  How much work? I am not sure, hence the work party.
>>
>> There is something subtle here though. You are focused on the
>> ResourceStore API, and I am asking you to consider the application as a
>> whole.
>>
>> We do have paths in the application including absolute paths. The QA
>> activity to check that these are handled correctly is the work.
>>
>> Paths are not only for ResourceStore.
>>
>> Jody
>>
>>
>> On Tue, Aug 1, 2023 at 11:55 PM Niels Charlier <ni...@scitus.be> wrote:
>>
>>> Hello Jody,
>>>
>>> Unfortunately I am not really on board with your plan. I really don't
>>> think we need to make a new branch and do a lot of work. This seems like a
>>> lot of unnecessary work. I think the original design is fine, and does not
>>> require breaking the code base. We just need to go back to how things were
>>> and perhaps fix a few cases where people have gone wrong since then.
>>>
>>>
>>>    -
>>>
>>>    Gabe and Niels are having trouble with the API changes made earlier
>>>    this year
>>>    -
>>>
>>>       The definition of path was inconsistent for windows absolute paths
>>>
>>> I disagree. The definition of 'path' for the resource store is not in
>>> any way related to the OS, because the resource store lives in a separate
>>> world from the file system. The resource store and the file system have no
>>> joint concept of 'path'.
>>>
>>>    -
>>>
>>>    Early resource store work was not strict about use of paths, and
>>>    when to use Resource and when to use File
>>>
>>> I disagree. It was always very clear about this. There should obviously
>>> never be hard coded absolute paths for the file system. In general the code
>>> should basically always use resources. Only where users can use URLs they
>>> have access to both the file system and the resource store, this is always
>>> how it was intended to work, as can be read from the javadoc.
>>>
>>>    -
>>>
>>>    The email conversation is going in circles because something has to
>>>    break
>>>
>>>
>>>    -
>>>
>>>    We have to accept that something that break
>>>
>>> I disagree. I think nearly all of the code base is fine with how it was
>>> designed to work.
>>>
>>> Regards
>>>
>>> Niels
>>> On 01/08/2023 19:31, Jody Garnett wrote:
>>>
>>> Niel, Gabe:
>>>
>>> Our email thread on this topic has explored the topic by now, I think
>>> action rather than more communication is needed next.
>>>
>>> Can we setup a 1/2 day breakout work-party to get this activity underway
>>> (prior to the Bolsena code sprint and the next release cycle). Even a
>>> couple hours to get the topic started on a branch would be great.
>>>
>>> I am proposing two times below, and am willing to wake up early to see
>>> this topic move along:
>>>
>>>
>>>    1. August 8th
>>>    
>>> <https://www.timeanddate.com/worldclock/meetingdetails.html?year=2023&month=8&day=8&hour=13&min=0&sec=0&p1=256&p2=45&p3=48>
>>>  15
>>>    CEST / 11 BRT / 6 PDT
>>>    2. August 15th
>>>    
>>> <https://www.timeanddate.com/worldclock/meetingdetails.html?year=2023&month=8&day=15&hour=13&min=0&sec=0&p1=256&p2=45&p3=48>:
>>>    15 CEST / 11 BRT / 6 PDT
>>>
>>>
>>> Is this a good idea :)
>>> --
>>> Jody Garnett
>>>
>>> --
>> --
>> Jody Garnett
>>
>>
>
> _______________________________________________
> Geoserver-devel mailing 
> listGeoserver-devel@lists.sourceforge.nethttps://lists.sourceforge.net/lists/listinfo/geoserver-devel
>
> _______________________________________________
> Geoserver-devel mailing list
> Geoserver-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/geoserver-devel
>
_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Reply via email to