Hello Jody,

I feel you are not understanding me, it seems you are not really responding to any my points.

I disagree with your analysis. There is no need to break half to code base. In my opinion there is nothing wrong with the original idea and that is already how nearly all of the code base works. I think barely any work is necessary, if we stick with the original design. The original design is clear and makes a lot of sense.

What is your definition of an "absolute path"? Currently the method Paths.isAbsolute() is OS dependent. Should the syntax of the resource store paths be OS dependent, so that paths starting with a forward slash are allowed in windows but not in linux? That makes no sense to me, so please clarify.

I think there are two big confusions at the heart of this:

1) confusing paths with URLs. Only URLS support both the file system and the resource store at the same time. Apart from that the file system and the resource store live in separate worlds. So there is no joint concept of a "path". (they do not even necessarily have the same basis syntax: the resource store always uses forward slashes.) It is not true that we need to decide for a certain path whether it is for the file system or the resource store, based on it being relative or absolute. We only have to do this, and already do this, for URLs, using the the 'protocol' (resource: vs file:).

2) assuming that the resource store is part of the file system. As far the developer who uses the resource API is concerned, the resource store is not part of the file system. That is just one particular implementation.

The resource store literally has a "root". This root is commonly referred to as "/". It is very unnatural to start resource store paths immediately with a name, because the resource store is not relative to anything. It is its own thing and it has a root.

Regards

Niels



On 31/07/2023 16:12, Jody Garnett wrote:
Do I understand correctly,

You have made a long email about asking that an absolute path be treated as relative to data directory.

The problem we are faced with is that the code base has two inconsistent uses of absolute path.

No matter what decision made we are going to break half the code base.

With that in mind “causing problems all around” is expected.

I am also not sure if we are talking about the same thing exactly. If the documentation only paths (the Strings) are absolute.

When faced with a path we can clearly tell:
- relative: process as relative to the data directory use Resource API to access content. If needed resource.file() can be used to unpack File onto local file system. - absolute: process as File. If needed for a test case or something there is a wrapper that can pretend a random File location is a resource.

Jody


On Mon, Jul 31, 2023 at 4:23 AM Niels Charlier <ni...@scitus.be> wrote:

    Hello Jody,

    I have been looking into this, but I do not think that throwing an
    exception with absolute paths is a viable option. It is too
    problematic for backwards compatibility. For one it seems to
    completely break the "file chooser" in web-core, and it causes
    problems all around.

    The problem is the confusion of the term "absolute path" which can
    mean two things: absolute for the file system or absolute for the
    resource store's structure. The thing is, it has always been so
    that the resource store "root" has been treated as a linux style
    root: "/". It is normal to write ResourceStore.get("/a/b/c") or
    Resources.fromURL("resource:/a/b/c"). It is even much more natural
    than to write this as a relative path. That is because it actually
    is not relative but absolute: absolute for the resource store,
    separate from the file system.

    Also, checking for absolute paths would mean very different
    behaviour on windows than linux. On windows resource store style
    "/a/b/c/" paths would not be recognised as absolute and thus
    allowed, but they would be forbidden on linux. But the syntax of
    resource store paths is always the same, and is not dependent on
    the syntax of file system paths.

    This is exactly what is so problematic with the
    'theoryRootIsAbsoluteTest': it confirms whether a resource store
    path is absolute for the operating system file system. But as I
    said, the resource store path syntax does not depend on the
    operating system! So this is essentially a test that forces the
    resource store to be the file system, and not a custom implementation.

    We need to get rid of this test and get back to the old test that
    confirms that "/a/b/c" and "a/b/c" are one and the same thing,
    since the resource store has no concept of either a "current
    directory" or a file system root. It has its own root and it
    always starts from there, for any path submitted to it. This is
    very clear and lenient at the same time.

    Kind Regards

    Niels

    On 21/07/2023 16:06, Jody Garnett wrote:
    I think we make the change, so resource store fails with absolute
    path, and learn what other areas of the code assume resource
    store can access any location on disk.

    Then we will know how widespread this problems is.

    I am happy to split up the work, and we can add to the
    documentation a very clear sentence about resource store not
    supporting absolute paths.

    Jody

    On Fri, Jul 21, 2023 at 4:01 AM Niels Charlier <ni...@scitus.be>
    wrote:

        Hey Jody,

        How do we proceed practically?

        I am happy to make a PR to fix resource store and make the
        agreed upon core changes.

        Will you change the docs or would you like this to be part of
        my work?

        Kind Regards

        Niels

        On 13/07/2023 16:19, Jody Garnett wrote:
        I already tried to catch a lot of those when adjusting for
        the windows absolute paths.

        We could add some logic and provide a warning if any
        absolute paths get through? And the call
        Resources.fromURL(base, url) …

        Still we may be getting into a technical detail here.  What
        needs to be changed or clarified; then we can adjust the
        code to march.
        On Thu, Jul 13, 2023 at 1:42 AM Niels Charlier
        <ni...@scitus.be> wrote:

            In theory I agree but there is always a risk to changing
            lenient behavior to throwing exceptions, if there is
            code or production configs that have relied on paths
            starting with '/' just passing as relative...

            On 11/07/2023 22:27, Jody Garnett wrote:
            Thanks:

            Reading the docs, it is /okay/ but not /great/.

            I think we we made it so the ResourceStore
            implementations fail when provided an absolute path it
            would be best; and remove the test case that is causing
            problem.

            That way only Resources.fromURL(base, url) would
            support "absolute" paths and life would be more clear?
            --
            Jody Garnett


            On Jul 9, 2023 at 10:53:22 PM, Niels Charlier
            <ni...@scitus.be> wrote:

            Hello Jody,

            My comment about deprecating the file() was in
            response to what Gabriel said about that.

            Resource: urls are indeed for instance supported by
            SLD, if you want to link to a resource inside your
            data directory / resource store. But I think they can
            be used about anywhere where file: or other URLs are
            supported. As the javadoc for Resources.fromURL says:
            relative file: URLs are deprecated, use resource:
            instead. I don't know why they were never documented.

            Kind Regards

            Niels

            On 07/07/2023 16:48, Jody Garnett wrote:
            Niels,

            I do not understand the comment about deprecating the
            file() method? I agree with everything you said about
            how it is used and useful.

            I am not sure when resource URLs were added, I
            thought they were only added to pass information over
            to geotools side of things for SLD validation and did
            not expect to see them used directly in any user
            configuration. I guess we could talk resource: URL
            here in the developers docs explicitly. Do we know
            who wrote that (I hope it was not me).

            Still let's take this discussion and update the docs
            and API.
            --
            Jody Garnett


            On Jul 5, 2023 at 11:48:50 PM, Niels Charlier
            <ni...@scitus.be> wrote:

            Hello Gabriel and Jody,

            I agree with most of what Gabriel says, it is mainly
            the same point as mine except that I think it is
            unrealistic to deprecate the file() method. First of
            all we would need to to move the resource API to
            geotools because SLD files are stored in the
            resource store, but there are still tertiary tools
            that require files like template and image
            processors. The idea is that the file created on
            disk is only a cache for these tertiary tools and
            manipulating it will have no effect.

            Jody, most of the docs seem fine, except that the
            second blue box explanation for 'path' is confusing
            paths with URLs: 'file:/' URLs are not part of the
            resourcestore and resource rest API, and the other
            problems is that docs are missing any mention of the
            "resource:" URLs.  We must first of all distinguish
            between (1) resourcestore API and (2)
            Resources.fromURL method, I think there might be
            some confusion there. Then the distinction between
            file: and resource: urls need to be in the docs.

            The (1) Resourcestore does not deal with URLs or
            absolute paths, only relative paths. Internally,
            geoserver modules load their configuration files in
            the resource store directly from this API.

            The (2) Resources.fromURL deals with URLS and can
            handle both absolute file paths and resources from
            the resourcestore using the 'resource:' protocol
            prefix. This is for external use: the purpose is so
            that people can refer to both files on disk and
            resources in the resource store wherever they can
            specify a URL, for instance inside style files or in
            the parameters of a data store. File: URLs is for
            the entire file system, not the data directory and
            get 'converted' to a resource using the
            Files.asResource wrapping method. Resource: URLs
            refer to anything in the data directory or
            alternative resource store.

            As legacy from pre-resourcestore times, geoserver
            supports (supported?) 'file:' URLs relative to the
            data directory (this is a little bit 'dodgy' since
            'file:' urls are standardized and do not paths
            without leading slash -
            https://en.wikipedia.org/wiki/File_URI_scheme ).
            Note that the Resources.fromURL javadoc says this is
            deprecated and should be replaced with resource: URLs.

            Kind Regards

            Niels

            On 05/07/2023 15:06, Jody Garnett wrote:
            Hey folks, I was tired and not in position to act
            clearly when these gaps were noticed last year.

            We can tighten up the api definition, at least with
            the changes made we can now notice when an absolute
            path was used.

            I never quite managed to communicate the separation
            between using a file URL and File access (when
            DataStores wish to access local files) as distinct
            from use of a resource for managing configuration
            files.

             I suspect you both (Niels and Gabe) have met this
            distinction first hand - but it is hard to explain
            the value.

            Documentation is as added here:
            
https://docs.geoserver.org/latest/en/developer/programming-guide/config/resource.html

            This could your review and input.

            Jody

            On Wed, Jul 5, 2023 at 2:34 PM Gabriel Roldan
            <gabriel.rol...@gmail.com> wrote:

                If "the codebase had drifted away from the
                intended use over time" I think it's even more
                important to stick to the contract and not the
                other way around.
                As far as I can see, there are two
                abstractions, ResourceStore and Resource, the
                former clearly says
                "This abstraction assumes a unix like file
                system, all paths are relative and use forward
                slash  {@code /} as the separator",
                the latter "Resource used for configuration
                storage.".
                Only by adhering to this contract, we could
                provide alternative implementations.
                So if the changes were made to accommodate
                "common usage" on the specific case of a
                filesystem-based ResourceStore
                implementation, it should be the other way
                around, to find out usages that don't adhere to
                the spec and fix them.

                Concretely, I _think_ the only place where
                absolute URI's would be requested to the
                ResourceStore, is where a data, not configuration,
                file, is resolved, expecting the ResourceStore
                to be smart and resolve "file:data/roads.shp"
                relative to the datadir, and "file:/data/roads.shp"
                relative to the file system.
                Now, in doing so, we're asking the
                ResourceStore to do what it's not intended to.
                The test case Niels mentioned used to check
                that a leading "/"
                had no meaning for the resource store (i.e.
                /data/roads.shp == data/roads.shp), and that
                was changed to mean the opposite.
                So, IMHO, the responsibility of resolving files
                outside the datadir shouldn't be on
                ResourceStore, but on the client code.
                Something like:

                String path = ...
                File shp;
                if(Paths.isAbsolute(path))
                  shp = new File(path);
                else
                 shp = resourceStore.get(path).file();

                As a matter of fact,
                Resource.file():java.io.File should be
                deprecated and eventually removed. Resource is
                for config contents and
                has Resource.in():InputStream and
                Resource.out():OutputStream.

                It is hard enough already to adhere to lax
                interface contracts (catalog's default
                workspace hello) as to keep on making it more
                and more difficult.

                So what do we do? can we get to an agreement
                that the contract in the interfaces is
                mandatory and work from there?

                cheers,
                On Tue, 4 Jul 2023 at 17:49, Niels Charlier via
                Geoserver-devel
                <geoserver-devel@lists.sourceforge.net> wrote:

                    Hello Jody,

                    Yes, I admit it's my own fault for missing
                    this discussion at the time.

                    I think it would be a shame to let the
                    codebase further drift away from the
                    intended use of the resource store. We have
                    done the work to make the entirety of
                    geoserver generic with respect to the
                    implementation of the data directory and it
                    is only a minimal effort to keep it this
                    way. Even though jdbc store is still a
                    community module and the Redis based
                    geoserver project has been discontinued, it
                    has been proven that alternative
                    implementations for the data directory work
                    and the jdbc store module is  actually
                    being used in production successfully.

                    Now, interestingly, I have discovered that
                    there are contradictions in how it works /
                    is documented now.

                    - I discovered that while
                    'theoryRootIsAbsolute' test is successful,
                    it is actually very misleading. At least on
                    linux, paths starting with '/' *still
                    create a file relative to the data
                    directory*!. The only difference is the
                    string returned by the path() method. So
                    while the path might seem absolute, the
                    file you are accessing is not. So the
                    path() method is misleading and in reality
                    the leading slash is still being ignored
                    (again, at least on linux).

                    - Note that the javadoc of ResourceStore
                    still says:

                    This abstraction assumes a unix like file
                    system, *all paths are relative*

                    There it says it: all paths are relative
                    for the resourcestore. Absolute paths have
                    no meaning or function when it comes to the
                    resource store.

                    - I think this contradiction could be
                    resolved in two possible ways:

                    1) the test 'theoryIsRootSlashIsIgnored'
                    should come back instead of
                    'theoryRootIsAbsolute'. The root slash is
                    ignored and removed from the path.

                    2) The resource store throws an exception
                    when you ask for an absolute path.

                    Either way, all absolute paths should be
                    handled _outside of_ the ResourceStore, for
                    instance by calling Files.asResource(). So
                    problems with absolute paths in the rest
                    service should be resolved in the rest
                    service, or some other layer that is used
                    by the rest service.

                    Kind Regards
                    Niels

                    On 04/07/2023 21:59, Jody Garnett wrote:
                    Hello Niels,

                    The PRs for this activity contain
                    extensive discussion.

                    The fundamental issue was the handling of
                    absolute paths which was done differently
                    by different sections of code.
                    Specifically we found that the REST API
                    endpoint was allowing paths *//data* and
                    */data *to both reference content in the
                    data directory, rather than treating the
                    first one as an absolute path. In response
                    we tightened up the javadocs and added
                    test cases including the one you mentioned.

                    I agree that this goes against the goal of
                    resource store, but the codebase had
                    drifted away from the intended use over time.

                    Now that you are present in the discussion
                    it would be a good opportunity to discuss
                    this with the parties involved.
                    --
                    Jody Garnett


                    On Jul 3, 2023 at 2:50:02 PM, Niels
                    Charlier <ni...@scitus.be> wrote:

                    Hello Jody and others,

                    I am having trouble understanding the
                    changes that were made about 6 months ago
                    to the ResourceStore's expected behaviour.

                    In particular, in the class
                    'org.geoserver.platform.resource.ResourceTheoryTest',
                    the unit test 'theoryRootSlashIsIgnored'
                    was replaced by 'theoryRootIsAbsolute'. I
                    cannot make sense out of this theory test
                    at all.


                     This seems to be entirely contradictory
                    to the whole reason that the
                    ResourceStore API was created, that is to
                    make an abstraction of the /Data
                    Directory/, so that it can be replaced by
                    something else (such as jdbc store or
                    other implementations that have been
                    made). There was already support for
                    absolute file paths in all circumstances
                    by using "file:" URLs. This will bypass
                    the resource store and call
                    Files.asResource instead. But resource:
                    URLs are for the data directory or
                    alternative resource store only. How does
                    it make sense to get absolute paths from
                    the resource store?


                    In order to make jdbc-config pass the
                    tests, I will have to turn off this
                    particular method. But why should the
                    test even be there if the file resource
                    store is the only one that could ever
                    support it? Programmers and users will
                    rely on this behaviour and support for
                    all alternative implementations of
                    ResourceStore will be broken. In this
                    case we may as well do away with the API
                    and just use the file system directly again.


                    Kind Regards

                    Niels

                    _______________________________________________
                    Geoserver-devel mailing list
                    Geoserver-devel@lists.sourceforge.net
                    https://lists.sourceforge.net/lists/listinfo/geoserver-devel




-- Gabriel Roldán

-- --
            Jody Garnett

-- --
        Jody Garnett

-- --
    Jody Garnett

--
--
Jody Garnett
_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Reply via email to