Hi Ben,
sorry, hit the wrong button and did not CC the list in my previous mail...

Anyway, I think I have a working implementation:
https://github.com/geotools/geotools/pull/956

Unfortunately, I couldn't figure out a way to walk the dependency graph in
case of polymorphic feature chaining mappings
<http://docs.geoserver.org/latest/en/user/data/app-schema/polymorphism.html#polymorphism>...
I believe in that case it is impossible to determine the nested type
without evaluating the function expression in the linkElement tag, which
must be done on a per feature basis... something I absolutely don't want to
do.

Your feedback is highly appreciated :-)

On Wed, Sep 9, 2015 at 11:59 AM, Stefano Costa <
[email protected]> wrote:

> Hi Ben,
> thanks for the feedback. Please, see my comments inline.
>
> On Tue, Sep 8, 2015 at 11:03 PM, Ben Caradoc-Davies <[email protected]>
> wrote:
>
>> Stefano, this resource leak is certainly worth fixing. I have a few
>> thoughts:
>>
>> (1) app-schema supports cyclic references between features of different
>> type. For example, a feature of type A can have nested within it a feature
>> of type B which has nested within it the containing feature (there are
>> examples of this in GeoSciML). Some schemas may have cyclic types between
>> different features instances (I imagine something like an adjacentFeature
>> property). Or we can have A contains B contains C which contains A, and so
>> on. Any or all of these can be top-level accessible types, and several
>> could be non-feature types. The encoder sorts this out by encoding the
>> first instance by value and repeated instances by reference, to preserve
>> the uniqueness of gml:id (I do not know how deep across cycles filtering is
>> supported). Do cyclic type relationships change your implementation
>> strategy?
>>
>
> Good point: the algorithm I have in mind is similar to reference counting
> <https://en.wikipedia.org/wiki/Reference_counting>, but I'll introduce a
> variation to deal with reference cycles.
>
> However, I realized I'm reasoning too much in terms of dependencies
> between DataAccess instances; I should focus on dependencies between types
> instead. Reading the docs here
> <http://docs.geoserver.org/latest/en/user/data/app-schema/mapping-file.html#includedtypes-optional>
>  and
> here
> <http://docs.geoserver.org/latest/en/user/data/app-schema/feature-chaining.html#create-a-mapping-file-for-every-complex-type>,
> I was under the impression that the mapping file containing the definition
> of non-feature types had to be included by each and every data access
> instance that wanted to access them, and so I figured I'd only need to
> track included files per DataAccess. But this is not the case: as the
> example makes clear, the mapping file with the definition of a non-feature
> type (in the example, CGITermValue_MappingFile.xml) is included just once,
> in GeologicUnit_MappingFile.xml, and then it is referred to
> in GeologicEvent_MappingFile.xml, which does not include it.
>
> So, to make a long story short, for each DataAccess, I should follow
> feature chaining links to determine dependencies between DataAccesses and
> not just file inclusion.
> I will make sure to write unit tests to prove the code works as expected
> also in case of cyclic dependencies.
>
>
>> (2) The language in the code and manual is "non-feature" types. I wonder
>> if this covers all situations? Do you need a flag or can you use
>> instanceof? Should the DataAccessRegistry create and maintain a directed
>> dependency graph? It may not in general be a tree.
>>
>
> I think in this context "non-feature type" is equivalent to "whatever type
> is mapped in a separate mapping file with no associated datastore.xml file,
> which is then included in some other mapping file associated to a
> datastore.xml file" (a rather poor definition, couldn't come up with
> anything better :-) ).
>
> I will explore the possibility to use instanceof and get rid of the flag,
> but I'm not sure it will be enough: my understanding is that instanceof
> tests the actual "featuretypeness" of a type, regardless of where the type
> was defined, which is what I'm most interested in here.
>
> As for the need to maintain a dependency graph in DataAccessRegistry, I
> don't think it's necessary, as the information on dependencies is already
> available in the type mappings.
>
>
>
>>
>> (3) Can internal non-feature types be shared between accessible types?
>>
>
> Yes, they can. That's the reason why a non-accessible DataAccess is
> created for them and added to DataAccessRegistry.
>
>
>>
>> (4) DataAccessRegistry is an ugly, ugly hack. It contains a static
>> singleton, and exists because there is no GeoTools API by which a
>> DataAccess can discover other DataAccesses that it might need to respond to
>> queries. I could not think of a better way of doing it without adding a new
>> manager interface and modifying GeoServer to use it [1]. This was never a
>> problem before you added REST support, because all DataAccesses were
>> started and registered in a single GeoServer thread at startup time. Now
>> you have added REST support, and no good deed goes unpunished. Hopefully
>> catalog locking will protect against concurrent modification, but the
>> architecture of DataAccessRegistry would benefit from a redesign. In short,
>> please consider refactoring this class; I am also happy for you to adjust
>> it in a simple way like that you described because the cost of refactoring
>> may not be worth the benefit.
>>
>
> I'm afraid I won't have enough time (nor skills, probably) to refactor
> DataAccessRegistry, but I'll see what I can do.
>
>
>> Kind regards,
>> Ben.
>>
>
>> On 09/09/15 02:43, Stefano Costa wrote:
>>
>>> Dear developers,
>>> I have a workflow in which:
>>> 1. an app-schema mapping configuration involving multiple files is used
>>> to
>>> create an app-schema data store via REST as per this example:
>>>
>>> http://docs.geoserver.org/latest/en/user/rest/examples/curl.html#uploading-multiple-app-schema-mapping-files
>>> 2. the data store created in 1. is deleted and created again with a
>>> different mapping configuration (same mapped types, different attribute
>>> mappings)
>>>
>>> ----> a DataSourceException is thrown: "Duplicate mappingName or
>>> targetElement across FeatureTypeMapping instances detected..."
>>>
>>> I traced the problem to the fact that my configuration is mapping
>>> non-feature types separately
>>> <
>>> http://docs.geoserver.org/latest/en/user/data/app-schema/mapping-file.html#includedtypes-optional
>>> >:
>>>
>>> when
>>> the configuration is parsed, two distinct AppSchemaDataAccess instances
>>> are
>>> registered, one for the "real" datastore, another not directly accessible
>>> (does not appear in GetCapabilities nor GetFeature requests):
>>>
>>> https://github.com/geotools/geotools/blob/64bca8b6c87655b8f54c5e4b02b48e4ca371ffe7/modules/extension/app-schema/app-schema/src/main/java/org/geotools/data/complex/AppSchemaDataAccessFactory.java#L93
>>>
>>> ...but only one is removed from the data access registry on dispose:
>>>
>>> https://github.com/geotools/geotools/blob/64bca8b6c87655b8f54c5e4b02b48e4ca371ffe7/modules/extension/app-schema/app-schema/src/main/java/org/geotools/data/complex/AppSchemaDataAccess.java#L598
>>>
>>> https://github.com/geotools/geotools/blob/64bca8b6c87655b8f54c5e4b02b48e4ca371ffe7/modules/extension/app-schema/app-schema/src/main/java/org/geotools/data/complex/DataAccessRegistry.java#L309
>>>
>>> So, on the second creation attempt, non-feature types are still
>>> registered
>>> and a "duplicate mappingName or targetElement" exception is thrown.
>>>
>>> The issue does not occur if GeoServer is restarted after deleting the
>>> data
>>> store and before attempting to create it again, of course: that's likely
>>> the reason why this issue never came up before the introduction of the
>>> app-schema REST API, back in the days when app-schema data stores could
>>> only be configured manually and required a restart to be noticed anyway.
>>>
>>> To solve this inconvenience, I propose to modify AppSchemaDataAccess as
>>> such:
>>> - add a flag, called hidden or notAccessible, to mark AppSchemaDataAccess
>>> instances that should be automatically removed when no longer needed
>>> - add a couple of methods, addRelatedDataAccessInstance(DataAccess
>>> dataAccess) and Set<DataAccess> getRelatedDataAccessInstances(), to keep
>>> track of the related types needed by each "real" AppSchemaDataAccess
>>> instance
>>> - add some logic in DataAccessRegistry.unregister(DataAccess dataAccess)
>>> to
>>> automatically remove hidden or notAccessible AppSchemaDataAccess
>>> instances
>>> that are no longer referenced
>>>
>>> If you think my approach is sound, I can prepare a PR.
>>>
>>>
>>> Thanks for any feedback!
>>>
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>>
>>>
>>> _______________________________________________
>>> GeoTools-Devel mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/geotools-devel
>>>
>>>
>> --
>> Ben Caradoc-Davies <[email protected]>
>> Director
>> Transient Software Limited <http://transient.nz/>
>> New Zealand
>>
>
>
>
> --
>
> Best regards,
> Stefano Costa
>
> ==
> GeoServer Professional Services from the experts! Visithttp://goo.gl/it488V 
> for more information.
> ==
> Dott. Stefano Costa
> Senior Software Engineer
>
> GeoSolutions S.A.S.
> Via Poggio alle Viti 1187
> 55054  Massarosa (LU)
> Italy
> phone: +39 0584 962313
> fax:     +39 0584 1660272
> http://www.geo-solutions.ithttp://twitter.com/geosolutions_it
>
> -------------------------------------------------------
> AVVERTENZE AI SENSI DEL D.Lgs. 196/2003
> Le informazioni contenute in questo messaggio di posta elettronica e/o
> nel/i file/s allegato/i sono da considerarsi strettamente riservate.
> Il loro utilizzo è consentito esclusivamente al destinatario del
> messaggio, per le finalità indicate nel messaggio stesso. Qualora
> riceviate questo messaggio senza esserne il destinatario, Vi preghiamo
> cortesemente di darcene notizia via e-mail e di procedere alla
> distruzione del messaggio stesso, cancellandolo dal Vostro sistema.
> Conservare il messaggio stesso, divulgarlo anche in parte,
> distribuirlo ad altri soggetti, copiarlo, od utilizzarlo per finalità
> diverse, costituisce comportamento contrario ai principi dettati dal
> D.Lgs. 196/2003.
>
> The information in this message and/or attachments, is intended solely
> for the attention and use of the named addressee(s) and may be
> confidential or proprietary in nature or covered by the provisions of
> privacy act (Legislative Decree June, 30 2003, no.196 - Italy's New
> Data Protection Code).Any use not in accord with its purpose, any
> disclosure, reproduction, copying, distribution, or either
> dissemination, either whole or partial, is strictly forbidden except
> previous formal approval of the named addressee(s). If you are not the
> intended recipient, please contact immediately the sender by
> telephone, fax or e-mail and delete the information in this message
> that has been received in error. The sender does not give any warranty
> or accept liability as the content, accuracy or completeness of sent
> messages and accepts no responsibility  for changes made after they
> were sent or for other risks which arise as a result of e-mail
> transmission, viruses, etc.
>
>


-- 

Best regards,
Stefano Costa

==
GeoServer Professional Services from the experts!
Visithttp://goo.gl/it488V for more information.
==
Dott. Stefano Costa
Senior Software Engineer

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054  Massarosa (LU)
Italy
phone: +39 0584 962313
fax:     +39 0584 1660272
http://www.geo-solutions.ithttp://twitter.com/geosolutions_it

-------------------------------------------------------
AVVERTENZE AI SENSI DEL D.Lgs. 196/2003
Le informazioni contenute in questo messaggio di posta elettronica e/o
nel/i file/s allegato/i sono da considerarsi strettamente riservate.
Il loro utilizzo è consentito esclusivamente al destinatario del
messaggio, per le finalità indicate nel messaggio stesso. Qualora
riceviate questo messaggio senza esserne il destinatario, Vi preghiamo
cortesemente di darcene notizia via e-mail e di procedere alla
distruzione del messaggio stesso, cancellandolo dal Vostro sistema.
Conservare il messaggio stesso, divulgarlo anche in parte,
distribuirlo ad altri soggetti, copiarlo, od utilizzarlo per finalità
diverse, costituisce comportamento contrario ai principi dettati dal
D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely
for the attention and use of the named addressee(s) and may be
confidential or proprietary in nature or covered by the provisions of
privacy act (Legislative Decree June, 30 2003, no.196 - Italy's New
Data Protection Code).Any use not in accord with its purpose, any
disclosure, reproduction, copying, distribution, or either
dissemination, either whole or partial, is strictly forbidden except
previous formal approval of the named addressee(s). If you are not the
intended recipient, please contact immediately the sender by
telephone, fax or e-mail and delete the information in this message
that has been received in error. The sender does not give any warranty
or accept liability as the content, accuracy or completeness of sent
messages and accepts no responsibility  for changes made after they
were sent or for other risks which arise as a result of e-mail
transmission, viruses, etc.
------------------------------------------------------------------------------
Monitor Your Dynamic Infrastructure at Any Scale With Datadog!
Get real-time metrics from all of your servers, apps and tools
in one place.
SourceForge users - Click here to start your Free Trial of Datadog now!
http://pubads.g.doubleclick.net/gampad/clk?id=241902991&iu=/4140
_______________________________________________
GeoTools-Devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel

Reply via email to