Re: [fcrepo-user] Cmodel discovery?

Scott Prater Mon, 01 Nov 2010 15:53:46 -0700

I think there's a tension here between what Fedora *can* do and what the 
vast majority of users *expect* Fedora to do. As an expert user and 
developer of Fedora, you don't want your hands tied by constraining 
validation logic that is hardwired into the application; on the other 
hand, as a relatively new user, or as someone who is happy with the way 
Fedora runs out of the box, you don't want to spend a lot of time 
puzzling through its underlying conceptual framework in order to build 
validation tools for your default environment.


So it sounds like loosely-coupled validation toolkits that can be used 
asynchronously to validate and report on the current state of relations 
among objects in the repository (as opposed to gatekeepers that restrict 
objects being ingested, removed, or updated) might be the way to go... a 
passive, asynchronous validation mechanism that reports, rather than a 
proactive, real-time validation mechanism that restricts how repository 
state changes occur.

I already have a script that walks the resource index to verify the 
parent/child object relations of composite objects (constructed using 
the atomistic approach); it verifies that the relations I've 
defined as necessary for parent/child objects exist, and reports if 
there are any discrepancies.  Something like that could be easily 
extended to include checking the relations of the default CMA 
environment.  Would this perhaps be a good tool to include in the next 
generation of Fedora command line tools?
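
A minimal sketch of that kind of passive check, in Python. It is not the 
actual script: the triples are a plain in-memory list standing in for 
whatever the resource index returns, and the predicate names (HAS_PART, 
IS_PART_OF) are placeholders for the relations you consider mandatory.

```python
# Hypothetical predicate names; substitute the relations your objects use.
HAS_PART = "rel:hasPart"
IS_PART_OF = "rel:isPartOf"


def find_discrepancies(triples):
    """Report parent/child links whose inverse relation is missing.

    `triples` is an iterable of (subject, predicate, object) tuples,
    assumed to have been fetched from the resource index beforehand.
    Nothing is modified; the function only reports.
    """
    known = set(triples)
    reports = []
    for subject, predicate, obj in sorted(known):
        if predicate == HAS_PART and (obj, IS_PART_OF, subject) not in known:
            reports.append(f"{obj} is missing isPartOf -> {subject}")
        elif predicate == IS_PART_OF and (obj, HAS_PART, subject) not in known:
            reports.append(f"{obj} is missing hasPart -> {subject}")
    return reports


triples = [
    ("demo:parent", HAS_PART, "demo:child1"),
    ("demo:child1", IS_PART_OF, "demo:parent"),
    ("demo:parent", HAS_PART, "demo:child2"),  # child2 never points back
]
for line in find_discrepancies(triples):
    print(line)
```

Because it only reads and reports, a check like this can run at any time 
after ingest without constraining the order in which objects arrive.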

  -- Scott

Benjamin Armintor wrote:
> Just a quick response:  Requiring a tightly validating ingest seems
> like a bad idea to me.  A lot of the way one participates in a
> conversation like this depends on your conception of type in Fedora.
> I prefer to think of it as mixed and inferential:  When a content
> model is being used for validation (I think by default it's really
> just documentation), the model of the content model would indicate the
> way that the validation should go forward.  There's currently only one
> model for content models, but that's not necessarily so.
> 
> Likewise, the other function of content models - the location of
> service deployments - depends only on the presence of an appropriate
> arc:
> A has Model B
> for all C such that C isContractorOf B, A has the services defined by C.
> 
> This has little to do with the particulars of type of B (or C, for
> that matter), and loading too much specificity of type will introduce
> some constraints on the way that other parts of the CMA (like
> services) can go forward.
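
The arc rule above can be sketched in a few lines of Python. This is only 
an illustration, not Fedora's implementation: the triples are an in-memory 
list, and the short predicate strings are stand-ins for the hasModel and 
isContractorOf relationships named in the rule.

```python
# Shorthand stand-ins for the two relationships in the rule above.
HAS_MODEL = "fedora-model:hasModel"
IS_CONTRACTOR_OF = "fedora-model:isContractorOf"


def services_for(obj, triples):
    """Return every C such that obj hasModel B and C isContractorOf B.

    Only the presence of the two arcs is examined; the type of B or C
    is never inspected.
    """
    models = {o for s, p, o in triples if s == obj and p == HAS_MODEL}
    return sorted(
        s for s, p, o in triples if p == IS_CONTRACTOR_OF and o in models
    )


triples = [
    ("demo:obj1", HAS_MODEL, "demo:cmodel"),
    ("demo:sdep1", IS_CONTRACTOR_OF, "demo:cmodel"),
    ("demo:sdep2", IS_CONTRACTOR_OF, "demo:other"),
]
print(services_for("demo:obj1", triples))
```

Note that nothing here asks whether demo:cmodel declares itself to be a 
content model, which is the point of the paragraph above.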
> 
> So I wouldn't argue against an optional flag (I also probably wouldn't
> use it), but I'd be very wary of requiring a particular model (or list
> of models).  I'd prefer content-model driven type validation to be a
> runtime (post-ingest) affair, especially in the absence of transaction
> machinery.
> 
> - Ben
> 
> 
> On 10/30/10, Bill Parod <bill-pa...@northwestern.edu> wrote:
>> Steve,
>>
>> I like this idea.
>>
>> Configurable (default - soft/warning and optional - hard/failure) levels of
>> compliance, checked on ingest/update will allow those of us who might still
>> have cmodels without the appropriate hasModel assertions to discover that
>> fact gracefully. Would checking on access also be a feasible and appropriate
>> configuration option?
>>
>> I wonder too if validation checks could be invoked asynchronously as part of
>> a data integrity / preservation utility or suite of utilities. Having such
>> validation mechanisms available could encourage a broader suite of
>> preservation services facilitating audit/certification, such as in TRAC. I
>> imagine this could work really well with the Spring
>> refactoring/decomposition that's been discussed.
>>
>> - Bill
>>
>> On Oct 30, 2010, at 3:20 AM, Steve Bayliss wrote:
>>
>>> Some very interesting discussions.  After digesting and thinking some more,
>>> my views are:
>>>
>>>> (a) when ingesting content model objects, should we enforce a RELS-EXT
>>>> assertion to a valid content model for content model objects? or
>>>
>>> This is difficult, conceptually, I think.  We only know that it is a content
>>> model object by presence of the hasModel assertion to the
>>> <info:fedora/fedora-system:ContentModel-3.0> object, so can't directly
>>> validate for presence of that relationship.  We should not infer that it is
>>> a content model by any other means - for instance the presence of a
>>> content-model-reserved datastream such as DS-COMPOSITE-MODEL - as,
>>> conceptually at least, the "reserved" status (or interpretation) of that
>>> datastream is only by virtue of it being a content model object - there
>>> should be nothing that prevents data objects for instance having a
>>> datastream of that name.
>>> (Furthermore, conceptually at least -- and certainly not currently
>>> implemented that way in code -- the interpretation of any "reserved"
>>> datastream should be through content models; for instance DC, RELS-EXT and
>>> RELS-INT *should* be interpreted as reserved by dint of the object
>>> belonging to the default data object content model).
>>>
>>> In short, I think that any interpretation of the type/kind of object should
>>> be through explicit typing of the object through a hasModel relationship.
>>>
>>> However there is probably some useful validation that maybe could be done
>>> through ECM.  For instance validating that the target of a data object's
>>> hasModel relationship itself asserts membership of a content model content
>>> model, similarly for the other CMA relationships.  So this particular
>>> validation would in fact be validation on ingest of a data object -
>>> validation that the network of relationships associated with the data
>>> object is correct.
>>>
>>> I like the idea of configurable levels of validation, ie validation through
>>> an explicit call, "soft" (ie warnings only) validation on ingest, hard
>>> validation (error) on ingest, maybe configurable levels of what to validate.
>>> The default behaviour should be as it is at the moment (therefore not
>>> imposing any restrictions on order of ingest).
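
The configurable levels described above, together with the check that a 
hasModel target itself asserts the content model content model, might look 
something like the following Python sketch. It is not an existing Fedora 
or ECM API: the Level enum, check_models, ValidationError, and the 
shorthand HAS_MODEL predicate are all invented for illustration. Only the 
ContentModel-3.0 URI comes from the discussion.

```python
from enum import Enum


class Level(Enum):
    OFF = "off"      # current behaviour: no checks, any ingest order
    WARN = "warn"    # "soft" validation: report problems only
    ERROR = "error"  # "hard" validation: refuse the operation


CMODEL_CMODEL = "info:fedora/fedora-system:ContentModel-3.0"
HAS_MODEL = "fedora-model:hasModel"  # shorthand stand-in for the predicate


class ValidationError(Exception):
    """Raised only when the level is ERROR and a problem is found."""


def check_models(obj, triples, level=Level.OFF):
    """Check that each hasModel target of obj is itself typed as a cmodel."""
    if level is Level.OFF:
        return []
    known = set(triples)
    problems = [
        f"{model} (model of {obj}) does not assert hasModel {CMODEL_CMODEL}"
        for s, p, model in sorted(known)
        if s == obj
        and p == HAS_MODEL
        and (model, HAS_MODEL, CMODEL_CMODEL) not in known
    ]
    if problems and level is Level.ERROR:
        raise ValidationError("; ".join(problems))
    return problems


triples = [("demo:obj", HAS_MODEL, "demo:cm1")]  # cm1 is not typed
for warning in check_models("demo:obj", triples, Level.WARN):
    print("WARNING:", warning)
```

With the default level nothing is checked, so existing ingest workflows 
behave exactly as they do today.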
>>>
>>>> (b) should we create a Resource Index triple identifying the
>>>> fedora-system:ContentModel-3.0 as a default for content model objects when
>>>> none is specified in RELS-EXT?
>>>
>>> As for (a), if the only way of identifying that an object is a content model
>>> object is through its hasModel assertion, then this is conceptually
>>> difficult.
>>>
>>>> (c) should we stop CMA features working (eg the dissemination execution)
>>>> if the object identified as the content model does not itself identify
>>>> through RELS-EXT that it is a content model object?
>>>
>>> Probably.  A question as to when - should we explicitly introduce additional
>>> checks now, or should we leave this to when they are required by other
>>> features (for instance, introducing different types of system content
>>> models, allowing different ways of describing objects and the services
>>> available to them - if this was done then presumably it would be necessary
>>> for the hasModel assertion to be present in order to correctly interpret the
>>> cmodel/sdef/sdep objects)?
>>>
>>> The downside of doing this sooner rather than later is that it could break
>>> existing, working, content models that don't make the hasModel assertion.
>>> (However it could be argued that it is a bug that these content models just
>>> happen to work at the moment.)
>>>
>>> Taking on board the thoughts on levels of validation, that could be applied
>>> here also: the implementation of this could consist of default "do nothing"
>>> behaviour, and configurable settings that alternatively generate warnings or
>>> errors when the missing assertions are discovered as part of service
>>> execution.  (Possibly the default should be warnings, to alert folks, as the
>>> hasModel assertion may be required in the future).
>>>
>>> Steve
>>>
>>>
>>>> -----Original Message-----
>>>> From: aj...@virginia.edu [mailto:aj...@virginia.edu]
>>>> Sent: 29 October 2010 23:12
>>>> To: Support and info exchange list for Fedora users.
>>>> Subject: Re: [fcrepo-user] Cmodel discovery?
>>>>
>>>>
>>>> Aaron--
>>>>
>>>> You don't have to convince me of the dangers. I have ugly
>>>> memories of watching Ross Wayland and Thorny Staples being
>>>> chafed by the straightjacket of strong integrity for
>>>> objects/bdefs/bmechs that was baked into the 2.x series. {grin}
>>>>
>>>> With time, though, I've forgotten the grim images enough to
>>>> reprice in my mind the guarantees that such strong
>>>> constraints buy. But that's only my situation, and I accept
>>>> your narrative and its import and the cautions they imply.
>>>>
>>>> Let me suggest the existence of a set of categories of
>>>> repository behavior desirable by different users, categories
>>>> that might be directly connected with scale. The cost of
>>>> correcting a failure that is publicly visible (e.g. a
>>>> dissemination that fails) is often in a direct relationship
>>>> with some function of the size of the repository. It may be
>>>> that such a cost is very high to some users (who would then
>>>> prefer to work in that comfortable, stylish, and supportive
>>>> straightjacket) but a marginal and uninteresting item to
>>>> others (who find it confining).
>>>>
>>>> I'm impressed by Scott's suggestion of parameterized
>>>> validation behavior, and I wonder if we could imagine
>>>> partitioning his proposed CMA validation flag further into a
>>>> module or service to include some of the options we've been
>>>> discussing, and perhaps others. Could we leverage some of the
>>>> enhanced content model validation functionality to support a
>>>> range of sizes of straightjacket? {grin}
>>>>
>>>> While I understand and accept that forcing every user to
>>>> construct workflows that fulfill these kinds of integrity
>>>> would be wrong, I believe that there are enough users who
>>>> would really benefit from the workflow feedback and long-term
>>>> stability that such integrities provide that it's worth
>>>> considering such provision.
>>>>
>>>> ---
>>>> A. Soroka
>>>> Digital Research and Scholarship R & D and Online Library Environment
>>>> the University of Virginia Library
>>>>
>>>>
>>>>
>>>>
>>>> On Oct 29, 2010, at 4:32 PM, Aaron Birkland wrote:
>>>>
>>>>>> The CMA is such a core part of the repository architecture that I think
>>>>>> a situation in which the repository can be said to be working but the
>>>>>> CMA can't be is a bad situation to enable.
>>>>> Ah, I see your perspective.  "working" is a bit of a sticky point here.
>>>>> In conceptualizing the CMA, here were a few thoughts or principles that
>>>>> motivated its design:
>>>>>
>>>>> - Users may choose to ignore the CMA - simply preserving and providing
>>>>> access to objects without an explicit model is a valid use case.
>>>>>
>>>>> - The core repository is not concerned with referential integrity of
>>>>> RELS-EXT relationships.
>>>>>
>>>>> - There shall not be a prescribed order in which objects must be
>>>>> ingested into the repository.
>>>>>
>>>>> - Service binding will occur dynamically.  If this cannot happen for
>>>>> some reason (missing objects, relationships, etc), then a runtime error
>>>>> is reported.
>>>>>
>>>>> These thoughts were partly in response to problems encountered with the
>>>>> precursor to the CMA.  In particular, the precursor *did* enforce a kind
>>>>> of referential integrity - and this turned out to be a bit of a sore
>>>>> point.  In response, there was a trend to more lightweight and dynamic
>>>>> behaviour.
>>>>>
>>>>> So, in other words, it was intentional that the core repository would be
>>>>> capable of ingesting and preserving objects that don't yet contribute to
>>>>> a functioning CMA behaviour.  I think it was viewed that higher-level
>>>>> validation, correction, etc could occur later (if at all), or as part of
>>>>> some other functionality layered on top of the basic core and enabled
>>>>> separately.
>>>>>
>>>>> Perhaps somebody can give more on the history and motivation.  It could
>>>>> be worth revisiting if the resulting behaviours are seen as generally
>>>>> unintuitive.
>>>>>
>>>>>> It's not obvious to me how feedback could be supplied to avoid that, but
>>>>>> perhaps the ingest method could continue without the additional
>>>>>> validation of a) but could provide a response notation like "Ingested
>>>>>> with PID", "Ingested as content model with PID", "Ingested as service
>>>>>> definition with PID", etc. for the system-defined models? It could even
>>>>>> be extended to user-defined models, which could provide valuable
>>>>>> feedback in a workflow. But admittedly, that would induce more
>>>>>> complexity and even if carried out cleverly, might break older fragile
>>>>>> installed workflows...
>>>>> That is an interesting line of thought.
>>>>>
>>>>> -Aaron
>>>>>
>>>>>
>>>>>
>>>> --------------------------------------------------------------
>>>> ----------------
>>>>> Nokia and AT&T present the 2010 Calling All
>>>> Innovators-North America contest
>>>>> Create new apps & games for the Nokia N8 for consumers in
>>>> U.S. and Canada
>>>>> $10 million total in prizes - $4M cash, 500 devices, nearly
>>>> $6M in marketing
>>>>> Develop with Nokia Qt SDK, Web Runtime, or Java and Publish
>>>> to Ovi Store
>>>>> http://p.sf.net/sfu/nokia-dev2dev
>>>>> _______________________________________________
>>>>> Fedora-commons-users mailing list
>>>>> Fedora-commons-users@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>>>>
>>>>
>>>
>> Bill Parod
>> Library Technology Division - Enterprise Systems
>> Northwestern University Library
>> bill-pa...@northwestern.edu
>> 847 491 5368
>>
>>
>>
>>
> 


-- 
Scott Prater
Library, Instructional, and Research Applications (LIRA)
Division of Information Technology (DoIT)
University of Wisconsin - Madison
pra...@wisc.edu

