On Mar 5, 2008, at 4:36 PM, Stefano Mazzocchi wrote:

> Mark Diggory wrote:

>> Do I need to have all equivalency statements present at the same time
>> as my statements that are being loaded?
>
> No, equivalences can be loaded at any time, before or after the actual
> statements. There is a performance penalty because of this, but I
> thought that otherwise one would just smoosh the data before  
> loading it.
>>
>> In other words, the Banach Sail cannot post-process existing stored
>> Sesame content upon receiving a new equivalency. Correct?
>
> I'm not sure what you mean by "post-process".

I think I've just repeated myself. Your previous answer holds: I can
add the equivalencies after I've added the original statements.

>
>> ...
>> that in "off the shelf" Longwell, to attain this sort of mapping
>> capability, the entire rdf dataset + equivalencies will still need
>> reloading for the equivalences to be in effect in the resulting
>> stored statements?
>
> The Banach smoosher was designed so that you could throw
> equivalences at Longwell at any time and it would deal with them. For
> example, you could add the "hal abelson" equivalence after having
> loaded all the data in Longwell and the UI would change accordingly.

OK, clearly I misinterpreted the capability; I understand now what it
can do. I'm most grateful.
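To make sure I have the model right, here is a minimal sketch of how I
picture the smooshing (plain Python with a union-find map; all names
are my own invention, not the Banach or Sesame APIs): equivalences form
classes, each stored triple's subject and object are rewritten to a
canonical member of their class, and the store converges whether the
equivalence arrives before or after the statements.

```python
# Hypothetical sketch of "smooshing": rewrite URIs to a canonical
# representative of their equivalence class. Not the Banach API.

class Smoosher:
    def __init__(self):
        self.parent = {}          # union-find parent map
        self.triples = set()      # stored (s, p, o) triples

    def _find(self, uri):
        """Return the canonical representative of uri's class."""
        self.parent.setdefault(uri, uri)
        while self.parent[uri] != uri:
            self.parent[uri] = self.parent[self.parent[uri]]  # path halving
            uri = self.parent[uri]
        return uri

    def add_triple(self, s, p, o):
        self.triples.add((self._find(s), p, self._find(o)))

    def add_equivalence(self, a, b):
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            self.parent[rb] = ra
            # Destructively rewrite already-stored triples (no rollback).
            self.triples = {
                (self._find(s), p, self._find(o)) for s, p, o in self.triples
            }

sm = Smoosher()
sm.add_triple("ex:halAbelson", "ex:wrote", "ex:SICP")
sm.add_triple("ex:hal_abelson", "ex:teaches", "ex:6001")
# The equivalence arrives *after* the data, and the store still converges
# on a single subject URI:
sm.add_equivalence("ex:halAbelson", "ex:hal_abelson")
subjects = {s for s, p, o in sm.triples}
```

Note the rewrite inside `add_equivalence` is deliberately one-way,
which mirrors the destructive behavior described below.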

>>> There are pros and cons about 'pre-massaging' data: having longwell
>>> perform the equivalences allows you to add such equivalences at
>>> runtime
>>> and, eventually, store enough information to be able to roll back  
>>> and
>>> return to a previous state.
>>
>> If state can be "rolled back" on Longwell, then maybe I misinterpret
>> what is actually being stored above. I hope you can clarify this  
>> for me?
>
> Currently, Banach's smooshing capabilities are destructive, meaning
> that it cannot return the graph to the original pre-smooshed state
> after operating on a newly fed equivalence.
>
> In order for the graph to return to the original state, it currently
> needs to be discarded and reloaded.
>
> So, in short: longwell can react to any new equivalence added at any
> point in time, before or after the subject and object of the
> equivalences are stored in the triple store. But removing an  
> equivalence
> does not return the graph to the previous state.

That is an important point to know, and it suggests that we still need
to maintain the original data somewhere it can be managed and recovered
independently of Longwell. Perhaps if the original statements affected
by the transform were preserved in separate Sesame contexts when being
replaced by the Banach transform, there would be a means to recover the
previous state: a sort of version control on the statements.
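To sketch what I mean (plain Python; the equivalence-keyed journal is
my own idea, loosely analogous to stashing originals in a separate
Sesame context, and not an existing Longwell/Banach feature): before a
smoosh rewrites statements, remember the originals keyed by the
equivalence that caused the rewrite, so removing that equivalence can
restore them.

```python
# Hypothetical "version control on statements": a smoosher that
# journals the triples it rewrites, keyed by the equivalence that
# caused the rewrite. Not a real Longwell/Banach capability.

class RevertibleSmoosher:
    def __init__(self):
        self.triples = set()
        self.journal = {}   # (a, b) equivalence -> set of original triples

    def add_triple(self, s, p, o):
        self.triples.add((s, p, o))

    def add_equivalence(self, a, b):
        """Rewrite every occurrence of b to a, remembering the originals."""
        touched = {t for t in self.triples if b in (t[0], t[2])}
        self.journal[(a, b)] = touched
        self.triples -= touched
        self.triples |= {
            (a if s == b else s, p, a if o == b else o)
            for s, p, o in touched
        }

    def remove_equivalence(self, a, b):
        """Roll back: restore the pre-smoosh statements."""
        originals = self.journal.pop((a, b), set())
        rewritten = {
            (a if s == b else s, p, a if o == b else o)
            for s, p, o in originals
        }
        self.triples -= rewritten
        self.triples |= originals
```

This is only a toy (it doesn't handle a triple that already existed in
its rewritten form), but it shows the bookkeeping that would let a
smoosh be undone without discarding and reloading the whole graph.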

>>> If this usecase is not necessary, then I agree that it's probably
>>> easier
>>> to 'cleanup' the data up front, using either the banach smoosher or
>>> even
>>> just another Banach operator that is written specifically for that
>>> purpose (Banach is a general purpose RDF transformer, sort of a
>>> pipeline
>>> for RDF processing).
>>
>> This is interesting and I will look at it further, I've been looking
>> for an RDF transformation pipeline tool.
>
> You'll understand that a "pipeline" (think of a SAX-based Cocoon
> pipeline, for example) is unfortunately not easy to achieve for
> graph-based data (while it's relatively straightforward for XML
> trees).

What I want to do is introduce variants of RDF about the same
"subject", for instance in the way that SemanticWiki presents its RDF:

http://simile.mit.edu/mediawiki/index.php/Special:ExportRDF/Exhibit
http://simile.mit.edu/mediawiki/index.php?title=Exhibit&action=creativecommons
http://simile.mit.edu/mediawiki/index.php?title=Exhibit&action=dublincore

The goal is to make the DSpace UI more descriptive in a Semantic
Web / Linked Data sense.

>
> Banach was my attempt to have something equivalent to cocoon's XML
> transformation pipeline for RDF (where the pipeline stages are called
> 'operators').

Interestingly, Cocoon (i.e. Manakin) may end up consuming this output
to produce inline RDFa and/or the above formats linked in the response.
Ultimately, that may be overkill, and staying with RDF/XML and XSLT
transforms may prove easier given we're already talking about Cocoon.

All this really points to needing greater reuse of the contents of
Longwell (or Sesame) back upstream in the application, where it can be
fed back into the loop and used to inform the application... the
subject of my next email.

thanks,
Mark



~~~~~~~~~~~~~
Mark R. Diggory - DSpace Developer and Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology





_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
