Re: [topbraid-users] Re: How to Automatically Create a URI

Tim Smith Fri, 30 Mar 2012 13:33:49 -0700

Sorry about the last one - Gmail sent it too soon...

Thanks for the help.  To clarify the scenario, consider the following where
each line is from a different spreadsheet:


From file #1 with two columns - Car and Color:

Car   Color
Car1  Blue

From file #2 with two columns - Car and # of Doors:

Car   # of Doors
Car1  2

Assume each file is processed in its own SM script.

Also assume Car1 could already have been created in the past, so a URI for
Car1 might already exist.

In order to attach the Color and # of Doors properties to the correct URI,
I have two choices:

1.  Look up (via a label) a matching instance URI for Car1.  This requires
a query, which could be expensive depending on the size of the graph that
contains all the cars, and it would be executed for each car in each
spreadsheet, potentially 100,000+ times.

2.  Create a URI for Car1 in a "standard" way so that it will match a
potentially existing URI for Car1.  This should be fast since no look-up is
required.

The key is "standardizing" the URI creation across scripts in a way that
makes it specific, preferably at the class level.

I agree that a SPIN Constructor has the best potential.  I had not thought
about replacing ?this.

I have a couple of follow-up questions:

What is the scope of the SPIN constructor during execution?

Will the constructor have access to the variables within the CONSTRUCT
query?

I would normally do something like this:

Consider this Construct query running in an SM module:

CONSTRUCT {
?carURI a ex:Car .
?carURI ex:hasColor ?color .
}
WHERE {
?row a ss:Row .
?row ss:Color ?color .
?row ss:Car ?CarID .
}

In that scenario, when is the constructor called?  Will it be able to pull
in the ?CarID variable?  If so, how do I receive a reference to the created
URI so that I can use it later in the CONSTRUCT query, i.e. to assign
properties to it such as ?carURI ex:hasColor ?color?
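
For comparison, here is a minimal sketch of option 2 done inline, without a
constructor: the URI is minted in the WHERE clause from ?CarID, so ?carURI is
already bound when the CONSTRUCT template is filled in. It uses only standard
SPARQL 1.1 functions (REPLACE, LCASE, ENCODE_FOR_URI, IRI, CONCAT). The ss:
namespace URI and the ?key variable are placeholders introduced for
illustration, and trimming plus lower-casing is just one possible
normalization rule, not a fixed recipe.

```sparql
PREFIX ex: <http://www.example.com#>
PREFIX ss: <http://www.example.com/spreadsheet#>   # placeholder namespace (assumption)

CONSTRUCT {
    ?carURI a ex:Car .
    ?carURI ex:hasColor ?color .
}
WHERE {
    ?row a ss:Row .
    ?row ss:Color ?color .
    ?row ss:Car ?CarID .
    # Normalize the raw ID: trim surrounding whitespace, then lower-case,
    # so that "Car1", " CAR1 " and "car1" all produce the same key.
    BIND (LCASE(REPLACE(STR(?CarID), "^\\s+|\\s+$", "")) AS ?key)
    # Percent-encode characters that are not safe in a URI (e.g. "$" or
    # spaces) and build the URI deterministically from the key.
    BIND (IRI(CONCAT("http://www.example.com#", ENCODE_FOR_URI(?key))) AS ?carURI)
}
```

Because the same expression yields the same URI for the same ID, the script
for file #2 could reuse the two BIND lines and attach the number of doors to
the same resource without a lookup. Moving those two lines into a shared SPIN
function would keep the rule in one place instead of copying it per script.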

Using your example constructor as a base, the constructor for the ex:Car
class would look like:

DELETE
{  ?this ?p ?o .
  ?s1 ?p1 ?this .
}
INSERT
{  ?uri ?p ?o .
  ?s1 ?p1 ?uri
}
WHERE
{  ?this ?p ?o .
  ?s1 ?p1 ?this .
  BIND (spif:buildURI("http://www.example.com#{?1}", ?CarID) AS ?uri)
}

Is the value of ?uri somehow made available to the calling script?

I think making a number of SPIN functions could serve the purpose but I
would like to keep them associated with a Class to simplify usage.

Thanks in advance for your thoughts,

Tim



> On Fri, Mar 30, 2012 at 11:29 AM, Scott Henninger <[email protected]> wrote:
>
>> Tim; I'm not entirely clear on your scenario.  If the below doesn't
>> help then a well-defined example may be necessary to explain what you
>> mean.  See some responses inline:
>>
>> On Mar 30, 7:56 am, Tim Smith <[email protected]> wrote:
>> > Hi,
>> >
>> > I have a need to automatically create the URI for new instances.
>> >
>> > The use case typically involves mashing up a bunch of data from
>> > different sources.  Each source typically contains the data necessary
>> > to create a URI.
>> >
>> > I use SPARQLMotion to process the data sources either in batch
>> > (typically spreadsheets) or real-time (RSS, webservices).  In order to
>> > (easily) link it all together I need to use the same URI for the same
>> > thing (owl:sameAs is nice for inferencing but is miserable in
>> > practice).
>> >
>> > There are two ways to do this.  I can attempt to look up the new
>> > instance in the current graph (very expensive when you have hundreds
>> > of thousands of new instances coming in...) or I can simply create the
>> > same URI for the same thing across sources.  This requires the use of
>> > the same "rules" to create the URI.  Currently, I must copy & paste
>> > and maintain these rules in every SM module that needs to create the
>> > URIs.
>>
>> If you have hundreds of thousands of new instances coming, then
>> solutions are bound to be expensive.  There tends to be an absence of
>> magic wands for big data - some data engineering strategy is usually
>> required.
>>
>> Instead of copy/paste, why not create a SPIN function that is called
>> by the different scripts?
>>
>> > The "rules" are typically an smf:buildURI(......) where the content is
>> > standardized.  However, since buildURI does not always create valid URIs
>> > (for example, it will not strip $'s from the input strings)
>>
>> You can use spif:encodeURL() to create valid URIs.
>>
>> > and sometimes
>> > my data sources represent the data for the URI in all caps, or mixed
>> > case, I have to do some formatting and clean up to ensure that the
>> > same URI is created for the same object - I typically use a number of
>> > things like Trim(), replaceAll(), etc...
>>
>> fn:lower-case() works also, and spif:regex() is very flexible.
>>
>> > It would be nice if there was a way to create the URI automatically
>> > using a pre-defined template that is applied when the new instance is
>> > created as part of a CONSTRUCT query.
>>
>> Yes, you could use spin:constructor to run a rule when you create an
>> instance.  If the creation is performed in a SM script, then use
>> ApplyConstruct or ApplyTopSpin to execute the rule (which presumably
>> pulls the right data in).
>>
>> > I was thinking that SPIN could do this but I don't know how to make it
>> > create the URI versus just calculating other properties for the
>> > instance.
>>
>> I think the piece you're looking for is spin:constructor.  You could
>> do a SPARQL Update query in the rule.  It would look something like:
>>
>> DELETE
>> {  ?this ?p ?o .
>>   ?s1 ?p1 ?this .
>> }
>> INSERT
>> {  ?uri ?p ?o .
>>   ?s1 ?p1 ?uri
>> }
>> WHERE
>> {  ?this ?p ?o .
>>   ?s1 ?p1 ?this .
>>   BIND (spif:buildURI(...) AS ?uri)
>> }
>>
>> > Is this possible?  Are there other ways to create standardized URI
>> > templates on a per class basis?  Ideally I would like to define the
>> > template on an upper level class (like owl:Thing) and have it
>> > automatically applied to all subclasses while still allowing for per
>> > class specialization of the template that overrides the parent class
>> > template just like SWPs use Instance-Views.
>>
>> SPIN rules do this.  The per-class specialization is applied by
>> creating a SPIN rule on that class.  This effectively overrides the
>> superclass spin:rule definitions.
>>
>> -- Scott
>>
>> > Thanks in advance for your thoughts,
>> >
>> > Tim
>>
>> --
>> You received this message because you are subscribed to the Google
>> Group "TopBraid Suite Users", the topics of which include Enterprise
>> Vocabulary Network (EVN), TopBraid Composer,
>> TopBraid Live, TopBraid Ensemble, SPARQLMotion and SPIN.
>> To post to this group, send email to
>> [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/topbraid-users?hl=en
>>
>
>

