See some responses inline:

On Mar 30, 3:33 pm, Tim Smith <[email protected]> wrote:
> Sorry about the last one - Gmail sent it too soon...
>
> Thanks for the help.  To clarify the scenario, consider the following where
> each line is from a different spreadsheet:
>
> From file #1 with two columns - Car and Color:
>
> Car   Color
> Car1 Blue
>
> From file #2 with two columns - Car and # of Doors
>
> Car   # of Doors
> Car1  2
>
> Assume each file is processed in its own SM script.
>
> Also assume Car1 could already have been created in the past, so a URI for
> Car1 might already exist.

At this point, I'd suggest using the sml:schemaNamespace property to
align the namespaces.  (I assume you are processing with a script - the
Import Tab-Delimited Spreadsheet File also has a "Base namespace"
which is the equivalent of sml:schemaNamespace.)  If you then pull in
multiple spreadsheets containing "Car1" that have the same
sml:schemaNamespace, the URIs will align.

> In order to attach the Color and # of Doors properties to the correct URI,
> I have two choices:
>
> 1.  Look up (via a label) a matching instance URI for Car1 which requires a
> query and could be expensive depending on the size of the graph that
> contains all the cars and will be executed for each car in each spreadsheet
> which could be 100,000+
>
> 2.  Create a URI for Car1 in a "standard" way so that it will match a
> potentially existing URI for Car1.  This should be fast since no look-up is
> required.

I believe my response above takes care of both of these.  If not,
let me know.

> The key is "standardizing" the URI creation across scripts in a way that
> makes it specific, preferably at the class level.
>
> I agree that a SPIN Constructor has the best potential.  I had not thought
> about replacing ?this.
>
> I have a couple of follow-up questions:
>
> What is the scope of the SPIN constructor during execution?

I may have created a misconception.  In the previous response I sent,
I meant to make clear that I was using spin:constructor only as a
semantic indicator of what I intended the query to do: act like an
object constructor.  Your script will still need to call the
constructor explicitly, hence the reference to ApplyTopSPIN with
spin:constructor specified as the predicate.

I.e. Composer has specialized code that invokes the spin:constructor
rule when one creates an instance via the UI.  For any other place,
including SM scripts, you will need to do the same - i.e. specify when
the constructor query should be invoked.

> Will the constructor have access to the variables within the CONSTRUCT
> query?
>
> I would normally do something like this:
>
> Consider this Construct query running in an SM module:
>
> CONSTRUCT {
> ?carURI a ex:Car .
> ?carURI ex:hasColor ?color .}
>
> WHERE {
> ?row a ss:Row .
> ?row ss:Color ?color .
> ?row ss:Car ?CarID .
>
> }
>
> In that scenario, when is the constructor called?

It is not.  You will need to call the constructor.  ApplyTopSPIN on
some set of instances (with appropriate type statements) is one
option.  Another is to place the "constructor" in an ApplyConstruct or
PerformUpdate.
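
As a minimal sketch of the ApplyConstruct option, the query quoted
above could bind ?carURI itself in the WHERE clause.  This version uses
only standard SPARQL 1.1 functions in place of spif:buildURI.  The ss:
namespace is a placeholder made up for the example, and the
lower-casing and encoding are assumed normalization rules that you
would adjust to your data:

```sparql
PREFIX ex: <http://www.example.com#>
# Placeholder namespace for the imported spreadsheet rows (an assumption).
PREFIX ss: <http://www.example.com/spreadsheet#>

CONSTRUCT {
  ?carURI a ex:Car .
  ?carURI ex:hasColor ?color .
}
WHERE {
  ?row a ss:Row .
  ?row ss:Color ?color .
  ?row ss:Car ?CarID .
  # The same input string always yields the same IRI: lower-case it,
  # then percent-encode any character that is not legal in a URI.
  BIND (IRI(CONCAT("http://www.example.com#",
                   ENCODE_FOR_URI(LCASE(STR(?CarID))))) AS ?carURI)
}
```

Using the same BIND in the script for the second spreadsheet, with the
doors property in place of ex:hasColor, would produce the same ?carURI
for Car1, so no lookup is required.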

> Will it be able to pull
> in the ?CarID variable?  If so, how do I receive a reference to the created
> URI so that I can use it later in the CONSTRUCT query, i.e. assign
> properties to it such as ?carURI ex:hasColor ?color
>
> Using your example constructor as a base, the constructor for the ex:Car
> class would look like:
>
> DELETE
> {  ?this ?p ?o .
>   ?s1 ?p1 ?this .}
>
> INSERT
> {  ?uri ?p ?o .
>   ?s1 ?p1 ?uri}
>
> WHERE
> {  ?this ?p ?o .
>   ?s1 ?p1 ?this .
>   BIND (spif:buildURI("http://www.example.com#{?1}", ?CarID) AS ?uri)
>
> }
>
> Is the value of ?uri somehow made available to the calling script?

Again, I suspect the first part may resolve the issue.  But to follow
along this question, no, the ?uri won't be passed along with the rest
of the script when a SPARQL Update is used.  An Update will change the
graph directly.  If you want the data to be available down the triple
stream, then I'd suggest doing a BindBySelect to create the ?uri
variable, then use that in subsequent queries, including
ApplyConstruct, PerformUpdate, ApplyTopSPIN, etc.  If a CONSTRUCT
query is used in any of these, triples will be created that will be
passed down the triple stream.  But the only way to pass a variable
binding down the triple stream is to use one of the Bind modules.
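
For the binding route, a rough sketch of the kind of SELECT a Bind
module could run is below.  It assumes ?CarID has already been bound by
an upstream module, and it substitutes standard SPARQL 1.1 functions
and an assumed lower-casing rule for spif:buildURI:

```sparql
SELECT ?uri
WHERE {
  # ?CarID is assumed to be bound before this query runs.
  BIND (IRI(CONCAT("http://www.example.com#",
                   ENCODE_FOR_URI(LCASE(STR(?CarID))))) AS ?uri)
}
```

Subsequent queries in the script could then refer to ?uri directly.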

> I think making a number of SPIN functions could serve the purpose but I
> would like to keep them associated with a Class to simplify usage.

You may want to think about trying SPINMap to specify how the
transformations occur.  See Help > Application Development Tools >
SPIN > Ontology Mapping with SPINMap.  This gives you a nice
class-centric view of how to transform triples that can then be given
to the SPIN engine when ingesting various spreadsheets.

-- Scott

> Thanks in advance for your thoughts,
>
> Tim
>
> On Fri, Mar 30, 2012 at 4:05 PM, Tim Smith <[email protected]> wrote:
> > Hi Scott,
>
> > Thanks for the help.  To clarify the scenario, consider the following
> > where each line is from a different spreadsheet:
>
> > From file #1:
>
> > Car
>
> > On Fri, Mar 30, 2012 at 11:29 AM, Scott Henninger <
> > [email protected]> wrote:
>
> >> Tim; I'm not entirely clear on your scenario.  If the below doesn't
> >> help then a well-defined example may be necessary to explain what you
> >> mean.  See some responses inline:
>
> >> On Mar 30, 7:56 am, Tim Smith <[email protected]> wrote:
> >> > Hi,
>
> >> > I have a need to automatically create the URI for new instances.
>
> >> > The use case typically involves mashing up a bunch of data from
> >> different
> >> > sources.  Each source typically contains the data necessary to create a
> >> URI.
>
> >> > I use SPARQLMotion to process the data sources either in batch
> >> (typically
> >> > spreadsheets) or real-time (RSS, webservices).  In order to (easily)
> >> link
> >> > it all together I need to use the same URI for the same thing
> >> (owl:sameAs
> >> > is nice for inferencing but is miserable in practice).
>
> >> > There are two ways to do this.  I can attempt to look up the new
> >> instance
> >> > in the current graph (very expensive when you have hundreds of
> >> thousands of
> >> > new instances coming in...) or I can simply create the same URI for the
> >> > same thing across sources.  This requires the use of the same "rules" to
> >> > create the URI.  Currently, I must copy & paste and maintain these
> >> rules in
> >> > every SM module that needs to create the URIs.
>
> >> If you have hundreds of thousands of new instances coming, then
> >> solutions are bound to be expensive.  There tends to be an absence of
> >> magic wands for big data - some data engineering strategy is usually
> >> required.
>
> >> Instead of copy/paste, why not create a SPIN function that is called
> >> by the different scripts?
>
> >> > The "rules" are typically an smf:buildURI(......) where the content is
> >> > standardized.  However, since buildURI does not always create valid URIs
> >> > (for example, it will not strip $'s from the input strings)
>
> >> You can use spif:encodeURL() to create valid URIs.
>
> >> > and sometimes
> >> > my data sources represent the data for the URI in all caps, or mixed
> >> case,
> >> > I have to do some formatting and clean up to ensure that the same URI is
> >> > created for the same object - I typically use a number of things like
> >> > Trim(), replaceAll(), etc...
>
> >> fn:lower-case() works also, and spif:regex() is very flexible.
>
> >> > It would be nice if there was a way to create the URI automatically
> >> using a
> >> > pre-defined template that is applied when the new instance is created as
> >> > part of a CONSTRUCT query.
>
> >> Yes, you could use spin:constructor to run a rule when you create an
> >> instance.  If the creation is performed in a SM script, then use
> >> ApplyConstruct or ApplyTopSPIN to execute the rule (which presumably
> >> pulls the right data in).
>
> >> > I was thinking that SPIN could do this but I don't know how to make it
> >> > create the URI versus just calculating other properties for the
> >> instance.
>
> >> I think the piece you're looking for is spin:constructor.  You could
> >> do a SPARQL Update query in the rule.  It would look something like:
>
> >> DELETE
> >> {  ?this ?p ?o .
> >>   ?s1 ?p1 ?this .
> >> }
> >> INSERT
> >> {  ?uri ?p ?o .
> >>   ?s1 ?p1 ?uri
> >> }
> >> WHERE
> >> {  ?this ?p ?o .
> >>   ?s1 ?p1 ?this .
> >>   BIND (spif:buildURI(...) AS ?uri)
> >> }
>
> >> > Is this possible?  Are there other ways to create standardized URI
> >> > templates on a per class basis?  Ideally I would like to define the
> >> > template on an upper level class (like owl:Thing) and have it
> >> automatically
> >> > applied to all subclasses while still allowing for per class
> >> specialization
> >> > of the template that overrides the parent class template just like SWPs
> >> use
> >> > Instance-Views.
>
> >> SPIN rules do this.  The per-class specialization is applied by
> >> creating a SPIN rule on that class.  This effectively overrides the
> >> superclass spin:rule definitions.
>
> >> -- Scott
>
> >> > Thanks in advance for your thoughts,
>
> >> > Tim
>
> >> --
> >> You received this message because you are subscribed to the Google
> >> Group "TopBraid Suite Users", the topics of which include Enterprise
> >> Vocabulary Network (EVN), TopBraid Composer,
> >> TopBraid Live, TopBraid Ensemble, SPARQLMotion and SPIN.
> >> To post to this group, send email to
> >> [email protected]
> >> To unsubscribe from this group, send email to
> >> [email protected]
> >> For more options, visit this group at
> >> http://groups.google.com/group/topbraid-users?hl=en

