Question on moving linked data sets

2012-04-19 Thread Antoine Isaac

Dear all,

We have a question about what to do when a linked data set is moved from one 
namespace to another. We searched for recipes to apply, but did not really find 
anything 'official'...
 
VU University Amsterdam published a Linked Data SKOS representation of RAMEAU [1] as a prototype several years ago. For example, we have:

http://stitch.cs.vu.nl/vocabularies/rameau/ark:/12148/cb14521343b

Recently, BnF implemented its own production service for RAMEAU. The same 
concept is now at:
http://data.bnf.fr/ark:/12148/cb14521343b
(see RDF at http://data.bnf.fr/14521343/web_semantique/rdf.xml)

The production service makes the prototype obsolete. Our issue is how to properly 
transition from one to the other. Several services are using the URIs of the 
prototype. For example, at the Library of Congress:
http://id.loc.gov/authorities/subjects/sh2002000569

We can ask the people we know to change their links. But identifying all the 
users of the URIs seems too manual and error-prone a process. And of course, in 
general, we do not want links to be broken.

Currently we have done the following:

- a 301 moved permanently redirection from the stitch.cs.vu.nl/rameau 
prototype to data.bnf.fr.

- an owl:sameAs statement between the prototype URIs and the production ones, 
so that a client searching for data on the old URI gets data that enables it to 
make the connection with the original resource (URI) it was seeking data about.
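
A minimal sketch of the second step, assuming that every prototype URI maps to 
its production URI by a simple namespace swap (this holds for the example concept 
above; whether it holds for the whole data set is an assumption). The function 
names are hypothetical and only illustrate how the owl:sameAs statements could be 
generated as N-Triples:

```python
# Namespaces taken from the example URIs in this message.
OLD_NS = "http://stitch.cs.vu.nl/vocabularies/rameau/"
NEW_NS = "http://data.bnf.fr/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def new_uri_for(old_uri: str) -> str:
    """Map a prototype URI to the production URI by swapping the namespace.

    Assumption: the local part (e.g. 'ark:/12148/cb14521343b') is unchanged.
    """
    if not old_uri.startswith(OLD_NS):
        raise ValueError(f"not a prototype URI: {old_uri}")
    return NEW_NS + old_uri[len(OLD_NS):]


def same_as_triple(old_uri: str) -> str:
    """Return one N-Triples line linking the old URI to the new one."""
    return f"<{old_uri}> <{OWL_SAME_AS}> <{new_uri_for(old_uri)}> ."


if __name__ == "__main__":
    print(same_as_triple(OLD_NS + "ark:/12148/cb14521343b"))
```

The same mapping function could also drive the target of the 301 redirection, so 
that both mechanisms stay consistent with each other.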

Does that seem ok? What should we do, otherwise?

Thanks for any feedback you could have,

Antoine Isaac (VU Amsterdam side)
Romain Wenz (BnF side)

[1] RAMEAU is a vocabulary (thesaurus) used by the National Library of France 
(BnF) for describing books.



Re: Question on moving linked data sets

2012-04-19 Thread Kingsley Idehen

On 4/19/12 10:23 AM, Antoine Isaac wrote:

Does that seem ok? What should we do, otherwise?


Seems OK to me :-)


Kingsley


--

Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen










Re: Question on moving linked data sets

2012-04-19 Thread Kingsley Idehen





Forgot to paste this into my prior response.

Here's why it's OK:

1. 
http://linkeddata.uriburner.com/describe/?url=http%3A%2F%2Fdata.bnf.fr%2Fark%3A%2F12148%2Fcb14521343b 
-- new URI


2. 
http://linkeddata.uriburner.com/describe/?url=http%3A%2F%2Fstitch.cs.vu.nl%2Fvocabularies%2Frameau%2Fark%3A%2F12148%2Fcb14521343b 
-- old (prototype) URI


3. 
http://linkeddata.uriburner.com/describe/?url=http%3A%2F%2Fdata.bnf.fr%2Fark%3A%2F12148%2Fcb14521343b&sas=yes 
-- new URI and effects of enabling owl:sameAs inference.
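
For anyone who wants to build such lookup links for other concepts, here is a 
small sketch. It only reproduces the URL pattern visible in the three links 
above; the helper name is hypothetical, and the "&sas=yes" suffix is an 
assumption about how the owl:sameAs option is appended (the separator is not 
clearly visible in the archived third link).

```python
from urllib.parse import quote

# Base of the describe service, as it appears in the links above.
DESCRIBE_BASE = "http://linkeddata.uriburner.com/describe/"


def describe_url(resource_uri: str, same_as: bool = False) -> str:
    """Build a describe link for a resource URI.

    The resource URI is percent-encoded in full (':' and '/' included),
    matching the encoding seen in the links above.
    """
    url = DESCRIBE_BASE + "?url=" + quote(resource_uri, safe="")
    if same_as:
        # Assumption: the sameAs switch is passed as a second query parameter.
        url += "&sas=yes"
    return url


if __name__ == "__main__":
    print(describe_url("http://data.bnf.fr/ark:/12148/cb14521343b"))
    print(describe_url("http://data.bnf.fr/ark:/12148/cb14521343b", same_as=True))
```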





RE: Question on moving linked data sets

2012-04-19 Thread Ford, Kevin
This sounds like the best way to manage this transition.  I believe the 301 
redirection is precisely what Ed (CC'ed on the mail) did when directing traffic 
from the lcsh.info URIs to the new (at the time) id.loc URIs.  (Speaking of 
which, this email answers a question we asked ourselves here at LC about the 
relationship between RAMEAU at VU and RAMEAU at data.bnf, after Romain 
announced the addition of RAMEAU at data.bnf.  I'll try to move a little faster 
updating the URIs in ID.)

 - an owl:sameAs statement between the prototype URIs and the production
 ones, so that a client searching for data on the old URI gets data that
 enables it to make the connection with the original resource (URI) it
 was seeking data about.
-- The answer seems obvious, but this would be expressed in the data 
available from data.bnf, correct?

Yours,

Kevin

--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC







Re: Question on moving linked data sets

2012-04-19 Thread Bernard Vatant
Hello Antoine

My take on this would be to use dcterms:isReplacedBy links rather than
owl:sameAs.
The description of the concepts by BnF might change in the future, and although
the original identifier is the same, the descriptions might be out of sync
at some point.

Bernard





-- 
Bernard Vatant
Vocabularies & Data Engineering
Tel: +33 (0)9 71 48 84 59
Skype: bernard.vatant
Linked Open Vocabularies: http://labs.mondeca.com/dataset/lov

Mondeca
3 cité Nollez 75018 Paris, France
www.mondeca.com
Follow us on Twitter: @mondecanews http://twitter.com/#%21/mondecanews


Re: Question on moving linked data sets

2012-04-19 Thread Ed Summers
Hi Antoine,

First, congratulations on http://data.bnf.fr/ that is a major milestone!

On Thu, Apr 19, 2012 at 10:23 AM, Antoine Isaac ais...@few.vu.nl wrote:
 We can ask for the people we know to change their links. But identifying the
 users of URIs seems too manual, error-prone a process. And of course in
 general we do not want links to be broken.

 Currently we have done the following:

 - a 301 moved permanently redirection from the stitch.cs.vu.nl/rameau
 prototype to data.bnf.fr.

 - an owl:sameAs statement between the prototype URIs and the production
 ones, so that a client searching for data on the old URI gets data that
 enables it to make the connection with the original resource (URI) it was
 seeking data about.

 Does that seem ok? What should we do, otherwise?

As you know when the SKOS concepts published at lcsh.info moved to
id.loc.gov I had a similar situation :-) Like you I chose to do a
mixture of technical and social things:

- publish information about the move to relevant discussion lists
- put some information up at lcsh.info about the move
- permanently redirect (301) all resources to their new location (easy
since it was the same app, and mod_rewrite could do it)
- after a year of redirects I shut down lcsh.info and did not renew
the domain (interestingly someone is squatting on it right now
attempting to sell it, I think)
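
As an illustration of the redirect step only: the sketch below uses Python's 
standard http.server rather than mod_rewrite, which is what was actually used 
for lcsh.info. NEW_BASE is a hypothetical placeholder, and the sketch assumes 
that resource paths are unchanged at the new location, which may not be true for 
a given migration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical new home of the data set; replace with the real base URL.
NEW_BASE = "http://new.example.org"


def redirect_target(path: str) -> str:
    """Compute the new location for a request path.

    Assumption: the path (and query string) is preserved under NEW_BASE.
    """
    return NEW_BASE + path


class MovedHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD with a 301 pointing at the new location."""

    def _redirect(self):
        self.send_response(301)  # Moved Permanently
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()

    do_GET = _redirect
    do_HEAD = _redirect


def serve(port: int = 8000) -> None:
    """Run the redirecting server until interrupted."""
    HTTPServer(("", port), MovedHandler).serve_forever()
```

Calling serve() on the old host keeps the old URIs resolvable for as long as the 
server and the domain are maintained.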

Generally I think the linked data community should be encouraged to
check their links, respect 301 redirects, and update their own link
database appropriately. This is what Google and other major search
engines do [1], and it's how the Web was designed to work, and
continues to grow.
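
On the consuming side, the idea could be sketched roughly as follows. This is 
not a description of what any search engine does; it is a minimal illustration, 
with hypothetical function names, of checking stored links with HEAD requests 
and replacing those that answer with a 301. Only permanent redirects cause an 
update; temporary ones are left alone.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def http_head(url):
    """Issue a HEAD request without following redirects.

    Returns (status_code, location_header_or_None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def refresh_links(links, head=http_head, max_hops=5):
    """Return the links with permanently moved ones replaced by their target.

    `head` is injectable so the logic can be exercised without network access.
    `max_hops` bounds redirect chains and guards against loops.
    """
    refreshed = []
    for url in links:
        current = url
        for _ in range(max_hops):
            status, location = head(current)
            if status == 301 and location:
                # Location may be relative; resolve it against the current URL.
                current = urljoin(current, location)
            else:
                break
        refreshed.append(current)
    return refreshed
```

A link database would then store the refreshed URIs, so that the old server can 
eventually be retired without breaking anything.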

While it's certainly cool when URIs don't change [1] I think it is
somewhat irrational to expect URIs to be permanent. I hear people
gripe about broken URLs in the digital preservation community quite a
bit and it is pretty irritating, since any data that isn't actively
used tends to rot...URLs really aren't that different.

Of course there is certainly value in stable identifiers [1], but I
think there is an opportunity for documenting and encouraging best
practices on how to manage change (especially with respect to
identifiers) in Linked Data. URNs, Namespaces and Registries [2] is
partly helpful here, but a more succinct and URI-focused presentation
is needed. Or maybe a best practice document like this already exists
and I haven't seen it yet. If that is the case I trust someone will
let me know :-)

//Ed

[1] http://www.w3.org/Provider/Style/URI.html
[2] http://www.w3.org/2001/tag/doc/URNsAndRegistries-50



Re: Question on moving linked data sets

2012-04-19 Thread Ed Summers
On Thu, Apr 19, 2012 at 11:02 AM, Bernard Vatant
bernard.vat...@mondeca.com wrote:
 My take on this would be to use dcterms:isReplacedBy links rather than
 owl:sameAs
 Description of the concepts by BNF might change in the future and although
 the original identifier is the same, the description might be out of sync at
 some point.

I really like the idea of avoiding the owl:sameAs quagmire with an
assertion that has fewer ontological consequences. When publishing data
about data.bnf resources perhaps it might be a bit simpler to use
dcterms:replaces instead, e.g.

http://data.bnf.fr/ark:/12148/cb14521343b dcterns:replaces
http://stitch.cs.vu.nl/vocabularies/rameau/ark:/12148/cb14521343b .
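
A small sketch of how such statements could be generated, with the Dublin Core 
terms namespace written out in full and both directions emitted (the replaces 
form suggested here and the isReplacedBy form suggested earlier in the thread). 
The function name is hypothetical, and whether to publish one direction or both 
is a design choice left to the publisher.

```python
# Dublin Core terms namespace; 'replaces' and 'isReplacedBy' are both
# properties defined in it.
DCTERMS = "http://purl.org/dc/terms/"


def replacement_triples(old_uri: str, new_uri: str) -> list:
    """Return N-Triples lines stating that new_uri replaces old_uri."""
    return [
        f"<{new_uri}> <{DCTERMS}replaces> <{old_uri}> .",
        f"<{old_uri}> <{DCTERMS}isReplacedBy> <{new_uri}> .",
    ]


if __name__ == "__main__":
    for line in replacement_triples(
        "http://stitch.cs.vu.nl/vocabularies/rameau/ark:/12148/cb14521343b",
        "http://data.bnf.fr/ark:/12148/cb14521343b",
    ):
        print(line)
```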

//Ed



Re: Question on moving linked data sets

2012-04-19 Thread Ed Summers
On Thu, Apr 19, 2012 at 1:23 PM, Ed Summers e...@pobox.com wrote:
 http://data.bnf.fr/ark:/12148/cb14521343b dcterns:replaces
 http://stitch.cs.vu.nl/vocabularies/rameau/ark:/12148/cb14521343b .

s/dcterns/dcterms/

//Ed



Re: Question on moving linked data sets

2012-04-19 Thread Jon Phipps
DC:Terns...
http://blog.terrain.org/wp-content/uploads/2011/04/LeastTern_Tom-Grey_U.jpg

Jon


On Thu, Apr 19, 2012 at 1:23 PM, Ed Summers e...@pobox.com wrote:


 http://data.bnf.fr/ark:/12148/cb14521343b dcterns:replaces
 http://stitch.cs.vu.nl/vocabularies/rameau/ark:/12148/cb14521343b .

 //Ed




Re: Question on moving linked data sets

2012-04-19 Thread Richard Wallis
Hi All,

I do not think that there is much else you could do.  I presume the server
doing the 301 is going to stay around for a while.

~Richard.





-- 
Richard Wallis
Technology Evangelist, OCLC: richard.wal...@oclc.org
Founder, Data Liberate: richard.wal...@dataliberate.com
http://dataliberate.com
Tel: +44 (0)7767 886 005

Linkedin: http://www.linkedin.com/in/richardwallis
Skype: richard.wallis1
Twitter: @rjw
IM: rjw3...@hotmail.com


Announcing OWLIM 5.0 - with new transaction mechanism, performance improvements, SPARQL 1.1 graph store protocol and more

2012-04-19 Thread Barry Bishop
Ontotext are pleased to announce the release of OWLIM version 5.0 
(http://www.ontotext.com/owlim), featuring a new transaction mechanism, 
performance improvements, SPARQL 1.1 graph store protocol, integration 
with TopBraid Composer/Live 
(http://www.topquadrant.com/products/TB_Suite.html) and many other 
improvements. The single most important new feature is the new 
transaction management mechanism, which allows for much *more reliable 
and efficient handling of workloads where queries from multiple clients 
are combined with frequent updates* of the data. As the benchmark results 
(http://www.ontotext.com/owlim/benchmark-results/owlim-5) demonstrate, 
OWLIM 5.0 is *43% faster* than v.4.3 on the BSBM Explore and Update 
scenario (http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/). 
As a result of several changes in the index structures, OWLIM 
now requires *between 25% and 70% less storage space*.


Some of the most important improvements are listed below:

 * *Transaction management and isolation mechanisms* have been
   completely refactored. The previous strategy used lazy writing of
   modified database pages, such that dirty pages were only flushed to
   disk when further updates occurred and no more memory was available.
   While extremely fast, the problem with this approach is that there
   is a considerable recovery time associated with replaying the
   transaction log after an abnormal termination. The new mechanism
   uses two modes: 'bulk-loading' (fast) with similar behaviour to
   previous versions and 'normal' (safe) where database modifications
   are flushed to disk as part of the commit operation. When running in
   safe mode, *database recovery is instant* and there is a
   *significant improvement in concurrency between updates and queries*.

 * *New context indices* can be used to improve query performance when
   data is modelled using many named graphs. These are switched on and
   off using a single configuration parameter, enable-context-index.

 * The *SPARQL 1.1 Graph Store HTTP Protocol* is now supported
   according to the W3C Working Draft
   (http://www.w3.org/TR/sparql11-http-rdf-update/) of 12 May
   2011. This provides a REST interface for managing collections of
   graphs, using either directly or indirectly named graphs.

 * *Sesame 2.6.5* (http://www.openrdf.org) with many bug-fixes and
   updates to bring SPARQL 1.1 Query
   (http://www.w3.org/TR/2012/WD-sparql11-query-20120105/) support up
   to the latest W3C Working Draft of 5 January 2012.

 * *Significant reduction in disk-space requirements* is achieved with
   the following modifications:
    o *Index compression* can now be used to reduce disk storage
      requirements by using zip compression on database pages. This
      feature is off by default, but can be switched on when creating
      a new repository. The configuration parameter
      index-compression-ratio can be set to -1 (the default value,
      indicating no compression) or a value in the range 10-50
      indicating the desired percentage reduction in page sizes. Any
      pages that cannot be compressed by the specified amount are
      stored uncompressed. Therefore a compression ratio that is too
      aggressive will not bring many benefits. Experiments have shown
      that for large datasets a value of about 30% is close to optimal
      and leads to a total disk space saving of around 50%.
    o *Restructuring of the triple indices* has also led to a
      reduction in disk-space requirements of around 18%, independent
      of the compression functionality.
    o *Entity compression* is a modification that reduces the storage
      requirements for the lookup table that maps between internal
      identifiers and resources. This is transparent to the user and
      happens automatically. More disk space reductions are apparent
      using this version.

 * A new *literal index* is created automatically for numeric and
   date/time data-types. The index is used during query evaluation if a
   query or a sub-query (e.g. union) has a filter that comprises
   a conjunction of literal constraints, e.g. FILTER(?x = 3 && ?y = 5
   && ?start > "2001-01-01"^^xsd:date). Other patterns, including those
   that use negation, will not use the index in this version of OWLIM.

 * Tighter integration with TopQuadrant's (http://www.topquadrant.com/)
   TopBraid Composer
   (http://www.topquadrant.com/products/TB_Composer.html), a graphical
   development environment for modelling data, and TopBraid Live
   (http://www.topquadrant.com/products/TB_Live.html), an enterprise
   SOA-capable Semantic Web application platform. Contact the OWLIM
   team directly (mailto:owlim-i...@ontotext.com) for details of how to
   obtain the OWLIM plug-in.

 * All *control queries now use SPARQL Update syntax* (used mostly