Re: 200 OK with Content-Location might work: But maybe it can be simpler?
On 05/11/10 17:26, Nathan wrote: Giovanni Tummarello wrote: How about something that's totally independent from HEADER issues? think normal people here. absolutely 0 interest to mess with headers and http responses.. absolutely no business incentive to do it. as a baseline think someone wanting to annotate with RDFa a hand crafted, Apache-served html file. really.. as simple as serving this people. as simple as anyone who's using opengraph just copy pastes into their HTML template.. as simple as this really, please, it's the only thing that can work? +1 from me - all this uri and 303 nonsense, now other codes and any form of HTTP awareness is best completely removed. uri#frag gives us that semantic indirection we need, without anybody even noticing (and allows 200 OK). What about 404 ;-) ? What about http://iandavis.com/2010/303/toucan#FredFlintstone Best, Nathan -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
Re: Is 303 really necessary - demo
On 05/11/10 16:50, Ian Davis wrote: On Fri, Nov 5, 2010 at 4:42 PM, Robert Fuller wrote: I submitted both urls to sindice earlier. Both were indexed and have the same content. In the search results[1] one displays with title "A Toucan", the other with title, "A Description of a Toucan". http://sindice.com/search?q=toucan+domain%3Aiandavis.com&qt=term So Sindice sees them as distinct resources and doesn't concern itself with the lack of a 303 redirect? Both pages returned http status code of 200 and some content. Sindice extracted metadata from the content (using any23), and associated that content with the requested URLs. Sindice doesn't "expect" 303s, but it follows them. This isn't always a good thing... http://inspector.sindice.com/inspect?url=http://xmlns.com/foaf/0.1/IanDavis Ian -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
Re: Is 303 really necessary - demo
Hi, I submitted both urls to sindice earlier. Both were indexed and have the same content. In the search results[1] one displays with title "A Toucan", the other with title, "A Description of a Toucan". http://sindice.com/search?q=toucan+domain%3Aiandavis.com&qt=term Robert. On 05/11/10 09:43, Ian Davis wrote: Hi all, To aid discussion I created a small demo of the idea put forth in my blog post http://iand.posterous.com/is-303-really-necessary Here is the URI of a toucan: http://iandavis.com/2010/303/toucan Here is the URI of a description of that toucan: http://iandavis.com/2010/303/toucan.rdf As you can see both these resources have distinct URIs. I created a new property http://vocab.org/desc/schema/description to link the toucan to its description. The schema for that property is here: http://vocab.org/desc/schema (BTW I looked at the powder describedBy property and it's clearly designed to point to one particular type of description, not a general RDF one. I also looked at http://ontologydesignpatterns.org/ont/web/irw.owl and didn't see anything suitable) Here is the URI Burner view of the toucan resource and of its description document: http://linkeddata.uriburner.com/about/html/http://iandavis.com/2010/303/toucan http://linkeddata.uriburner.com/about/html/http/iandavis.com/2010/303/toucan.rdf I'd like to use this demo to focus on the main thrust of my question: does this break the web and if so, how? Cheers, Ian P.S. I am not fully caught up on the other thread, so maybe someone has already produced this demo -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
Re: What would break, a question for implementors? (was Re: Is 303 really necessary?)
On 05/11/10 15:06, Ian Davis wrote: On Fri, Nov 5, 2010 at 12:12 PM, Nathan wrote: However, if you use 303s, then the first GET redirects there, then you store the ontology against the redirected-to URI, you still have to do 40+ GETs but each one is fast with no response-body (ontology sent down the wire) then the next request for the 303'd to URI comes right out of the cache. It's still 40+ requests unless you code around it in some way, but it's better than 40+ requests and 40+ copies of the single ontology. But in practice, don't you look in your cache first? If you already have a label for foaf:knows because you looked up foaf:mbox a few seconds ago why would you issue another request? Sindice would, because Fred could also define a label for foaf:knows in the flintstone schema. The Sindice contextualised reasoning is performed in a sandbox to ensure that Fred's malicious schema isn't going to pollute any inferencing from your document, unless your document also references Fred's schema. Without checking we can't be sure that foaf:knows and foaf:mbox are defined in the same ontology. -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
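The caching behaviour Nathan describes in the message above can be sketched roughly as follows. This is only an illustration, not Sindice's code: the `OntologyCache` class, the `fake_fetch` stand-in for an HTTP client, and the example.org URIs are all assumptions made for the sketch. It issues one GET per term, as Robert says Sindice does, but stores the ontology body once under the redirected-to URI.

```python
from typing import Callable, Dict, Optional, Tuple

# (status code, Location header, body): a hypothetical, simplified HTTP response.
Response = Tuple[int, Optional[str], Optional[str]]


class OntologyCache:
    """Stores each ontology once, keyed by the URI that a 303 redirects to."""

    def __init__(self, fetch: Callable[[str], Response]) -> None:
        self._fetch = fetch                    # assumed HTTP GET function
        self._documents: Dict[str, str] = {}   # document URI -> body
        self.requests = 0                      # every GET issued
        self.body_downloads = 0                # GETs that carried a body

    def _get(self, uri: str) -> Response:
        self.requests += 1
        return self._fetch(uri)

    def lookup(self, term_uri: str) -> str:
        # Always ask the server about the term: without checking, we cannot
        # know which document describes it.
        status, location, body = self._get(term_uri)
        document_uri = term_uri
        if status == 303 and location is not None:
            # Body-less redirect: cheap, and the target may already be cached.
            if location in self._documents:
                return self._documents[location]
            document_uri = location
            status, _, body = self._get(location)
        if status != 200 or body is None:
            raise LookupError(f"no description available for {term_uri}")
        self.body_downloads += 1
        self._documents[document_uri] = body
        return body


def fake_fetch(uri: str) -> Response:
    """Stand-in for a real HTTP client: each term under /schema/ 303s to /schema."""
    if uri == "http://example.org/schema":
        return 200, None, "<ontology document>"
    if uri.startswith("http://example.org/schema/"):
        return 303, "http://example.org/schema", None
    return 404, None, None
```

Looking up 40 terms through `OntologyCache(fake_fetch)` issues 41 requests but downloads the ontology body only once. If the server answered each term with a 200 and the full ontology instead, the same loop would download 40 bodies.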
Re: What would break, a question for implementors? (was Re: Is 303 really necessary?)
So here's a couple of questions for those of you on the list who have implemented Linked Data tools, applications, services, etc: * Do you rely on or require HTTP 303 redirects in your application? Or does your app just follow the redirect? For sindice - no we do not rely on or require them, merely follow. * Would your application tool/service/etc break or generate inaccurate data if Ian's pattern was used to publish Linked Data? It wouldn't break sindice. However... with regard to publishing ontologies, we could expect additional overhead if the same content is delivered on retrieving different Resources, for example http://example.com/schema/latitude and http://example.com/schema/longitude . In such a case ETag could be used to suggest the contents are identical, but not sure that is a practical solution. I expect that without 303 it will be more difficult in particular to publish and process ontologies. Rob. -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
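The ETag idea floated above could look something like the sketch below. It is a guess at one possible approach, not something Sindice is known to do: the `ETagDeduplicator` class and its method names are hypothetical. HTTP only defines an ETag as a validator for a single resource, so a matching ETag on two different URIs is treated here purely as a hint, and only within the same host.

```python
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import urlsplit


class ETagDeduplicator:
    """Parses a response body only once per (host, ETag) pair."""

    def __init__(self, parse: Callable[[str], object]) -> None:
        self._parse = parse                                  # assumed RDF parser
        self._parsed: Dict[Tuple[str, str], object] = {}     # (host, etag) -> result
        self.parse_calls = 0

    def process(self, uri: str, etag: Optional[str], body: str) -> object:
        key = (urlsplit(uri).netloc, etag) if etag is not None else None
        if key is not None and key in self._parsed:
            # Same host, same ETag: assume (as a hint only) identical content.
            return self._parsed[key]
        self.parse_calls += 1
        result = self._parse(body)
        if key is not None:
            self._parsed[key] = result
        return result
```

This only saves the parsing cost. The body has still been transferred for every URI, which is part of why the 303 approach looks cheaper for ontologies.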
Re: Is 303 really necessary?
It has been pointed out to me that the many resources we are encountering for http://opengraphprotocol.org/schema/latitude are actually wrong - so deserving a 404, the resource should correctly be written: http://ogp.me/ns#latitude But never mind, that doesn't resolve either... On 04/11/10 18:38, Robert Fuller wrote: Hi, Feel free anyone to suggest opengraph use 301, 302, 303, 307 (we support them all), since at the moment with a 404 they are missing out on all the benefit of the sindice reasoner ;-) http://opengraphprotocol.org/schema/latitude It is common when publishing an ontology to have the url for each property redirect to the rdf schema. It works great. I would expect that a request for the aforementioned url (with accept header set correctly) would redirect me to (probably) http://opengraphprotocol.org/schema Which would download nicely with a 200 status code (it doesn't, you need to get the ontology from here http://opengraphprotocol.org/schema/?format=rdf ) Later, when we encounter another opengraph property http://opengraphprotocol.org/schema/longitude We would also hope to get a 303, which would again redirect us to http://opengraphprotocol.org/schema Of course, we don't want to bring down the opengraph server, so we have already cached the schema the first time we downloaded (if it worked) and know not to fetch it again now. In my experience processing millions of rdf documents daily, the 303 has proven quite useful and very efficient, and I would definitely recommend its use to opengraph and other publishers of ontologies. Robert. On 04/11/10 13:22, Ian Davis wrote: Hi all, The subject of this email is the title of a blog post I wrote last night questioning whether we actually need to continue with the 303 redirect approach for Linked Data. My suggestion is that replacing it with a 200 is in practice harmless and that nothing actually breaks on the web. Please take a moment to read it if you are interested. 
http://iand.posterous.com/is-303-really-necessary Cheers, Ian -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
Re: Is 303 really necessary?
Hi, Feel free anyone to suggest opengraph use 301, 302, 303, 307 (we support them all), since at the moment with a 404 they are missing out on all the benefit of the sindice reasoner ;-) http://opengraphprotocol.org/schema/latitude It is common when publishing an ontology to have the url for each property redirect to the rdf schema. It works great. I would expect that a request for the aforementioned url (with accept header set correctly) would redirect me to (probably) http://opengraphprotocol.org/schema Which would download nicely with a 200 status code (it doesn't, you need to get the ontology from here http://opengraphprotocol.org/schema/?format=rdf ) Later, when we encounter another opengraph property http://opengraphprotocol.org/schema/longitude We would also hope to get a 303, which would again redirect us to http://opengraphprotocol.org/schema Of course, we don't want to bring down the opengraph server, so we have already cached the schema the first time we downloaded (if it worked) and know not to fetch it again now. In my experience processing millions of rdf documents daily, the 303 has proven quite useful and very efficient, and I would definitely recommend its use to opengraph and other publishers of ontologies. Robert. On 04/11/10 13:22, Ian Davis wrote: Hi all, The subject of this email is the title of a blog post I wrote last night questioning whether we actually need to continue with the 303 redirect approach for Linked Data. My suggestion is that replacing it with a 200 is in practice harmless and that nothing actually breaks on the web. Please take a moment to read it if you are interested. http://iand.posterous.com/is-303-really-necessary Cheers, Ian -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
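On the publisher side, the pattern described above (each property URL redirecting to the one schema document) might be sketched like this with Python's standard `http.server`. The `/schema` path, the `redirect_target` helper and the `SchemaHandler` class are illustrative assumptions, not a description of how any real site is configured, and the handler ignores content negotiation entirely.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional

SCHEMA_PATH = "/schema"  # assumed location of the single schema document


def redirect_target(path: str) -> Optional[str]:
    """Return the schema path for a property path such as /schema/latitude."""
    prefix = SCHEMA_PATH + "/"
    if path.startswith(prefix) and len(path) > len(prefix):
        return SCHEMA_PATH
    return None


class SchemaHandler(BaseHTTPRequestHandler):
    """Answers property URLs with a 303 and the schema URL with a 200."""

    def do_GET(self) -> None:
        target = redirect_target(self.path)
        if target is not None:
            # Body-less redirect: the client fetches (or already caches) the schema.
            self.send_response(303)
            self.send_header("Location", target)
            self.end_headers()
        elif self.path == SCHEMA_PATH:
            body = b"# the RDF schema would be served here\n"  # placeholder content
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)
```

To try it locally, something like `HTTPServer(("localhost", 8000), SchemaHandler).serve_forever()` should serve it. Requests for `/schema/latitude` and `/schema/longitude` then both redirect to `/schema`, so a caching client downloads the schema once.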
Re: Subjects as Literals
+1 On 06/07/10 09:23, Danny Ayers wrote: I've been studiously avoiding this rat king of a thread, but just on this suggestion: On 2 July 2010 11:16, Reto Bachmann-Gmuer wrote: ... Serialization formats could support "Jo" :nameOf :Jo as a shortcut for [ owl:sameAs "Jo"; :nameOf :Jo] and a store could (internally) store the latter as "Jo" :nameOf :Jo for compactness and efficiency. what about keeping the internal storage idea, but instead of owl:sameAs, using: :Jo rdfs:value "Jo" together with :Jo rdf:type rdfs:Literal ? Cheers, Danny. -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
Re: Show me the money - (was Subjects as Literals)
Saw them, smiled, threw them in the bin. I can't present a use case for "Literals as Subject", but I did have a relevant experience recently when having written a reasoner for sindice I was briefly intrigued to discover that executing some owl rules leads to a production of statements where literals appear in the subject position. As the reasoner was written primarily with performance and memory constraints in mind, it never occurred to me to investigate whether the principles of rdf inferencing prohibit generating such statements. But since triples with literal in the subject position are currently not of any interest to us, we simply discard them during a filtering phase. Kind regards, Robert On 01/07/10 17:05, John Erickson wrote: RE getting "a full list of the benefits," surely if it's being discussed here, "Literals as Subjects" must be *somebody's* Real(tm) Problem and the benefits are inherent in its solution? And if it isn't, um, why is it being discussed here? ;) On Thu, Jul 1, 2010 at 11:46 AM, Henry Story wrote: Jeremy, the point is to start the process, but put it on a low burner, so that in 4-5 years time, you will be able to sell a whole new RDF+ suite to your customers with this new benefit. ;-) On 1 Jul 2010, at 17:38, Jeremy Carroll wrote: I am still not hearing any argument to justify the costs of literals as subjects I have loads and loads of code, both open source and commercial that assumes throughout that a node in a subject position is not a literal, and a node in a predicate position is a URI node. but is that really correct? Because bnodes can be names for literals, and so you really do have literals in subject positions No? Of course, the "correct" thing to do is to allow all three node types in all three positions. (Well four if we take the graph name as well!) But if we make a change, all of my code base will need to be checked for this issue. 
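The filtering phase Robert mentions above could be as simple as the sketch below. The node classes and the `drop_literal_subjects` function are hypothetical names invented for illustration; nothing here is taken from the Sindice reasoner. The example data assumes an inverse-property rule as one way such a triple can arise: given `:name owl:inverseOf :nameOf` and `:Jo :name "Jo"`, the rule yields `"Jo" :nameOf :Jo`.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple, Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


Node = Union[URI, BNode, Literal]
Triple = Tuple[Node, Node, Node]  # (subject, predicate, object)


def drop_literal_subjects(triples: Iterable[Triple]) -> Iterator[Triple]:
    """Yield only the triples whose subject is not a literal."""
    for triple in triples:
        if not isinstance(triple[0], Literal):
            yield triple


# Inferred statements, including one with a literal in the subject position.
inferred = [
    (URI(":Jo"), URI(":name"), Literal("Jo")),
    (Literal("Jo"), URI(":nameOf"), URI(":Jo")),
]
kept = list(drop_literal_subjects(inferred))
```

After the filter, `kept` holds only the first triple. Writing the filter as a generator lets it sit at the end of a streaming pipeline without holding the whole inference output in memory.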
This costs my company maybe $100K (very roughly). No one has even shown me $1K of advantage for this change. I agree, it would be good to get a full list of the benefits. It is a no brainer not to do the fix even if it is technically correct. Jeremy -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
Re: Please stop massive crawling against http://openean.kaufkauf.net/id/
Kingsley Idehen wrote: The LOD Cloud Cache at DERI is a live Virtuoso instance with 15 Billion+ Triples loaded. It covers as much of the LOD Cloud as we've been able to get our hands on plus 6.4 Billion Triples from the Data.Gov effort. I'll drop a more detailed note about this instance (via blog post) once we are done with data loading (there's a massive collection of eCommerce oriented Products & Services data to be loaded amongst others). I wonder is this data load the culprit responsible for the "massive crawling"? -- Robert Fuller Research Associate Sindice Team DERI, Galway http://sindice.com/
Re: Please stop massive crawling against http://openean.kaufkauf.net/id/
Hi, Sindice clearly identifies itself in the user agent http header. Currently we use these user agents: 1. "Mozilla/5.0 (compatible; sindice-fetcher/0.1.0 +http://sindice.com/developers/bot)" 2. "SindiceFetcher/Ping Manager (http://sindice.com/developers/bot)" 3. "sindice.net ontology fetcher" Niceness is implemented in our main fetcher. In some cases there may be bursts on sites providing distributed ontologies. Speaking with the group here, it seems unlikely that we have been hitting kaufkauf.net; however, if you can provide an IP address I can do some further verification. I understand that http://lod.openlinksw.com/sparql is now hosted at DERI, and I wonder could some of the traffic be related to that? Again, if you can provide an IP address I will do some further verification. Kind regards, Rob. -- Robert Fuller Research Associate DERI, Galway
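For readers unfamiliar with crawler "niceness", the general idea of identifying the bot and spacing out requests per host can be sketched as below. The message does not say how Sindice implements it, so the `PoliteScheduler` class, the `build_request` helper and the delay value are assumptions. Only the user agent string is taken from the message above.

```python
import time
import urllib.request
from typing import Callable, Dict
from urllib.parse import urlsplit

USER_AGENT = ("Mozilla/5.0 (compatible; sindice-fetcher/0.1.0 "
              "+http://sindice.com/developers/bot)")


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that identifies the crawler in its User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


class PoliteScheduler:
    """Spaces out fetches so each host sees at most one per delay interval."""

    def __init__(self, min_delay_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._delay = min_delay_seconds          # assumed per-host gap
        self._clock = clock
        self._next_allowed: Dict[str, float] = {}  # host -> earliest next fetch

    def reserve(self, url: str) -> float:
        """Reserve the next fetch slot for the URL's host; return seconds to wait."""
        host = urlsplit(url).netloc
        now = self._clock()
        start = max(now, self._next_allowed.get(host, now))
        self._next_allowed[host] = start + self._delay
        return start - now
```

A fetcher would call `time.sleep(scheduler.reserve(url))` before opening `build_request(url)`. Because slots are tracked per host, a burst of URLs on one site queues up while other hosts are unaffected.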