Re: BOF meeting on Semantic Web Search Engines at WWW 2008

2008-04-17 Thread Giovanni Tummarello
with us, as we are planning to use skype or similar things to allow distant participations. Cheers, Mathieu d'Aquin (Watson) Giovanni Tummarello (Sindice) [1] http://tinyurl.com/3m7ufj (note that the page is editable if you want to add your view

Re: Using Linking Open Data datasets

2008-05-29 Thread Giovanni Tummarello
endpoints/sites? Cheers, Peter 2008/5/30 Giovanni Tummarello [EMAIL PROTECTED]: A validator in Sindice is possible and has been discussed but the list of things to do is now quite scary :-) poor man's validator: please post to us about your sitemap here http://forum.sindice.com/index.php . Free report

Re: The king is dressed in void

2008-06-12 Thread Giovanni Tummarello
as a void:example_file) The rest of the descriptions seem to be allowed for by current vocabularies such as foaf and dc so the actual specification will be very highly modular and hence easy to implement and agree on IMO. Cheers, Peter 2008/6/12 Giovanni Tummarello [EMAIL PROTECTED]: Wasn't RDF

Re: The king is dressed in void

2008-06-12 Thread Giovanni Tummarello
- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Giovanni Tummarello Sent: Thursday, June 12, 2008 12:08 AM To: Hausenblas, Michael Cc: public-lod@w3.org; Semantic Web Subject: The king is dressed in void Wasn't RDF all about being self-describing? If I say giovanni

Re: The king is dressed in void

2008-06-12 Thread Giovanni Tummarello
Hi Michael, let me clarify that it wasn't really meant to be "to: Michael"; you were just there when I replied to the general idea, and not to you in particular :-) step after semantic sitemaps (it actually is thought to extend it in terms of using the sc:datasetURI as the entry point, see also

Re: The king is dressed in void

2008-06-12 Thread Giovanni Tummarello
the slicing and sparql graph parts to describe a data set? Cheers, Peter 2008/6/13 Giovanni Tummarello [EMAIL PROTECTED]: All of your described functionalities are a subset of what semantic sitemaps are for [1]. Specs aside, the paper [2] might be of interest to some is that we went to some

Re: The king is dressed in void

2008-06-13 Thread Giovanni Tummarello
need to dig our hole deeper by showing yet more reinventing. Giovanni On Fri, Jun 13, 2008 at 1:30 AM, Peter Ansell [EMAIL PROTECTED] wrote: 2008/6/13 Giovanni Tummarello [EMAIL PROTECTED]: XML is a step forward. The thing started in RDF with something called semantic crawling ontology (sorry

Re: Semantic Web Search

2008-06-22 Thread Giovanni Tummarello
Hi Hugh, as far as Sindice is concerned, please just post your message on http://forum.sindice.com and we'll be able to follow your data case closely. As far as large datasets are concerned, the indexing is currently manual, that is, we must personally know of the dataset (e.g. from a post in the

Re: How do you deprecate URIs? Re: OWL-DL and linked data

2008-07-05 Thread Giovanni Tummarello
://sws.geonames.org/2950157/ Actually what we need is a namespace and vocabulary for all those flavors of URI similarity and equivalence to be used on the Web, different from the OWL and RDFS namespaces. Bernard Giovanni Tummarello wrote: http:... or something equivalent, not a reference http

Re: Announcing Open GUID

2008-09-25 Thread Giovanni Tummarello
Hi Jason, I believe you're pursuing exactly the same goal as the Okkam project (http://okkam.org). Unlike Okkam, however, you have something up already at a nice, visible, uncluttered website. This mail of mine is just so that you know that there is this common research effort and in fact to say

Re: Oh No. The Global Mind does not know Madonna divorced Guy Ritchie!

2008-10-19 Thread Giovanni Tummarello
Just out of the RDF playground for a second, http://www.evri.com/mainline-ui/jsp/index.jsf seems to know.. Madonna divorce Guy Ritchie it will give you 5 sources on the web that say that. (last 3 days..) Madonna * Guy Ritchie returns many more things (some of which are noisy etc..) but

Re: Linked data at Freebase

2008-10-29 Thread Giovanni Tummarello
Should be possible, same way as Google indexes the other pages. If they get a semantic sitemap online it would be much better, will ask for it. Giovanni On Wed, Oct 29, 2008 at 10:20 AM, Andreas Langegger [EMAIL PROTECTED] wrote: ain't that funny? After Alan's talk last week at WOD-PD we

Re: Size matters -- How big is the danged thing

2008-11-19 Thread Giovanni Tummarello
Hi Jim, honestly, a count job we launched some time ago gave us something less than a billion on Sindice actually (but we currently don't index uniprot, which is a big one). We'll be publishing live stats soon. But what about wrappers (e.g. flickr wrappers of keyword searches), that's a

Re: Size matters -- How big is the danged thing

2008-11-19 Thread Giovanni Tummarello
Hi when people liked to draw maps of the WWW, and these really quickly disappeared when it got big. I hope that happens to the Data Web, too. Hopefully soon. But my current estimate is that the Data Web is probably This has happened already, for the Data Web as in Microformat world and

Re: Size matters -- How big is the danged thing

2008-11-20 Thread Giovanni Tummarello
dbtune.org provides at least 14 billion triples (see http://blog.dbtune.org/post/2008/04/02/DBTune-is-providing-131-billion-triples + the Musicbrainz D2R server at http://dbtune.org/musicbrainz/, so I guess you'd need a pretty big phone to aggregate all that :-) .. thus the problem with

Re: Size matters -- How big is the danged thing

2008-11-21 Thread Giovanni Tummarello
Overall, that's about 17 billion. IMO considering myspace 12 billion triples as part of LOD, is quite a stretch (same with other wrappers) unless they are provided by the entity itself (E.g. i WOULD count in livejournal foaf file on the other hand, ok they're not linked but they're not less

Re: linked data mashups

2008-11-24 Thread Giovanni Tummarello
rdfs:seeAlso links, and by querying the Sindice search engine. The library Cool this is the original number 1 task Sindice was conceived to do, that is provide the inverse of the seeAlso, the inverse links for automatic mashups. Happy to be of use :-) (now all that people have to do is reuse

Re: Some FOAF services

2008-11-30 Thread Giovanni Tummarello
Hi Misha, would you have a comparison between this and the google social graph api? I understand that also follows FOAF links (e.g. see livejournal etc). I guess they're less specialized however? Giovanni On Wed, Nov 26, 2008 at 3:25 PM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Hello, Am

Re: Do we need another list(s)? Was other things

2008-12-05 Thread Giovanni Tummarello
i agree on all your comments and believe me by talking to actual web 2.0 people you're way ahead. i'll try to answer some of your questions I then asked if they knew the value of Linked Data. The answer I got was well, i would think that my site would be easier to find right? i mean, i would

Re: Exercise: LOD questions (R) was ( Do we need another list(s)? )

2008-12-06 Thread Giovanni Tummarello
- My company has recently released an API for access to structured (database) data about 55 million companies and 35 million people. Do you think I should release this in an LOD format? How would my customers benefit. could be tricky usually one such api involves looking up and finind

Re: Can we lower the LD entry cost please (part 1)?

2009-02-07 Thread Giovanni Tummarello
Yves, just on the side, yes there is not much dbtune in Sindice. just a few http://sindice.com/search?q=dbtune&qt=term if you have an RDF dump of the site or of part of it and you express it in a semantic sitemap you would be indexed in full in a very short time. Otherwise we should have the ne

Re: [ANN] DBpedia Lookup

2009-02-10 Thread Giovanni Tummarello
I hope that DBpedia Lookup is useful for you, and I'd appreciate any feedback. URI lookup as well as other searches are important, so to facilitate other LOD dataset providers to also do this i'd suggest they simply wrap around Sindice and take the first results, e.g.

ANN: DERI Pipes 0.6

2009-02-14 Thread Giovanni Tummarello
/?group_id=227929 WWW 2009 Research Track paper: Danh Le Phuoc, Axel Polleres, Christian Morbidoni, and Manfred Hauswirth, Giovanni Tummarello. Rapid semantic web mashup development through semantic web pipes. In Proceedings of the 18th World Wide Web Conference (WWW2009), Madrid, Spain, April 2009

Re: Update re. Virtuoso 6.0 Cluster Edition with LOD Cloud Hosting

2009-02-19 Thread Giovanni Tummarello
Wow. lots of stuff.. how many triples in total then? How many machines and of which kind? very interested Giovanni On Thu, Feb 19, 2009 at 10:38 PM, Kingsley Idehen kide...@openlinksw.com wrote: All, We now have part 1 of the Virtuoso 6.0 Cluster Edition with LOD hosting that includes: 1.

Fwd: DERI Pipes 0.6.5 available

2009-02-26 Thread Giovanni Tummarello
Forwarding from Robert Fuller, main guy and now the primary contact for support on the project. We have moved broken pipes away, sorry for the problems in the previous release. Giovanni -- Forwarded message -- From: Robert Fuller [DERI] robert.ful...@deri.org Date: Thu, Feb 26, 2009

Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Giovanni Tummarello
congrats and kudos to all those who've made this happen. I think the cloud diagrams are proving a very compelling visual for people who don't care about nerdy detail but understand the idea of interlinked datasets. Yes they're great for handwaving if the audience has never seen it, otherwise

Re: New LOD Cloud - Please send us links to missing data sources

2009-03-01 Thread Giovanni Tummarello
Hi Andreaz :-) I don't see the difference between the LOD model and the data (including links) itself. At least to us at Zemanta it is immensely helpful to have a lot of those links done. It brings down the cost of doing really innovative stuff to us and I believe to many others too. We

Re: Finding SPAQL endpoints?

2009-03-06 Thread Giovanni Tummarello
Hi Daniel, the Semantic Sitemap Extension does that well. (It also has the important task of telling the world that dbpedia is not 6 million RDF models but a single one which is split on the fly) http://sw.deri.org/2007/07/sitemapextension/ http://dbpedia.org/sitemap.xml Giovanni On Fri, Mar 6, 2009

Re: Finding SPAQL endpoints?

2009-03-07 Thread Giovanni Tummarello
Hi I could query the site for its sitemap extension (would it always be home url/sitemap.xml? doesn't seem so...), as Giovanni suggests, and see if I get a result; in the affirmative case, I have to parse it and look for the sc:sparqlEndpointLocation element. Sitemaps are either at
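The lookup described in this snippet (fetch the sitemap, parse it, look for the sc:sparqlEndpointLocation element) could be sketched roughly as below. This is only an illustration under stated assumptions: the `sc` namespace URI is what I recall the Semantic Sitemap extension using and should be checked against the spec page cited earlier in the thread, the sample document is invented, and fetching the sitemap over HTTP is left out on purpose.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the Semantic Sitemap extension ("sc" prefix).
# Verify it against http://sw.deri.org/2007/07/sitemapextension/ before relying on it.
SC_NS = "http://sw.deri.org/2007/07/sitemapextension/scschema.xsd"


def sparql_endpoints(sitemap_xml):
    """Return the text of every sc:sparqlEndpointLocation element in a sitemap."""
    root = ET.fromstring(sitemap_xml)
    tag = "{%s}sparqlEndpointLocation" % SC_NS
    return [el.text.strip() for el in root.iter(tag) if el.text and el.text.strip()]


# Invented example document, not taken from any real site.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:sc="http://sw.deri.org/2007/07/sitemapextension/scschema.xsd">
  <sc:dataset>
    <sc:datasetLabel>Example dataset</sc:datasetLabel>
    <sc:sparqlEndpointLocation>http://example.org/sparql</sc:sparqlEndpointLocation>
  </sc:dataset>
</urlset>"""

if __name__ == "__main__":
    print(sparql_endpoints(EXAMPLE))
```

An empty list simply means the sitemap advertises no endpoint; where the sitemap itself lives is a separate question that the snippet above starts to answer.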

Re: Finding SPAQL endpoints?

2009-03-09 Thread Giovanni Tummarello
I know what's missing :-): any real application that needs to do the automatic discovery etc and that someone would really want to use, i.e. not another academic demonstrator. If there was one such application people would put in the last 10 minutes of work. People do at this point go to the

ann: DERI Pipes 0.7 + dbpedia/opencalais example

2009-03-10 Thread Giovanni Tummarello
=== New Features [2009-03-04] Added sesame xpath functions library including concat, lowercase and uppercase Example query using fn:concat follows: PREFIX fn: <http://www.w3.org/2005/xpath-functions#> select ?name where {?s ?p ?name . FILTER ( ?name=fn:concat('Giovanni ','Tummarello

Re: Parsing Freebase RDF

2009-03-14 Thread Giovanni Tummarello
Hi Jamie, i see that your RDF per URI is more expressive than the usual instead of just giving triples out of (or into) the subject of the page you also give the description of other notable entities inside for example in the blade runner movie you give the full description of all the film

Re: LDOW2009 Workshop now publishing Linked Data (was Re: Linked Data on the Web (LDOW2009) workshop papers online.)

2009-03-19 Thread Giovanni Tummarello
The only reason to mint resolvable URIs is to allow fetching of a description i'd say that minting in other people's spaces is really asking for trouble and should be discouraged? one should, could, possibly put sameAs if some URI exists somewhere else. honestly? i don't even see the reason why

Re: LDOW2009 Workshop now publishing Linked Data (was Re: Linked Data on the Web (LDOW2009) workshop papers online.)

2009-03-19 Thread Giovanni Tummarello
if it's one source, then fine, the source is changed and it's indexed again. if it has been copied.. everybody loses, i'd say :-) Yes, Data Access by Reference is about not having to interact with Data by Value which requires localization of data in order to actually use the values :-) ..

Re: LDOW2009 Workshop now publishing Linked Data (was Re: Linked Data on the Web (LDOW2009) workshop papers online.)

2009-03-19 Thread Giovanni Tummarello
The point was this: _if_ you would like your data to be incorporated into the dogfood site, then it should have dogfood namespace URIs, otherwise we cannot serve it. We hope to offer people who want to contribute to the site what about creating local, arbitrary URIs, linked with sameas to the

Re: Keeping crawlers up-to-date

2009-04-28 Thread Giovanni Tummarello
Hi Yves, nothing can beat having a semantic sitemap [1]. Basically you say that you change once a day and give a link to the dump. Done :-) if you put it up i am ready to show in Sindice the information updated every day, and with no other cost for you than a single dump download. also the sitemap

Re: Keeping crawlers up-to-date

2009-04-28 Thread Giovanni Tummarello
Forced to mention RDFSync then (ISWC 2007) Giovanni Tummarello, Christian Morbidoni, Reto Bachmann-Gmür, Orri Erling RDFSync: efficient remote synchronization of RDF models http://semanticweb.deit.univpm.it/papers/RDFSyncISWC2007.pdf there was an implementation but it was just a proof

Re: Segment RDF on BBC Programmes

2009-05-04 Thread Giovanni Tummarello
RDFa will not generally negate the essential separation of Name (via URI.URN-URL) and Address (via URI.URL) since Linked Data oriented triples will still contain de-referencable URIs :-) if you can put the RDF and the human legible HTML version in the same address there is absolutely no

Re: bootstrapping decentralized sparql

2009-05-17 Thread Giovanni Tummarello
, May 17, 2009 at 3:08 AM, Peter Ansell ansell.pe...@gmail.com wrote: 2009/5/17 Giovanni Tummarello g.tummare...@gmail.com: for graphs which use a (specific) FOAF term.  It's a bit like PingTheSemanticWeb or Sindice, but decentralized based on the ontologies used. [] Isnt this like

Re: Dereferencing a URI vs querying a SPARQL endpoint

2009-05-21 Thread Giovanni Tummarello
Hi, there isn't a single answer unfortunately. Let's take symmetric concise bounded descriptions (SCBD), which basically means that from the URI you'll get the triples around it, recursively, until you find other URIs (so when you find a blank node you keep on going). This seems a pretty good way to provide
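To make the SCBD idea above concrete, here is a minimal sketch over an in-memory list of triples. It is one reading of the description in the snippet, not Sindice's implementation. The conventions are assumptions made for the example: triples are plain 3-tuples of strings, blank nodes are strings starting with "_:" (so a literal that happens to start that way would be misread), and all names in the sample data are invented.

```python
def is_blank(node):
    # Sketch-only convention: blank nodes are strings starting with "_:".
    return isinstance(node, str) and node.startswith("_:")


def _walk(triples, start, forward):
    """Collect triples leaving (forward) or entering (backward) start,
    continuing through blank nodes and stopping at URIs and literals."""
    found, seen, stack = set(), {start}, [start]
    while stack:
        node = stack.pop()
        for s, p, o in triples:
            near, far = (s, o) if forward else (o, s)
            if near != node:
                continue
            found.add((s, p, o))
            if is_blank(far) and far not in seen:
                seen.add(far)
                stack.append(far)
    return found


def scbd(triples, uri):
    """Symmetric description: outgoing and incoming triples around uri."""
    return _walk(triples, uri, True) | _walk(triples, uri, False)


# Invented sample graph.
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "_:b1"),
    ("_:b1", "foaf:name", "Bob"),
    ("_:b1", "foaf:homepage", "ex:bobpage"),
    ("ex:bobpage", "dc:title", "Bob's page"),
    ("ex:carol", "foaf:knows", "ex:alice"),
    ("ex:carol", "foaf:name", "Carol"),
]

if __name__ == "__main__":
    for triple in sorted(scbd(TRIPLES, "ex:alice")):
        print(triple)
```

In the sample graph the walk continues through the blank node `_:b1` but stops at `ex:bobpage` and `ex:carol`, so their own descriptions are not pulled in.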

Re: BobQL? Boxes of (related) boxes ...

2009-05-31 Thread Giovanni Tummarello
Hi Dan, storing (and being able to re-execute) this journey reminds me of the driving inspiration behind DERI Pipes. Pipes have an underlying XML representation language which stores the recipe for processing one or more RDFs. arbitrary operators can select data out of it, returns another RDF

Re: ANN: sameas.org

2009-06-03 Thread Giovanni Tummarello
Cool Hugh :-) great ajaxi thing as well. if you don't do this already it might make sense to also add from your page you say There is currently no service to enable arbitrary contribution to the contents. If you have significant data you would be prepared to give us, then please contact us at the

Re: sameas.org

2009-06-04 Thread Giovanni Tummarello
a New Zealander and a Kiwifruit) throws up a radio station, an animated cartoon and lots of wordnet links to a juggle of plumbing but no juice.  No sign of http://dbpedia.org/resource/Kiwi however Ah. We only look at the first n results from Sindice, and clearly kiwi is a popular name.

Re: Common Tag - semantic tagging convention

2009-06-12 Thread Giovanni Tummarello
On Fri, Jun 12, 2009 at 9:44 AM, Toby Inkstert...@g5n.co.uk wrote: On Fri, 2009-06-12 at 01:33 +0200, Andraz Tori wrote: also to note is that there exist proper mappings to other efforts at tagging ontologies: http://commontag.org/mappings The question is though, will Search Monkey, Sindice,

Re: http://ld2sd.deri.org/lod-ng-tutorial/

2009-06-23 Thread Giovanni Tummarello
Just a remark about what we're doing in Sindice, for all who want to be indexed properly by us. we recursively dereference the properties that are used thus trying to obtain a closure over the description of the properties that are used. We also consider OWL imports. When the recursive fetching

Re: RDFa vs RDF/XML and content negotiation

2009-06-23 Thread Giovanni Tummarello
Just RDFa and live happy IMO. A machine doesn't care about the messy part of the markup. The advantage of a single URL to access is too much to be a match for anything. It is a fact that people like us like to look at RDF directly as well. But it should be a problem to use a firefox plugin to

Re: http://ld2sd.deri.org/lod-ng-tutorial/

2009-06-23 Thread Giovanni Tummarello
Martin, partially you could solve the problem yourself by putting the owl:import triples in your ontology fragments e.g. the fragment, when served, says owl import so that you're sure the ontology is used as a whole.. would this do it? :-) fixing the problem in a single location might be so much

Re: http://ld2sd.deri.org/lod-ng-tutorial/

2009-06-25 Thread Giovanni Tummarello
/, Giovanni Tummarello http://www.deri.ie/about/team/member/giovanni_tummarello/, Stefan Decker http://www.deri.ie/about/team/member/stefan_decker/ *Context Dependent Reasoning for Semantic Documents in Sindice.* In *Proceedings of the 4th International Workshop on Scalable Semantic Web Knowledge Base

Re: Dons flame resistant (3 hours) interface about Linked Data URIs

2009-07-10 Thread Giovanni Tummarello
I answer to Toby just because it's handy to do so but i just want to make a general statement. Toby is stating the classical view, clean knowledge representation, 0% dealing with ambiguity. What Hugh is hinting at is that the complexity of the clean solution is overwhelming since it is

Sig.ma - live views on the web of data

2009-07-22 Thread Giovanni Tummarello
Dear Web of Data enthusiasts, we are very happy to share with you today the first public version of Sigma, http://sig.ma , a browser, a mashup engine and an API for the web of data. here is blog post with screencast, sample sigma embedded mashup etc.

Re: Sig.ma - live views on the web of data

2009-07-27 Thread Giovanni Tummarello
/23 Giovanni Tummarello giovanni.tummare...@deri.org: Dear Web of Data enthusiasts, we are very happy to share with you today the first public version of Sigma,  http://sig.ma ,  a browser, a mashup engine and an API for the web of data. here is blog post with screencast, sample sigma embedded

ANN: Sparallax! - Browse sets of things together (now those on your SPARQL endpoint)

2009-07-27 Thread Giovanni Tummarello
Dear Semi Structured Data Enthusiasts, we are today pleased to announce version 1 of Sparallax Sparallax is an adaptation of the  FreeBase Parallax to use SPARQL endpoints. Thanks to a proxy and query translation modules (SPARQL to MQL and results translated back), Sparallax is minimally

Re: ANN: Sparallax! - Browse sets of things together (now those on your SPARQL endpoint)

2009-07-27 Thread Giovanni Tummarello
Hi Kingsley, we are a bit unsure about your complaint, please clarify, do you mean to say that sparallax gives that user agent when trying to connect to an external sparql endpoint? we tried and got the user agent of the browser. Not sure how it is important to use a user agent instead of

Re: ANN: BestBuy.com starts publishing full catalog as RDF/XML using GoodRelations - 27 million triples

2009-09-01 Thread Giovanni Tummarello
-unibw.org wrote: Hi Giovanni: Giovanni Tummarello wrote: Hi Martin, all, the sitemap exposed is not a Semantic Sitemap Semantic Sitemap: http://products.semweb.bestbuy.com/sitemap.xml but simply gives the location of the dumps. As far as I see, the sitemap at http

Re: ANN: BestBuy.com starts publishing full catalog as RDF/XML using GoodRelations - 27 million triples

2009-09-01 Thread Giovanni Tummarello
-unibw.org] Sent: Tuesday, September 01, 2009 8:14 AM To: giovanni.tummare...@deri.org Cc: public-lod@w3.org Subject: Re: ANN: BestBuy.com starts publishing full catalog as RDF/XML using GoodRelations - 27 million triples Hi Giovanni: Giovanni Tummarello wrote: Hi Martin, all

Re: dbpedia not very visible, nor fun

2009-09-15 Thread Giovanni Tummarello
*Promotion* :-) Accessing dbpedia with sparallax http://sparallax.deri.ie

New tools: Sindice Inspector, Full Cache API – all with Online Data Reasoning

2009-10-12 Thread Giovanni Tummarello
Full announcement at http://blog.sindice.com/2009/10/12/new-inspector-full-cache-api-all-with-online-data-reasoning/ quotable text: --- We’re happy to release today 2 distinct yet interplaying features in Sindice: The Sindice Inspector and the Sindice Cache API (both including

Re: Mirror for PIPS food ontology?

2009-10-15 Thread Giovanni Tummarello
Kind of makes me think.. we could put it virtually back in the same place as originally on our Sindice cache [1] i wonder if the operation makes sense.. on the one hand a cache is usually intended for reflecting reality, on the other i'd see obvious practical advantages. Maybe we could offer an

Re: The Power of Virtuoso Sponger Technology

2009-10-17 Thread Giovanni Tummarello
With respect to crawling and scraping or sponging or .. trying to guess based on partial fragments of structured information i can say 3 things a) No, we're not doing it at the moment, we are only covering those who chose to put structured semantics. Some book stuff shows up in Sig.ma .. e.g.

Re: The Power of Virtuoso Sponger Technology

2009-10-18 Thread Giovanni Tummarello
I'd say, if i understand well, that that works only for queries where you need the extra dereferenced data just additionally e.g. to add a label to your result set; if you need the remote, on the fly reference data to e.g. sort by price you'd have to fetch all from the remote site .. Gio On Sun,

Re: The Power of Virtuoso Sponger Technology

2009-10-18 Thread Giovanni Tummarello
Giovanni Tummarello wrote: With respect to crawling and scraping or sponging or .. trying to guess based on partial fragments of structured information i can say 3 things a) No, we're not doing it at the moment, we are only covering those who chose to put structured semantics. Some book stuff

Re: The Power of Virtuoso Sponger Technology

2009-10-18 Thread Giovanni Tummarello
A) The wrapper's Semantic Sitemap points you at the original Sitemap, and says how it is doing the wrapping. And because you know how the wrapper is behaving, you can process the standard Sitemap to get the information you want about what the wrapping site provides. Actually, the slicing in

Re: ISWC2009 Metadata Available

2009-10-23 Thread Giovanni Tummarello
- general chair Enrico Motta: http://data.semanticweb.org/person/enrico-motta (see that is general chair 2009) - a paper from the research track: http://data.semanticweb.org/conference/iswc/2009/paper/research/311 - a workshop at ISWC2009:

Re: Triple materialization at publisher level

2010-04-06 Thread Giovanni Tummarello
Wrt this, i feel like sharing how we address this issue in Sindice and the tools we provide. We do materialization at central level following recursively the links to ontologies e.g. by resolving property names. This allows data producers to be considerably more concise in the markup (e.g. think

Re: Triple materialization at publisher level

2010-04-06 Thread Giovanni Tummarello
Hi Vasily yes, you can use Sindice for that purpose. either from asking data from the full reasoned cache (ask away ,we can serve plenty) or from the reasoning API (with a bit of moderation, it is an intense process although we do have many layers of caching) a blog post about the details

Re: [Patterns] Materialize Inferences (was Re: Triple materialization at publisher level)

2010-04-07 Thread Giovanni Tummarello
change or better reasoning happens or new data etc) + serialization not fully performed automatically would seem unrealistic On Wed, Apr 7, 2010 at 12:38 PM, Vasiliy Faronov vfaro...@gmail.com wrote: Giovanni Tummarello wrote: In this case materialization is likely not going to happen much

Re: What would you build with a web of data?

2010-04-11 Thread Giovanni Tummarello
+1 thanks Nathan for pointing this out, very very relevant. luckily so far it seems a bit too rooted in MS stack of things (just looking at it very very superficially) :-)? Gio ps: realistically there's the whole microsoft thing to keep in the back of our minds; they have pretty much a

Re: RDF Dataset Notifications

2010-04-17 Thread Giovanni Tummarello
Hi Leigh i tell you what we're going to be supporting in Sindice very soon and it would be great if you could add it to the table: simple existing sitemaps :-). Sitemaps provide the list of URLs to crawl and for each one either a last updated field or update frequency. If the website cares to
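As a rough illustration of how a crawler could use the "last updated" field mentioned here, the sketch below picks the sitemap URLs whose lastmod is on or after the previous crawl date. It is not how Sindice does it. The function name and sample data are invented, only date-level precision is handled, and URLs without a lastmod are re-fetched conservatively because nothing can be said about them.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Standard sitemaps.org namespace, in ElementTree's "{uri}" notation.
SM = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def changed_since(sitemap_xml, last_crawl):
    """Return the URLs to re-fetch: lastmod on or after last_crawl, or no lastmod."""
    root = ET.fromstring(sitemap_xml)
    to_fetch = []
    for url in root.iter(SM + "url"):
        loc = url.findtext(SM + "loc")
        if not loc:
            continue
        lastmod = url.findtext(SM + "lastmod")
        if lastmod is None:
            to_fetch.append(loc.strip())  # unknown freshness: be conservative
            continue
        # Assumes at least YYYY-MM-DD; shorter forms (YYYY, YYYY-MM) would raise.
        modified = date.fromisoformat(lastmod.strip()[:10])
        if modified >= last_crawl:
            to_fetch.append(loc.strip())
    return to_fetch


# Invented example sitemap.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.org/a</loc><lastmod>2010-04-10</lastmod></url>
  <url><loc>http://example.org/b</loc><lastmod>2010-04-16T08:30:00+00:00</lastmod></url>
  <url><loc>http://example.org/c</loc></url>
</urlset>"""

if __name__ == "__main__":
    print(changed_since(EXAMPLE, date(2010, 4, 15)))
```

Comparing with "on or after" rather than "after" trades a few redundant fetches for not missing a page that changed later on the crawl day.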

Re: Semantic black holes at sameas.org Re: [GeoNames] LOD mappings

2010-04-23 Thread Giovanni Tummarello
sws.geonames URIs, SPARQL endpoint etc. Bearing in mind that Geonames.org has no dedicated resources for it, who will care of that in a scalable way? What is the business model? Good questions. Volunteers, step forward :) Bernard Hi Bernard, the need to automatically interlink at large

Re: Semantic black holes at sameas.org Re: [GeoNames] LOD mappings

2010-04-23 Thread Giovanni Tummarello
so hang on tight a bit.. we're working on this, just continue publishing high quality data with good entity descriptions (as much as you know about YOUR stuff), and the links will come to you just like that at some point. I promise :) WOW ... rings a bell ...and all these things will be

Sindice real time widget/api, and news feed

2010-04-26 Thread Giovanni Tummarello
Hi all, A new version of the Sindice frontend with some interesting improvements. e.g. a realtime data widget on the homepage, and the new API to restrict to new day documents (or weekly) etc. http://sindice.com Also Facebook support for RDFa is making the web now bubble with new triples. See

Hiring opportunities in Sindice

2010-06-12 Thread Giovanni Tummarello
For the interested, within several new EU projects there are now hiring opportunities available to work on Sindice current and future services: cloud computing postdoc/researcher, cloud/semantic/integration developers. Internships also available with possible ph.d continuation. Good community

Re: DBpedia-Live and Delta Exposure

2010-06-19 Thread Giovanni Tummarello
Hi there :-) looks very cool. could you please point us to the specifics of protocol? so we can start considering integrating in Sindice Note: we're about to announce (monday?) delta support in Sindice based on Sitemaps lastmod which seems to be the easiest possible for the HTML+ RDFa world.

Efficient Data discovery and Sync Support - proposed method and Sindice implementation

2010-07-08 Thread Giovanni Tummarello
Apologies for cross posting - Dear all So far semantic web search engines and semantic aggregation services have been inserting datasets by hand or have been based on random walk like crawls with no data completeness or freshness guarantees. After quite some work, we are happy to

Re: Linked Data and IRI dereferencing (scale limits?)

2010-08-05 Thread Giovanni Tummarello
Jorn you're right. linked data with plain dereferenceable URIs just plain doesn't work once you move from the simplest examples. This is for some of the reasons you mention as well as others (e.g. how do you really ask what are the 1000 URIs most visited (assuming this was in the DB) or the

Re: Linked Data and IRI dereferencing (scale limits?)

2010-08-06 Thread Giovanni Tummarello
Only solution for you now is to use SPARQL instead of resolving the URI. Much less traffic and it would actually work SPARQL doesn't make the problem go away, it just pushes the limits further out. SPARQL endpoints that see significant traffic have similar restrictions built in, either on

Re: Linked Data and IRI dereferencing (scale limits?)

2010-08-06 Thread Giovanni Tummarello
Thanks Paul, this sort of feedback is indeed tremendously useful, I somehow just wish you had had 1/10th of the replies of the subjects as literal thread.:-) Gio (obviously we're talking business of LOD at large and the true state of it despite the growing number of lines in the lod cloud diagram.

Re: Deltas of RDF files from Sindice or other site?

2010-10-03 Thread Giovanni Tummarello
Hi Mattihas, sorry for the delay. it is indeed a possible API which we call longstanding query or notification API. Not yet available, but we have many requests for it so it will come. my advice at the moment would be to do it yourself client side using say a DB state and fetching the data from

Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices

2010-10-21 Thread Giovanni Tummarello
But again: I agree that crawling the Web of Data and then deriving a dataset catalog as well as meta-data about the datasets directly from the crawled data would be clearly preferable and would also scale way better. Thus: Could please somebody start a crawler and build such a catalog? As

Re: Is 303 really necessary?

2010-11-04 Thread Giovanni Tummarello
Hi Ian, no, it's not needed, see this discussion http://lists.w3.org/Archives/Public/semantic-web/2007Jul/0086.html pointing to 203 406 or others.. ..but a number of social community mechanisms will activate if you bring this up, ranging from russian style you're being antipatriotic criticizing the

Re: Is 303 really necessary?

2010-11-04 Thread Giovanni Tummarello
I think it's an orthogonal issue to the one RDFa solves. How should I use RDFa to respond to requests to http://iandavis.com/id/me which is a URI that denotes me? hashless? mm one could be to return HTML + RDFa describing yourself. add a triple saying http://iandavis.com/id/me

Re: Is 303 really necessary - demo

2010-11-05 Thread Giovanni Tummarello
I might be wrong but I don't like it much. Sindice would index it as 2 documents. http://iandavis.com/2010/303/toucan http://iandavis.com/2010/303/toucan.rdf i *really* would NOT want two different URLs resolving to the same thing thanks Giovanni On Fri, Nov 5, 2010 at 10:43 AM, Ian Davis

200 OK with Content-Location might work: But maybe it can be simpler?

2010-11-05 Thread Giovanni Tummarello
How about something that's totally independent from HEADER issues? think normal people here. absolutely 0 interest to mess with headers and http responses.. absolutely no business incentive to do it. as a baseline think someone wanting to annotate with RDFa a hand crafted, Apache-served html

Re: A(nother) Guide to Publishing Linked Data Without Redirects

2010-11-10 Thread Giovanni Tummarello
Bravo Harry :-) let me also add without adding anything to the header.. *keeping HTTP completely outside the picture* HTTP headers are for pure optimization issues, almost networking level. Caching fetching crawling, nothing to do with semantics. A conjecture: the right howto document is about 2

Re: survey: who uses the triple foaf:name rdfs:subPropertyOf rdfs:label?

2010-11-12 Thread Giovanni Tummarello
Yes Sig.ma heavily checks for properties that are subproperties of label and uses them. I think sparallax as well. Gio On Fri, Nov 12, 2010 at 12:08 PM, Dan Brickley dan...@danbri.org wrote: Dear all, The FOAF RDFS/OWL document currently includes the triple foaf:name rdfs:subPropertyOf

Re: Is 303 really necessary?

2010-11-28 Thread Giovanni Tummarello
- the rest of the web continue to use 200 Tim yes but the rest of the web will use 200 also to show what we would consider 208, e.g. http://www.rottentomatoes.com/celebrity/antonio_banderas/ see the triples

Re: ANN: geometry2rdf software library

2011-01-17 Thread Giovanni Tummarello
Boris, would you be able to provide a bit of explanation on why you would want to do that, e.g. what evidence is there (nice use cases) where an RDF export of low-level features in the map is of use? Thanks! Gio On Mon, Jan 17, 2011 at 2:34 AM, Boris Villazón Terrazas bvilla...@fi.upm.es wrote:

Re: data schema / vocabulary / ontology / repositories

2011-03-13 Thread Giovanni Tummarello
To the best of my knowledge there isn't anything out there that one could call modern and up to date. Something modern and credible would be backed by actual data plus social signals (votes, comments, etc.). As said in the past, we at Sindice would be delighted to provide the data part if anyone wanted to

Re: How many instances of foaf:Person are there in the LOD Cloud?

2011-04-13 Thread Giovanni Tummarello
sindice.com main index has 37,312,159 documents occurrences of foaf:person. http://sindice.com/search?q=foaf%3Aperson (a lot of these come from microformats via the any23 library but anyway) which means there are many more actual persons inside. Gio On Wed, Apr 13, 2011 at 10:15 AM, Bernard

Re: How many instances of foaf:Person are there in the LOD Cloud?

2011-04-13 Thread Giovanni Tummarello
, Apr 13, 2011 at 4:48 PM, Giovanni Tummarello giovanni.tummare...@deri.org wrote: sindice.com main index has 37,312,159 documents occurrences of  foaf:person. http://sindice.com/search?q=foaf%3Aperson (a lot of these come from microformats via the any23 library but anyway) which means

Re: Minting URIs: how to deal with unknown data structures

2011-04-16 Thread Giovanni Tummarello
Hi Frank, my 2c from the Sindice.com point of view.. (as we struggle to actually make use of all this and make it easy for others to use) I wouldn't really worry too much, just give to the machines what you'd give to humans. Technically that simply means making sure all the pages you display (and that

Re: Schema.org in RDF ... expected Types in RDFS

2011-06-06 Thread Giovanni Tummarello
So, can someone clarify, if possible, whether if I publish a page using RDFa and schema.rdfs.org syntax, it will be properly parsed and indexed in any of those search engines? That's all they'd have to say not to piss people off, but they decided not to do it. Didn't cost anything. pretty

Re: Schema.org in RDF ...

2011-06-09 Thread Giovanni Tummarello
my2c: I would seriously advise against using triples with http://schema.rdfs.org . That would be totally and entirely validating their claim that either you impose things or fragmentation will destroy everything, and that talking to the community is a waste of time. For how little this matters

Re: Schema.org in RDF ...

2011-06-09 Thread Giovanni Tummarello
Ireland, Europe Tel. +353 91 495730 http://linkeddata.deri.ie/ http://sw-app.org/about.html On 9 Jun 2011, at 09:54, Giovanni Tummarello wrote: my2c: I would seriously advise against using triples with http://schema.rdfs.org . That would be totally and entirely validating their claim

Re: Schema.org in RDF ...

2011-06-11 Thread Giovanni Tummarello
My sincere congratulations, I had somehow overlooked the level of detail needed here. The choices are pragmatic and, in my personal opinion, having talked directly at SemTech with a lot of people involved in this, should serve the community as well as possible. Will you be posting this as a

Re: Squaring the HTTP-range-14 circle

2011-06-16 Thread Giovanni Tummarello
Hi Tim, documents per se (à la HTTP 200 response) on the web are less and less relevant as opposed to the conceptual entities that are represented by these documents and held e.g. as DB records inside CMSs, social networks etc. E.g. a social network is about people; those are the

Re: Semantic Web Challenge 2011 CfP and Billion Triple Challenge 2011 Data Set published.

2011-06-17 Thread Giovanni Tummarello
This year, the Billion Triple Challenge data set consists of 2 billion triples. The dataset was crawled during May/June 2011 using a random sample of URIs from the BTC 2010 dataset as seed URIs. Lots of thanks to Andreas Harth for all his effort put into crawling the web to compile this

Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]

2011-06-19 Thread Giovanni Tummarello
particular confusion is so destructive. Unlike the dogs-vs-bitches case, the difference between the document and its topic, the thing, is that one is ABOUT the other. This is not simply a matter of ignoring some Could it be exactly the other way around? that documents and things described in

Re: ANN: Sudoc bibliographic and authority data

2011-07-09 Thread Giovanni Tummarello
Hi Nicolas, it's getting into Sindice indeed, quite politely (e.g. 1 request every 5 secs). We'll monitor speed and completeness. If you think it's OK for us to crawl faster, please say so via a robots.txt directive or just say so

Re: ANN: Sudoc bibliographic and authority data

2011-07-10 Thread Giovanni Tummarello
) channels for data publication over the web, which serve different goals. Maybe we need to better articulate the practices and expectations, though... Cheers, Antoine Hi Giovanni, On 09/07/2011 23:10, Giovanni Tummarello wrote: Hi Nicolas, it's getting into Sindice indeed - Yes, I
