Re: What is the current thinking about how to tie a concept or tag to a webpage via LinkedData?
On Fri, 2009-07-24 at 20:15 -0500, Peter DeVries wrote:
> I was wondering what the current thinking is on ontologies that tie a concept or tag to a web page?

You might want to look at http://commontag.org

It is made for the purpose of saying what a certain page, and certain terms on the page, are about, using Linking Open Data & Freebase as a vocabulary of possible meanings. It is kept very simple so that it can be used as RDFa. More tools that natively support it are coming out in the future.

bye
andraz

> I was thinking of creating links between species entities and web pages that are about that species. For instance, the species entity Puma concolor v6n7p, represented with this URI:
>
>   http://www.taxonconcept.org/ses/v6n7p
>
> and a web page that is primarily about that species, like this EOL page:
>
>   http://www.eol.org/pages/311910
>
> In the RDF for the species entity I describe it this way, since the page is primarily about this species:
>
>   <foaf:isPrimaryTopicOf rdf:resource="http://www.eol.org/pages/311910"/>
>
> But I was looking through various bookmarking ontologies and thought it might be useful to create RDF that ties the concept and a web page together. There seem to be two ways to do this: one via something like delicious (RDF representation), or via the Annotea ontology. So I marked up this RDFa demo page:
>
>   http://www.taxonconcept.org/bookmarks/v6n7p_1000
>
> I thought I should ask this list about the current best practices for creating these links.
>
> Thanks in advance :-)
>
> - Pete
>
> ---
> Pete DeVries, Department of Entomology, University of Wisconsin - Madison, 445 Russell Laboratories, 1630 Linden Drive, Madison, WI 53706
> Email: pdevr...@wisc.edu
> GeoSpecies Knowledge Base / About the GeoSpecies Knowledge Base

-- Andraz Tori, CTO Zemanta Ltd, New York, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
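A minimal Turtle sketch of how these pieces could fit together, assuming the Common Tag namespace http://commontag.org/ns#; the blank-node tag and its label are illustrative, not taken from the actual demo page:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ctag: <http://commontag.org/ns#> .

# The species entity declares the EOL page as its primary topic page
<http://www.taxonconcept.org/ses/v6n7p>
    foaf:isPrimaryTopicOf <http://www.eol.org/pages/311910> .

# The same relation seen from the page side, as a Common Tag annotation:
# the page is tagged, and the tag "means" the species entity
<http://www.eol.org/pages/311910>
    ctag:tagged [
        a ctag:AuthorTag ;
        ctag:label "Puma concolor" ;
        ctag:means <http://www.taxonconcept.org/ses/v6n7p>
    ] .
```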
Re: Common Tag - semantic tagging convention
On Tue, 2009-06-23 at 12:37 +0100, Pierre-Antoine Champin wrote:
> On 18/06/2009 16:46, Alexandre Passant wrote:
>> I just replied to an e-mail from Toby on the topic on the commontag ml. Since the archives are not yet public, let me repost my point about the mappings here. A Tag in Common Tag is a Tag as seen in Newman's ontology.
>
> My understanding of Newman's ontology is that the URI tag:cheese (for example) represents every occurrence of the string "cheese" used as a tag. This cannot work with Common Tag, since "cheese" can be used:
> - on resource R1 at date D1 as an AuthorTag
> - on resource R2 at date D2 as a ReaderTag
>
> This practically forces us to define two distinct Tag resources, both with the label "cheese", but with different types and taggingDates. Of course, nothing prevents us from using the same resource, but in that case we would not know anymore which date and which tag type correspond to which resource... So the design of Common Tag implies that Tag resources should not be reused across different tagging actions, which IMHO makes them quite different from Newman's Tags (and close to Taggings, in fact).

Yes, exactly, that is the case.

> pa

-- Andraz Tori, CTO Zemanta Ltd, New York, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
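The "cheese" situation can be sketched in Turtle; the resources and dates are made up for illustration:

```turtle
@prefix ctag: <http://commontag.org/ns#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Two tagging actions with the same label need two distinct Tag
# resources, because the type and taggingDate differ per action.
<http://example.org/R1> ctag:tagged [
    a ctag:AuthorTag ;
    ctag:label "cheese" ;
    ctag:taggingDate "2009-06-01"^^xsd:date
] .

<http://example.org/R2> ctag:tagged [
    a ctag:ReaderTag ;
    ctag:label "cheese" ;
    ctag:taggingDate "2009-06-15"^^xsd:date
] .
```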
Re: Common Tag - semantic tagging convention
Hi François and others,

I really like the turn this debate has taken: practical considerations about what is useful and what is not!

Having gone with the Common Tag standard through many iterations, I think I can explain some of the choices taken. First, we were aware of other existing possibilities for semantic tagging. The idea was to approach this from the other side: from what companies are already doing with semantic tags and what they wanted to do next. So we were asking ourselves: can this be made interoperable? We tried to create a specification that would be minimal, yet useful for the practical aims of already existing applications and services.

The parties involved didn't see an immediate need to specify exactly who placed the tag, but we did feel the need for the ability to specify how the tag was created (inferred by machine, fully by human, etc.), so we included that. The date of tagging came into the game mostly as an example of how the standard can be extended if the practical need arises. I don't think any of the people involved actually publish the tagging date along with the tags right now. And I agree with Peter Mika that this can be quite an important piece of information.

[I can imagine one reason why taggingDate can be technically very important: a vocabulary entity (in DBpedia, for example; I think less so in Freebase) can change meaning through time. While this was definitely not the reason to add the tagging date to the specification, it can be justified that way.]

As Jamie said, this is the basic skeleton that was needed; now we'll see how it is used and adjust accordingly through time.

bye
andraz

On Fri, 2009-06-12 at 19:13 +0200, François Dongier wrote:
> 2009/6/12 Peter Mika <pm...@yahoo-inc.com>:
>> Maybe others can comment as well, but I do think it's [taggingDate] an important piece of information, e.g. to determine recently popular tags.
> In my very humble opinion, **who** tagged a resource with ctag T at time t could also be very useful information to store. I'm expecting that in the near future we will have rich user profiles, based on sets of semantic tags (semantic tagclouds, if you prefer). Communities of interest, not just individual people, could also be defined in terms of such semantic clouds. I think lots of interesting computations could be done over that kind of information: personalised reading recommendations obviously, but also relativisation of the popularity of a tagset (and I agree the timestamp is useful for that) to a particular community of users. For this sort of thing, don't we need a taggedBy property?

-- Andraz Tori, CTO Zemanta Ltd, New York, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
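A taggedBy extension along the lines François suggests might look like this in Turtle; ex:taggedBy is an invented property, not part of the Common Tag specification:

```turtle
@prefix ctag: <http://commontag.org/ns#> .
@prefix ex:   <http://example.org/ns#> .

<http://example.org/some-article> ctag:tagged [
    a ctag:ReaderTag ;
    ctag:label "semantic web" ;
    ctag:means <http://dbpedia.org/resource/Semantic_Web> ;
    ex:taggedBy <http://example.org/people/francois>   # hypothetical "who"
] .
```

With something like this in place, a user profile (or a community tagcloud) is just the aggregation of all taggings sharing the same ex:taggedBy value.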
Re: Common Tag - semantic tagging convention
On Thu, 2009-06-11 at 16:39 -0400, Kingsley Idehen wrote:
> Can you point me to a live example of a Tag with a de-referencable HTTP URI? I need that to comment :-)

http://blog.commontag.org
Extraction: http://www.w3.org/2007/08/pyRdfa/extract?uri=http://blog.commontag.org&format=pretty-xml&warnings=false&parser=lax&space-preserve=true&submit=Go!

or: http://faviki.com/topic/Semantic_web
Extraction: http://www.w3.org/2007/08/pyRdfa/extract?uri=http%3A%2F%2Ffaviki.com%2Ftopic%2FSemantic_web&format=pretty-xml&warnings=false&parser=lax&space-preserve=true&submit=Go!

or: http://zigtag.com/tag/Web%20Design/1570830
Extraction: http://www.w3.org/2007/08/pyRdfa/extract?uri=http://zigtag.com/tag/Web%20Design/1570830&format=pretty-xml&warnings=false&parser=lax&space-preserve=true&submit=Go!

[hmm, something is wrong here, probably zigtag has some glitch currently]

-- Andraz Tori, CTO Zemanta Ltd, New York, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
Re: Common Tag - semantic tagging convention
On Thu, 2009-06-11 at 16:39 -0400, Kingsley Idehen wrote:
> Can you point me to a live example of a Tag with a de-referencable HTTP URI? I need that to comment :-)

Ok, ZigTag now works correctly too [they fixed it fast!]:

http://zigtag.com/tag/Web%20Design/1570830
Extraction: http://www.w3.org/2007/08/pyRdfa/extract?uri=http://zigtag.com/tag/Web%20Design/1570830&format=pretty-xml&warnings=false&parser=lax&space-preserve=true&submit=Go!

Also worth noting: there are proper mappings to other efforts at tagging ontologies: http://commontag.org/mappings

If anyone has any ideas about what could be missing, let us know! Also important: the initial nucleus of organizations welcomes any new participants that would support Common Tag meaningfully inside their services... there's a lot of space on that diagram that needs to be filled up... :)

-- Andraz Tori, CTO Zemanta Ltd, New York, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
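For a flavour of what such mappings express, here is an illustrative (not normative; the authoritative mappings are on http://commontag.org/mappings) Turtle sketch relating Common Tag to Newman's tag ontology, in line with the observation earlier in this thread that Common Tag Tags behave like Newman's Taggings:

```turtle
@prefix ctag: <http://commontag.org/ns#> .
@prefix tags: <http://www.holygoat.co.uk/owl/redwood/0.1/tags/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Illustrative only: a Common Tag "Tag" plays the role of one
# tagging action (a Newman "Tagging"), not of a reusable Tag
ctag:Tag         rdfs:subClassOf    tags:Tagging .
ctag:taggingDate rdfs:subPropertyOf tags:taggedOn .
```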
Event: Freebase+Zemanta meetup in San Francisco
Hi, I am not sure if such an announcement belongs on this list. Both companies are dealing with LOD, and the primary purpose of the event is to see what people can come up with when mashing up data and services close to LOD. So here it is:

Freebase+Zemanta developer meetup in San Francisco
http://freebasezemanta.eventbrite.com/
Saturday, March 21st (tomorrow), 12pm - 5pm at the Freebase offices.

There are going to be presentations about both the Zemanta and Freebase APIs, how to mash them together, a discussion about a simple semantic tagging standard, and hopefully some hacking time to come up with prototypes of cool new stuff.

LOD aficionados from the Bay Area, you are welcome!

-- Andraz Tori, CTO Zemanta Ltd, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
Re: The Guardian Open Platform and Data Store
On Tue, 2009-03-10 at 21:00 +0000, John Goodwin wrote:
> Some/all of you probably heard that the Guardian released their new Open Platform/Data Store today [1]. Wouldn't it be great to see this as part of the linked data web? [2]

Yeah, we were building one of the demos for them, and this would have helped immensely (luckily we had our own API at our disposal to get linked data out of their articles :). We actually created mappings between Guardian tags and Linked Data entities, if anyone wants to explore that path further... (they use a controlled vocabulary).

bye
andraz

> John
> [1] http://simonwillison.net/2009/Mar/10/openplatform/
> [2] http://johngoodwin225.wordpress.com/2009/03/10/the-guardian-open-platform-and-data-store/

-- Andraz Tori, CTO Zemanta Ltd, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
RE: The Guardian Open Platform and Data Store
On Tue, 2009-03-10 at 22:16 +0100, Georgi Kobilarov wrote:
> Hi Andraz,
>> We actually created mappings between Guardian tags and Linked Data entities, if anyone wants to explore that path further... (they use controlled vocabulary).
> very interesting, can you tell a little bit more about that? what kind of controlled vocabulary is that, which sources do you map to, and how?

Well, you can get a full set of tags from their API (around 7k of them). We took that, plus a bit more info that we were provided with, and reconciled those 'controlled tags' with DBpedia & Freebase. I am not sure if we also used MusicBrainz and Semantic CrunchBase; I can look it up tomorrow. So basically, links between those 7k Guardian tags and the corresponding LOD entities were established where possible. But there are some messy details. What I can do is provide this mapping we created to anyone interested (tomorrow).

We then took a different route to create a demo with their API (http://labs.zemanta.com/guardian), so we didn't use the mappings. How did we do it? For each tag in the vocabulary, we looked up Guardian stories tagged with it in our aggregator (the Guardian puts those tags into their RSS, so they land in our engine). This provided us with background knowledge about each tag (= what kind of stories it was used for). Then we disambiguated the tags (with that background knowledge) into LOD by calling the Zemanta API. However, there are some messy details to take care of if anyone picks it up from here.

bye
andraz

> Cheers, Georgi
> -- Georgi Kobilarov, Freie Universität Berlin, www.georgikobilarov.com

-- Andraz Tori, CTO Zemanta Ltd, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
Re: The Guardian Open Platform and Data Store
On Tue, 2009-03-10 at 17:35 -0400, Kingsley Idehen wrote:
> Andraz, so you are saying that Zemanta payloads carry Guardian data, and all the data items have de-referencable URIs (of course from DBpedia and other LOD spaces where matches exist)?

What I am saying is that we created mappings. We have them listed in a file with two columns: Guardian tags in one, LOD URIs in the other. We then took a different route for creating the demo, so we never actually integrated this anywhere.

-- Andraz Tori, CTO Zemanta Ltd, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
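Expressed as RDF rather than a raw two-column file, the mapping rows would conceptually look like this (Turtle; the ex: properties and the sample rows are illustrative, not taken from the actual file):

```turtle
@prefix ex: <http://example.org/ns#> .

# one row per mapping: Guardian tag label -> LOD entity URI
[] ex:guardianTag "Semantic web" ;
   ex:lodEntity   <http://dbpedia.org/resource/Semantic_Web> .

[] ex:guardianTag "Radiohead" ;
   ex:lodEntity   <http://dbpedia.org/resource/Radiohead> .
```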
RE: Can we lower the LD entry cost please (part 1)?
On Sun, 2009-02-08 at 15:56 +0100, Georgi Kobilarov wrote:
> Hi Andraz, I disagree; those two goals are not completely different, in the sense that different groups should address them separately. I had a delightful conversation with Andreas Harth of SWSE about that a week ago in Berlin. Search engines can't clean up other people's mess. It's even harmful if they try. Data providers need incentives to provide clean data. See the Google example: Google started indexing the web, and the webpages with clean markup and site structure showed up in their search. And Google's search provided real benefit to end-users.

Oh, I agree it's good for the web that publishers provide data that is as clean as possible. What I am saying is that publishers have the data and might be convinced to provide it, but requiring them to also provide advanced search technology is counterproductive. On the other hand, the major problem of the semantic web is the lack of _incentives_ for publishers to publish data in clean semantic form. I am working on one of the initiatives to change that, and it will hopefully see the light of day soon.

> Hence web publishers started to do SEO (search engine optimization), so that their stuff shows up in Google as well (or ranked higher). If we don't reward the Linked Data publishers who provide clean data and penalize those who don't, there will never be an incentive to do it right.

Yes, exactly; I 100% agree. If you want publishers to provide good data, provide incentives for them to do so. (Most) publishers only care about traffic or better ad targeting, so make sure they get one or the other. I am not seeing many initiatives in that direction. Instead of putting requirements on the publishers, we should be working on creating incentives for them. And demanding that Google reward them won't work; the semweb community needs to create its own way of rewarding them.
bye
andraz

> Cheers, Georgi
> -- Georgi Kobilarov, Freie Universität Berlin, www.georgikobilarov.com
>
> -----Original Message-----
> From: public-lod-requ...@w3.org [mailto:public-lod-requ...@w3.org] On Behalf Of Andraz Tori
> Sent: Saturday, February 07, 2009 4:02 PM
> To: Hugh Glaser
> Cc: public-lod@w3.org
> Subject: Re: Can we lower the LD entry cost please (part 1)?
>
> Hi Hugh, I think you are mixing two completely different goals. Why can't one set of people provide the data while another set of people provides search technologies over that data? It takes two completely different technologies, processes, etc. BTW: an easy way to search is also to write a meaningful sentence or paragraph (using the phrase/entity/concept) and put it into Zemanta or Calais. You will usually get properly disambiguated URIs back.
>
> bye
> andraz
>
> On Sat, 2009-02-07 at 13:23 +0000, Hugh Glaser wrote:
> My proposal: *We should not permit any site to be a member of the Linked Data cloud if it does not provide a simple way of finding URIs from natural language identifiers.*
>
> Rationale: One aspect of our Linking Data (not to mention our Linking Open Data) world is that we want people to link to our data - that is, I have published some stuff about something, with a URI, and I want people to be able to use that URI. So my question to you, the publisher, is: how easy is it for me to find the URI your users want? My experience suggests it is not always very easy. What is required at the minimum, I suggest, is a text search, so that if I have a (boring string version of a) name that refers in my mind to something, I can hope to find an (exciting Linked Data) URI of that thing. I call this a projection from the Web to the Semantic Web. rdfs:label or equivalent usually provides the other one. At the risk of being seen as critical of the amazing efforts of all my colleagues (if not also myself), this is rarely an easy thing to do.
> Some recent experiences:
> - OpenCalais: as in my previous message on this list, I tried hard to find a URI for Tim, but failed.
> - dbtune: saw a Twine message about dbtune, trundled over there, and tried to find a URI for Telemann, but failed.
> - dbpedia: wanted Tim again. After clicking on a few web pages, none of which seemed to provide a search facility, I resorted to my usual method: look it up in Wikipedia and then hack the URI and hope it works in dbpedia. (Sorry to name specific sites, guys, but I needed a few examples. And I am only asking for a little more, so that the fruits of your amazing labours can be more widely appreciated!)
> - wordnet: [2] below
>
> So I have access to Linked Data sites that I know (or at least strongly suspect) have URIs I might want, but I can't find them. How on earth do we expect your average punter to join this world? What have I missed? Searching, such as Sindice: well yes, but should I really have to go off
RE: Granular dereferencing ( prop by prop ) using REST + LinkedData; Ideas?
On Tue, 2008-12-30 at 12:29 +0100, Georgi Kobilarov wrote:
> Hi Aldo, how dynamic is ex:dynamic1? Does the value change for every request? Can't you use caching? I'd say you should try to optimize your backend instead of breaking conventions. Think of pre-computing all RDF documents and then serving static files, updating them periodically, for example.

This really isn't a useful suggestion for most cases of dynamic content or content analysis, especially when either intensive processing needs to be done (NLP, etc.) or third-party services have to be called.

The approach from MusicBrainz is interesting: they basically provide a way to express how deep a structure should be returned, by simply adding another part (a number) to the main URL.

bye
andraz

> Best, Georgi
> -- Georgi Kobilarov, Freie Universität Berlin, www.georgikobilarov.com
>
> -----Original Message-----
> From: public-lod-requ...@w3.org [mailto:public-lod-requ...@w3.org] On Behalf Of Aldo Bucchi
> Sent: Monday, December 29, 2008 4:51 PM
> To: public-lod@w3.org
> Subject: Granular dereferencing ( prop by prop ) using REST + LinkedData; Ideas?
>
> Hi All, I am in the process of LODing a dataset in which certain properties are generated on the fly (props derived from aggregate calculations over the dataset, remote calls, etc). I would like to let the clients choose which of these expensive properties they need, on demand and on a granular level. For example, let's say I am interested in knowing more about resource http://ex.com/a. Per LD conventions, dereferencing http://ex.com/a (via 303) returns:
>
>   <http://ex.com/a> a ex:Thing ;
>       rdfs:label "a sample dynamic resource" ;
>       ex:dynamic1 45567 .
>
> The problem is that the value for ex:dynamic1 is very expensive to compute. Therefore, I would like to partition the document in such a way that the client can ask for the property in a lazy, deferred manner (a second call in the future). The same is true for dynamic2, dynamic3, dynamic4, etc.
> All should be retrievable independently and on demand.
>
> * I am aware that this can be achieved by extending SPARQL in some toolkits. But I need LOD.
> * I am also aware that most solutions require us to break URI opacity by stuffing the subject and predicate into the URI for a doc.
> * Finally, seeAlso is too broad, as it doesn't convey the information I need.
>
> Anyone came up with a clean pattern for this? Ideas? Something as simple as:
>
>   GET http://x.com/sp?s={subject}&p={predicate} -> required RDF
>
> works for me... but... if possible, I would like to break conventions in a conventional manner ;)
>
> Best, A
> -- Aldo Bucchi, U N I V R Z, Office: +56 2 795 4532, Mobile: +56 9 7623 8653, skype: aldo.bucchi, http://www.univrz.com/, http://aldobucchi.com

-- Andraz Tori, CTO Zemanta Ltd, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
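One pattern that stays within plain Linked Data dereferencing: keep the cheap triples in the main document and point at per-property documents, with a small hint describing what each linked document carries. A sketch in Turtle; ex:describesProperty is an invented hint property, meant to address Aldo's point that bare seeAlso is too broad:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://ex.com/ns#> .

# Main document for <http://ex.com/a>: cheap triples only
<http://ex.com/a> a ex:Thing ;
    rdfs:label "a sample dynamic resource" ;
    rdfs:seeAlso <http://ex.com/a/dynamic1> .

# Hint telling clients what the linked document will contain,
# so they can defer the expensive fetch until they need it
<http://ex.com/a/dynamic1> ex:describesProperty ex:dynamic1 .

# Dereferencing <http://ex.com/a/dynamic1> later would yield:
#   <http://ex.com/a> ex:dynamic1 45567 .
```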
DMoz categories as LOD ?
Hi, one of the things we are doing is categorization into the DMoz hierarchy. While it was easy to create sameAs links for recognized entities to dbPedia & others, I was dazzled by the lack of DMoz URIs where RDF data would reside representing the categories and their relationships.

So what we did in API responses was to create a dummy, non-dereferencable URI and quickly describe basic properties:

<rdf:Description rdf:about="http://d.zemanta.com/cats/dmoz/Top/Computers/Security">
  <rdf:type rdf:resource="http://s.zemanta.com/ns#Target"/>
  <z:targetType rdf:resource="http://s.zemanta.com/targets#category"/>
  <z:title>Top/Computers/Security</z:title>
  <z:categorization rdf:resource="http://s.zemanta.com/cat/dmoz"/>
</rdf:Description>

However, this is naturally sub-optimal and non-LOD compatible. It does not allow developers to walk the hierarchy and similar. As we are not trying to be the center of the universe, and thus gladly link to LOD resources as first-class citizens, I am wondering if there might be a better solution.

So my question is: does anyone know of a dereferencable-URI version of DMoz data that could be used in API responses to denote specific categories?

(and thanks to Kingsley Idehen for reminding me this is probably the right list to ask this question)

-- Andraz Tori, CTO Zemanta Ltd, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
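In the absence of official DMoz URIs, one alternative would be to publish the category tree as SKOS under our own namespace, which would at least let developers walk the hierarchy. A sketch, reusing the dummy URIs from above:

```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://d.zemanta.com/cats/dmoz/Top/Computers/Security>
    a skos:Concept ;
    skos:prefLabel "Security" ;
    skos:broader  <http://d.zemanta.com/cats/dmoz/Top/Computers> ;
    skos:inScheme <http://d.zemanta.com/cats/dmoz> .
```

skos:broader / skos:narrower give clients the hierarchy traversal that the flat z:title string cannot.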
Re: semantic technology solutions for multilinguall sites
Hi, an interesting case of using the semantic web to provide some cross-language functionality is the Faviki bookmarking site. More about how they use it is described here:
http://faviki.wordpress.com/2008/09/23/faviki-is-featured-on-google-code/

Just a few days ago a W3C SW case study about it was published:
http://faviki.wordpress.com/2008/12/10/w3c-sw-case-study-by-faviki-semantic-tags/

Probably this does not map directly to your case, but it can give you some ideas about how you can leverage LOD, and especially dbPedia, for cross-language solutions.

bye
Andraz Tori

On Thu, 2008-12-11 at 18:35 -0800, Semantics-ProjectParadigm wrote:
> Sorry for typos in previous message subject line.
>
> Dear all, I am writing proposals for setting up web portals to access digital repositories on a large number of knowledge domains. The point is that the same information needs to be made available in a large number of languages. Are there any computer programs, research programs or projects out there that deal with providing semantic web solutions for information that needs to be made available in a large number of different languages? My first impression was to look within the European Union, but there must be more out there. Another issue inherent to the same question is that there are few Content Management Systems out there that provide the option of creating basically the same site in many languages.
> Milton Ponson
> GSM: +297 747 8280
> Rainbow Warriors Core Foundation, PO Box 1154, Oranjestad, Aruba, Dutch Caribbean, www.rainbowwarriors.net
> Project Paradigm: A structured approach to bringing the tools for sustainable development to all stakeholders worldwide - www.projectparadigm.info
> NGO-Opensource: Creating ICT tools for NGOs worldwide for Project Paradigm - www.ngo-opensource.org
> MetaPortal: providing online access to web sites and repositories of data and information for sustainable development - www.metaportal.info
> SemanticWebSoftware, part of NGO-Opensource to enable SW technologies in the Metaportal project - www.semanticwebsoftware.info

-- Andraz Tori, CTO Zemanta Ltd, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
Zemanta API: from text to LOD
Hi, I am Andraz Tori, CTO at Zemanta. This is my first post to this list. I read the discussion about commercial announcements just a day ago, so I hope this announcement is relevant enough, albeit it comes from a for-profit company.

I'd like to announce that Zemanta today launched a semantic API that is able to take plain text and disambiguate the most important entities found to their meanings in Linking Open Data (currently dbPedia, MusicBrainz, Semantic CrunchBase and Freebase). As far as we know, this is the first such undertaking of this scale.

The API is free to use up to 10k calls per day (the default limit is 1k; an email needs to be sent to raise it). Responses can be in XML, JSON or RDF/XML. Developer info can be found at http://developer.zemanta.com and less technical explanations at http://www.zemanta.com/api

I am very interested in whether this will make any new types of mashups happen. I am also interested in feedback on the RDF/XML format/ontology. I see us a bit like a stargate portal from unstructured text into the LOD, enabling LOD to be used in a broader range of situations.

Comments welcome :)

-- Andraz Tori, CTO Zemanta Ltd, London, Ljubljana www.zemanta.com mail: [EMAIL PROTECTED] tel: +386 41 515 767 twitter: andraz, skype: minmax_test
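To give a feel for the output, here is a sketch (in Turtle, for brevity) of the kind of disambiguation result the API returns; the z:Recognition class name and the subject URI are illustrative, the owl:sameAs links into LOD are the core idea (see http://developer.zemanta.com for the actual ontology):

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix z:   <http://s.zemanta.com/ns#> .

# Illustrative: one entity recognized in the submitted text,
# linked to its meanings in Linking Open Data
<http://d.zemanta.com/objects/example>
    a z:Recognition ;                               # hypothetical class
    z:title "Berlin" ;
    owl:sameAs <http://dbpedia.org/resource/Berlin> .
```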