Re: Breaking News: GoodRelations data now shows up in Yahoo!
Hi David,

Daniel O'Connor wrote:
> http://goodrelations.doconnor.user.dev.freebaseapps.com/
> Freebase data being rendered as Good Relations (Or Barbie and Ken's Semantic Web Playset)

Thanks for the initiative - very valuable!

> What's the best way to validate this / check it would show up in Yahoo search results?

You can use http://developer.search.yahoo.com/help/objects/product

A few comments on the data:

What you find in Freebase are likely gr:ProductOrServiceModel instances, not offers. So you should first create an instance of gr:ProductOrServiceModel for each product in Freebase. Those define the properties of the model, i.e. what you would usually find in a manufacturer's datasheet:
- description
- image
- EAN/UPC
- weight
etc.

Note that the price is not a feature of the product model, but a property of one specific offer to sell such objects, i.e. a gr:Offering. (I assume there are way more product models in Freebase than those you currently find, and it could be that querying for a price is the reason.)

If you have a business entity and a price, you could also add a gr:Offering etc., as you are doing right now. But note that Mattel will often not sell individual Barbie dolls to end users at the suggested retail price. So the offer must be constrained to resellers. And then you don't have a price...

This is why I would suggest limiting the export to the model data. Those can be linked in the LOD cloud to actual offers, e.g. from BestBuy or from eBay (via OpenLink's new eBay sponger).

So the basic structure should be

a) Model data (Datasheets)

    foo:Barbie1234 a gr:ProductOrServiceModel ;
        rdfs:label "blabla"@en ;
        rdfs:comment "blabla"@en ;
        gr:hasEAN_UCC-13 "1234567890123"^^xsd:string ;
        gr:hasManufacturer foo:Mattel .
    # etc.

    foo:Mattel a gr:BusinessEntity ;
        gr:legalName "Mattel Toys Inc."@en .
    # etc.

You can add the statement that Mattel also offers individuals of that type:

    foo:Offer a gr:Offering ;
        gr:includes foo:SomeBarbie1234s .

    foo:Mattel gr:offers foo:Offer .

    foo:SomeBarbie1234s a gr:ProductOrServicesSomeInstancesPlaceholder ;
        rdfs:label "blabla"@en ;
        rdfs:comment "blabla"@en ;
        gr:hasEAN_UCC-13 "1234567890123"^^xsd:string ;
        gr:hasMakeAndModel foo:Barbie1234 .
    # etc.

But then you should not attach a UnitPriceSpecification.

For your reference, I add a list of properties for gr:Offering, gr:ProductOrServiceModel, and gr:ProductOrServicesSomeInstancesPlaceholder. Also, I recommend the UML diagram at
http://www.ebusiness-unibw.org/wiki/File:Goodrelations-UML-2009-07-18.pdf

gr:Offering
    owl:includes
    owl:hasBusinessFunction
    owl:availableDeliveryMethods
    owl:eligibleCustomerTypes
    owl:includesObject
    owl:availableAtOrFrom
    owl:hasPriceSpecification
    owl:hasWarrantyPromise
    owl:acceptedPaymentMethods
    owl:eligibleRegions
    owl:hasEAN_UCC-13
    owl:hasGTIN-14
    owl:hasStockKeepingUnit
    owl:validFrom
    owl:validThrough

gr:ProductOrServiceModel
  specific: none
  inherited:
    owl:isAccessoryOrSparePartFor
    owl:qualitativeProductOrServiceProperty
    owl:isSimilarTo
    owl:isConsumableFor
    owl:quantitativeProductOrServiceProperty
    owl:hasManufacturer
    owl:datatypeProductOrServiceProperty
    owl:hasEAN_UCC-13
    owl:hasGTIN-14
    owl:hasStockKeepingUnit

gr:ProductOrServicesSomeInstancesPlaceholder
  specific:
    owl:hasInventoryLevel
    owl:hasMakeAndModel
  inherited:
    owl:isAccessoryOrSparePartFor
    owl:qualitativeProductOrServiceProperty
    owl:isSimilarTo
    owl:isConsumableFor
    owl:quantitativeProductOrServiceProperty
    owl:hasManufacturer
    owl:datatypeProductOrServiceProperty
    owl:hasEAN_UCC-13
    owl:hasGTIN-14
    owl:hasStockKeepingUnit

Best
Martin

--
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail: h...@ebusiness-unibw.org
phone: +49-(0)89-6004-4217
fax: +49-(0)89-6004-4620
www: http://www.unibw.de/ebusiness/ (group)
     http://www.heppnetz.de/ (personal)
skype: mfhepp
twitter: mfhepp

Check out GoodRelations for E-Commerce on the Web of Linked Data!

Webcast: http://www.heppnetz.de/projects/goodrelations/webcast/
Recipe for Yahoo SearchMonkey: http://www.ebusiness-unibw.org/wiki/GoodRelations_and_Yahoo_SearchMonkey
Talk at the Semantic Technology Conference 2009: Semantic Web-based E-Commerce: The GoodRelations Ontology
http://www.slideshare.net/mhepp/semantic-webbased-ecommerce-the-goodrelations-ontology-1535287
Overview article on Semantic Universe: http://www.semanticuniverse.com/articles-semantic-web-based-e-commerce-webmasters-get-ready.html
Project page: http://purl.org/goodrelations/
Resources for developers: http://www.ebusiness-unibw.org/wiki/GoodRelations
Tutorial materials: CEC'09 2009 Tutorial: The Web of Data for E-Commerce: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey
http://www.ebusiness-unibw.org/wiki/Web_of_Data_for_E-Commerce_Tutorial_IEEE_CEC%2709
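The advice in this message (export datasheet-level model data only, and keep prices out of the product model) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `ProductModel` class, the `model_to_turtle` function, and the `foo` prefix are hypothetical names chosen here, not part of GoodRelations or Freebase, and the output omits the `@prefix` declarations a complete Turtle file would need.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProductModel:
    """Datasheet-level facts about a product: no price and no seller."""
    local_name: str                      # hypothetical local name, e.g. "Barbie1234"
    label: str
    ean13: Optional[str] = None
    manufacturer: Optional[str] = None   # local name of a gr:BusinessEntity


def escape(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def model_to_turtle(model: ProductModel, prefix: str = "foo") -> str:
    """Serialise one product model as a Turtle block (prefix declarations omitted)."""
    ean = model.ean13
    if ean is not None and not (ean.isascii() and ean.isdigit() and len(ean) == 13):
        raise ValueError("EAN/UCC-13 must be exactly 13 digits")
    lines = [f"{prefix}:{model.local_name} a gr:ProductOrServiceModel"]
    lines.append(f'    rdfs:label "{escape(model.label)}"@en')
    if ean is not None:
        lines.append(f'    gr:hasEAN_UCC-13 "{ean}"^^xsd:string')
    if model.manufacturer is not None:
        lines.append(f"    gr:hasManufacturer {prefix}:{model.manufacturer}")
    # Predicates of one subject are separated by ";" and the block ends with "."
    return " ;\n".join(lines) + " .\n"


if __name__ == "__main__":
    print(model_to_turtle(ProductModel("Barbie1234", "blabla", "1234567890123", "Mattel")))
```

The data class deliberately has no price field, so a price can only enter the export through a separate gr:Offering, which matches the separation described in the message.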
Re: The Power of Virtuoso Sponger Technology
Does Sindice (or any other semantic web search engine) crawl this?

Juan Sequeda, Ph.D Student
Dept. of Computer Sciences
The University of Texas at Austin
www.juansequeda.com
www.semanticwebaustin.org

On Sat, Oct 17, 2009 at 4:24 AM, Martin Hepp (UniBW) h...@ebusiness-unibw.org wrote:
> Dear all:
>
> I just found out that the Virtuoso Sponger technology is even more powerful than I thought.
>
> Briefly: Spongers create rich GoodRelations (and other RDF) meta-data for existing Web pages on-the-fly. Unlike traditional screen-scraping approaches, Spongers reuse public APIs and other techniques, so the data has an unprecedented degree of structure.
>
> Now, this can be directly used in arbitrary queries... by simply using the URI of the *existing* HTML Web page in the FROM clause of a SPARQL query.
>
> Example:
>
> http://www.amazon.com/Semantic-Web-Real-World-Applications-Industry/dp/0387485309
>
> is a Web page in plain HTML offering a book. Amazon does not yet produce GoodRelations meta-data on their pages.
>
> If you go to http://uriburner.com/sparql, paste the URI in the "Default Graph URI" field, and select "Retrieve remote RDF for all missing source graphs", then a query like
>
>     SELECT * WHERE {?s ?p ?o} LIMIT 50
>
> returns a fully-fledged GoodRelations description for that page - as if Amazon were already supporting GoodRelations for each of its 4 million items!
>
> There are spongers for BestBuy, eBay, Zillow, and many other types of resources.
>
> Wow! Congrats to Kingsley and his team!
>
> Best wishes
> Martin Hepp
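As a rough illustration of the "page URI in the FROM clause" idea from the quoted message, the sketch below only builds a request URL for the endpoint named there; it does not send anything. It assumes the endpoint accepts the standard SPARQL Protocol `query` parameter. The function name `build_sparql_url` is hypothetical, and the form's "Retrieve remote RDF" option is deliberately not modelled because its request parameter is not given in the thread.

```python
from urllib.parse import urlencode

ENDPOINT = "http://uriburner.com/sparql"  # endpoint named in the message


def build_sparql_url(page_uri: str, limit: int = 50) -> str:
    """Return a GET URL whose query selects triples FROM the given page URI."""
    # Refuse characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in page_uri for ch in '<>" {}|\\^`'):
        raise ValueError("page_uri contains characters not allowed in an IRI reference")
    query = f"SELECT * FROM <{page_uri}> WHERE {{ ?s ?p ?o }} LIMIT {int(limit)}"
    # urlencode percent-encodes the query text for use in a URL.
    return ENDPOINT + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_sparql_url(
        "http://www.amazon.com/Semantic-Web-Real-World-Applications-Industry/dp/0387485309"
    ))
```

Whether a given endpoint actually fetches and converts the page on such a request depends on its configuration; the thread only states that uriburner.com does so when the retrieval option is selected.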
Re: The Power of Virtuoso Sponger Technology
Juan Sequeda wrote:
> Does Sindice crawl this (or any other semantic web search engines)?

Juan,

Sponger is not about Sindice crawling our proxy URIs. The Web of Linked Data shouldn't be about mass crawling (search engine style) etc... It's really supposed to be about smarter data network traversals triggered by data access requests. Basically: make the pathway on the fly, remember it for future reference, and know when it's obsolete.

If you look at it the other way round, our Sponger has Meta Cartridges that will look up Sindice (via their APIs) for specific data about various entities. It won't seek a complete dump of Sindice etc. The same applies to a plethora of Web 2.0 style services.

We can do smart database queries on the Web by simply meshing fundamental database principles with the inherent sophistication of HTTP :-)

Kingsley

--
Regards,

Kingsley Idehen
Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
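The "make the pathway on the fly, remember it for future reference, and know when it's obsolete" idea can be pictured as an on-demand cache with an expiry rule. The sketch below is not how the Sponger is implemented; `PathwayCache`, its fixed maximum age, and the injected `fetch` and `clock` callables are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class PathwayCache:
    """Fetch data for a URI only when asked, and re-fetch once it is too old."""

    def __init__(self, fetch: Callable[[str], str], max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch              # hypothetical retrieval function
        self._max_age = max_age_seconds  # assumed freshness window
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, uri: str) -> str:
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]                  # remembered pathway is still fresh
        data = self._fetch(uri)              # made on the fly, triggered by the request
        self._entries[uri] = (now, data)     # remembered for future reference
        return data


if __name__ == "__main__":
    cache = PathwayCache(lambda uri: f"description of {uri}", max_age_seconds=60)
    print(cache.get("http://example.org/item/1"))
```

A fixed age limit is the simplest possible obsolescence rule; it stands in here for whatever change detection a real system would use.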
Re: The Power of Virtuoso Sponger Technology
I agree with Georgi. I would like to know what others think about this.

On Sat, Oct 17, 2009 at 11:39 AM, Georgi Kobilarov georgi.kobila...@gmx.de wrote:
> > The Web of Linked Data shouldn't be about mass crawling (search engine style) etc...
>
> It has to be. How would you answer a query like "all offers for a book written by a German author" without crawling the relevant data sets?
>
> Georgi
Re: The Power of Virtuoso Sponger Technology
Georgi Kobilarov wrote:
> > The Web of Linked Data shouldn't be about mass crawling (search engine style) etc...
>
> It has to be. How would you answer a query like "all offers for a book written by a German author" without crawling the relevant data sets?

To qualify my response: it shouldn't be about mass crawling (search engine style) that results in Google or Yahoo! style indexes. It should be about smart walking and indexing that uses HTTP to devise smart cache invalidation schemes, and Linked Data oriented URIs for smart pathways.

The comment "does Sindice index Sponger URIs" is not the answer, just as the Sponger indexing Sindice isn't the answer. Both services can use their data pathways to make newer and better pathways depending on the query at hand.

Basically, "No Mass Dumb Crawling and Indexing" is what I am trying to relay via my comments :-)

If we stick with the traditional search approach, how do we deal with the change sensitivity factor re: "all offers for a book written by a German author"?

Kingsley
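One standard building block for the HTTP-based cache invalidation mentioned in this reply is the conditional request: the client sends back the validators it stored (ETag, Last-Modified) and the server answers 304 Not Modified if the cached copy is still valid. The sketch below only shows the header logic and performs no network access; the function names are hypothetical, and it is not a description of how Virtuoso handles invalidation.

```python
from typing import Dict, Optional


def conditional_headers(etag: Optional[str],
                        last_modified: Optional[str]) -> Dict[str, str]:
    """Build request headers asking the server whether a cached copy is still valid."""
    headers: Dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def resolve(status: int, cached_body: str, new_body: Optional[str]) -> str:
    """Pick the body to use after a conditional GET."""
    if status == 304:
        return cached_body           # server confirmed the cache is current
    if status == 200 and new_body is not None:
        return new_body              # data changed: replace the cached copy
    raise ValueError(f"unexpected response status {status}")


if __name__ == "__main__":
    print(conditional_headers('"abc123"', "Sat, 17 Oct 2009 08:24:00 GMT"))
```

This kind of revalidation lets a client check many cached resources cheaply, which is one way to stay sensitive to change without re-downloading everything.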
Re: The Power of Virtuoso Sponger Technology
Juan Sequeda wrote:
> I agree with Georgi. I would like to know what others think about this.

What do you actually mean by Sindice indexing Sponger proxy URIs? Are you talking about it indexing them in the same manner as it does, say, PingTheSemanticWeb? If so, then you are still thinking in terms of Google / Yahoo! style behavior.

The better way is to work like a DBMS: have a base of data and progressively build it up while remaining sensitive to change. HTTP, Linked Data Objects, SPARQL, and OWL collectively make it possible for the Web of Linked Data to work like a very smart Federated DBMS.

Kingsley
Re: The Power of Virtuoso Sponger Technology
Hi all,

> > The Web of Linked Data shouldn't be about mass crawling (search engine style) etc...
>
> It has to be. How would you answer a query like "all offers for a book written by a German author" without crawling the relevant data sets?

The first question would be: which dataset has this information? Does Amazon have it, or does it need to be linked to other people's datasets where such information can be found? (Which raises the whole question of entity disambiguation, etc.)

In any case, there are multiple ways to end up with more or less the same result. Tell me if I am right, but I think that the current set of related cartridges only gets data from a book URL? So it just converts data about a particular book, for a given URL, using some API (Amazon's in this case).

What about search URLs, using the search APIs of the same services? I can certainly imagine a cartridge that does just this: searching for items and returning the result sets in RDF using some ontologies. You would then use the current cartridge to get all the information about the items you care about in the result set.

One thing is certain: the expressiveness of your queries is bound by the expressiveness of the search API you query. So this is not the answer to all problems.

But one question: is it realistic to think that anyone could query all Amazon and eBay sites (US, CAN, and all the other countries) to convert everything? And if that ends up being the case, how would syncing and maintenance take place?

It really depends on the use cases, but there is much that can be done by leveraging all the APIs in systems such as the Virtuoso Sponger. I think that what you are talking about here will only happen when these services want it to happen.

Thanks,

Take care,

Fred
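The two-step idea described in this message (one cartridge that searches, then the existing per-item cartridge that describes each hit) could look roughly like the sketch below. Both `search` and `describe` are hypothetical callables standing in for cartridges; no real Sponger or retailer API is used or implied.

```python
from typing import Callable, Dict, Iterable, List


def search_then_describe(keywords: str,
                         search: Callable[[str], Iterable[str]],
                         describe: Callable[[str], Dict[str, str]],
                         max_items: int = 10) -> List[Dict[str, str]]:
    """Run a search, then fetch a structured description for each hit."""
    descriptions: List[Dict[str, str]] = []
    for count, item_uri in enumerate(search(keywords)):
        if count >= max_items:
            break                                  # stay within a small request budget
        descriptions.append(describe(item_uri))    # per-item conversion step
    return descriptions


if __name__ == "__main__":
    # Stub cartridges returning made-up example data.
    def fake_search(keywords: str) -> List[str]:
        return [f"http://example.org/item/{n}" for n in range(3)]

    def fake_describe(uri: str) -> Dict[str, str]:
        return {"uri": uri, "type": "gr:Offering"}

    for description in search_then_describe("semantic web", fake_search, fake_describe):
        print(description)
```

As the message notes, what such a pipeline can find is limited by what the underlying search API can express, so the `search` step is the bottleneck, not the conversion.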
Re: The Power of Virtuoso Sponger Technology
With respect to crawling and scraping or sponging or ... trying to guess based on partial fragments of structured information, I can say 3 things:

a) No, we're not doing it at the moment; we are only covering those who chose to publish structured semantics. Some book stuff shows up in Sig.ma, e.g. http://sig.ma/search?q=frank+van+harmelensources=100 (bookfinder, our Jerome digital library installation), but the triples they provide are scarce and don't contribute much. It would take so little for this to improve on their side, I believe.

b) No, we are not religious about this. We have talked about it several times; it might make sense to try to understand as much of the web as possible and index it. Maybe we'll do it in the future for selected fractions of the web to show how it looks.

c) Crawling should be just one means of acquiring the semantic web. In the case of BestBuy or other large retailers, where prices possibly change every day, crawling as a means to emulate a simple call to a web service really does not seem the smart thing to do. Will data providers really support this with data dumps?

cheers
Giovanni

On Sat, Oct 17, 2009 at 3:32 PM, Juan Sequeda juanfeder...@gmail.com wrote:
> But Sindice could at least crawl Amazon. It would be great to use sig.ma to create a meshup with the amazon data.
>
> On Sat, Oct 17, 2009 at 9:28 AM, Martin Hepp (UniBW) h...@ebusiness-unibw.org wrote:
> > I don't think so, because this would require that Sindice crawled the whole regular web and checked the Spongers for each URL (sic!).
Re: The Power of Virtuoso Sponger Technology
Giovanni Tummarello wrote:
> [...]
> c) crawling should be just one mean of acquiring the semantic web. in case of bestbuy or other large retailers where prices change possibly everyday crawling as a mean to emulate a simple.. call to a web service seems really not the smart thing to do. Will data providers really support with data dumps?

Juan, I am hoping that the response above clarifies matters, esp. point c). Crawling the old way is futile when the change sensitivity of a given unit of data is high.

Georgi: even the count of German book authors, the prices of their books, across a plethora of retailers, with a wide range of prices and availability, is very sensitive to change.

Georgi/Juan: Mechanically, there is crawling, but essentially it simply isn't the old style approach (data warehousing) of yore as exemplified by Google, Yahoo!, ASK, and others.

Kingsley