Hello,

Msquared wrote:
> I'm trying to work out if Piggy Bank allows me to add metadata to a
> website (or at least to the Piggy Bank data that is saved for a website).

The short answer is no, not right now.

> I really like metadata, but sometimes I know more about a website than is
> published in its own metadata, either because the author simply didn't put
> it in, or because it's information that only I know, or it's information
> that only I care about.
>
> For example, let's say that I know two websites are related, but neither
> of them have any RDF data that indicates as much. Could I add the data to
> Piggy Bank's repository on my system?

It's nice to see you looking for ways to expand on your use of Piggy
Bank. At a broad level, we're also interested in the publication of
metadata. If you have something to say about the way two sites relate,
it would be great for you to publish it. Piggy Bank is intended to help
extract metadata from existing stuff. If you publish your metadata
somewhere everybody else can see it, they can also download it into
their instances of Piggy Bank.

You could write your own scraper that generates some of this metadata
based on the one or two sites you're visiting, but that's a one-off
solution to the general issue of adding arbitrary metadata that you know
but that the site you're scraping doesn't publish.

> I realise there are possibly two steps involved: the creation of the
> metadata (which I expect may be outside of the scope of what Piggy Bank
> does), and the storage of the metadata.
>
> Perhaps a screen scraper could ask the user for metadata directly, rather
> than scraping it from the page?

As you noted, creating metadata wholesale is not really in scope for PB
at the moment. You could certainly write a scraper that did pop-ups if
you felt so inclined.

> Can Piggy Bank also look up metadata in a third-party repository?
> Metadata doesn't have to actually be embedded into the page itself, but
> could theoretically be sprawled out across the web, right?

Due to security restrictions, no.
If you're at simile.mit.edu and you want to look up metadata hosted at
metadata.example.org, your user agent is rightly going to block a
scraper from communicating with a foreign domain. There are other
levels at which one might try to solve that issue, such as deeper
within the add-on, but not within the scope of scrapers.

This is something of a weakness, since you are quite correct about the
decentralized nature of metadata - and it would be one way to assist
with the metadata dearth issue you brought up earlier.

-- 
Ryan Lee                 [EMAIL PROTECTED]
MIT CSAIL Research Staff http://simile.mit.edu/
http://people.csail.mit.edu/ryanlee/
