Hello,

Msquared wrote:
> I'm trying to work out if Piggy Bank allows me to add metadata to a
> website (or at least to the Piggy Bank data that is saved for a website).

The short answer is no, not right now.

> I really like metadata, but sometimes I know more about a website than is
> published in its own metadata, either because the author simply didn't put
> it in, or because it's information that only I know, or it's information
> that only I care about.
> 
> For example, let's say that I know two websites are related, but neither
> of them have any RDF data that indicates as much.  Could I add the data to
> Piggy Bank's repository on my system?

It's nice to see you looking for ways to expand on your use of Piggy 
Bank.  At a broad level, we're also interested in the publication of 
metadata.  If you have something to say about the way two sites relate, 
it would be great for you to publish it.

Piggy Bank is intended to help extract metadata from existing content.
If you publish your metadata somewhere everybody else can see it, they
can also download it into their instances of Piggy Bank.
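
As a rough sketch of what publishing such a statement could look like,
the Python below writes one RDF statement per direction, in N-Triples,
linking two sites.  The site URLs, the function names, and the choice of
rdfs:seeAlso as the predicate are assumptions made for illustration;
none of this is part of Piggy Bank itself.

```python
# Hypothetical example: the URLs and the predicate are illustrative choices.
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def ntriple(subject: str, predicate: str, obj: str) -> str:
    """Format one RDF statement whose three terms are all IRIs as an N-Triples line."""
    for term in (subject, predicate, obj):
        # These characters may not appear unescaped inside an N-Triples IRI.
        if any(c in term for c in '<>" {}|\\^`'):
            raise ValueError(f"not a plain IRI: {term!r}")
    return f"<{subject}> <{predicate}> <{obj}> ."


def related_sites(site_a: str, site_b: str) -> list:
    """State the relation in both directions, since rdfs:seeAlso is not symmetric."""
    return [
        ntriple(site_a, RDFS_SEE_ALSO, site_b),
        ntriple(site_b, RDFS_SEE_ALSO, site_a),
    ]


if __name__ == "__main__":
    for line in related_sites("http://site-a.example.org/",
                              "http://site-b.example.org/"):
        print(line)
```

Saving the printed lines to a file on a public web server would be one
way to make the statement available to other people.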

You could write your own scraper that generates some of this metadata
for the one or two sites you're visiting, but that's a one-off solution
to the general problem of adding metadata that you know but that the
site you're scraping doesn't publish.

> I realise there are possibly two steps involved: the creation of the
> metadata (which I expect may be outside of the scope of what Piggy Bank
> does), and the storage of the metadata.
> 
> Perhaps a screen scraper could ask the user for metadata directly, rather
> than scraping it from the page?

As you noted, creating metadata wholesale is not really in scope for PB
at the moment.  You could certainly write a scraper that prompts the
user with pop-ups if you felt so inclined.

> Can Piggy Bank also look up metadata in a third-party repository?
> Metadata doesn't have to actually be embedded into the page itself, but
> could theoretically be sprawled out across the web, right?

Due to security restrictions, no.  If you're at simile.mit.edu and you
want to look up metadata hosted at metadata.example.org, your user agent
will rightly prevent a scraper from communicating with a foreign
domain.  That could be solved at other levels, such as deeper within the
add-on, but not within the scope of scrapers.  This is something of a
weakness, since you are quite correct about the decentralized nature of
metadata, and it would be one way to help with the shortage of metadata
you brought up earlier.
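
To make the restriction concrete, here is a small Python sketch of the
kind of same-origin comparison a browser applies: two URLs are treated
as the same origin only if scheme, host, and port all match.  It is an
approximation for illustration, not the browser's or Piggy Bank's actual
code, and the function names are made up.

```python
from urllib.parse import urlsplit

# Ports implied by a scheme when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Reduce a URL to its (scheme, host, port) origin triple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(page_url: str, target_url: str) -> bool:
    """True only if both URLs share scheme, host, and port."""
    return origin(page_url) == origin(target_url)


if __name__ == "__main__":
    page = "http://simile.mit.edu/piggy-bank/"
    # A different host, so a scraper running on `page` would be refused.
    print(same_origin(page, "http://metadata.example.org/data.rdf"))
    # Same scheme and host, with the default port spelled out.
    print(same_origin(page, "http://simile.mit.edu:80/other"))
```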

-- 
Ryan Lee                  [EMAIL PROTECTED]
MIT CSAIL Research Staff  http://simile.mit.edu/
http://people.csail.mit.edu/ryanlee/