Jon Crump wrote:
> Two things about PB and screen-scraping:
>
> 1. I used solvent to help me write a screen scraper for full record
> descriptions in the Intute internet resources database
> <http://www.intute.ac.uk/>. I saved it to my PB and activated it. It works
> as expected, except that records are scraped by BOTH my scraper AND the
> generic web-page scraper. The results from my scraper are aggregated with
> the results from the generic scraper drawing its data from the page's
> <meta /> tags. Is this the expected behavior? Is there any way to prevent
> it?

At some point we introduced a 'generatedBy' property into scraping results. It acts as a facet, so you should be able to use it to select results according to which scraper generated them. I'm not sure that feature is in your build of PB, but if it is, it is one workaround for the issue, aside from disabling the generic scraper.

> 2. The dc:description property in the n3 description of my js scraper is
> not displayed in my PB. Is this a fresnel issue? Is this the expected
> behavior?

This is expected Fresnel behavior, and an interesting one we've observed with the current Longwell+Fresnel architecture: some of the data becomes 'black matter' if a lens doesn't incorporate a given property. The data exists, but you never see it. It's probably not desirable behavior, but it is expected for the moment.

> PS. Also still interested in facades (re: 12/10 Longwell Facade question)
> if anyone knows how I might create a facade using properties rather than
> types, I'd love to know.

David is best suited to answer this question...

-- 
Ryan Lee [EMAIL PROTECTED]
MIT CSAIL Research Staff
+1.617.253.5327
http://simile.mit.edu/
