On 2/27/07, Stefano Mazzocchi <[EMAIL PROTECTED]> wrote:
> Moreover, we mine those server logs (with tools like Referee and with
> custom-made scripts) to generate reports that are useful for us and for
> others to understand the evolution of the project.
>
> In aggregated, they respect the privacy of the individuals contributing
> to the logs and we therefore make the results public
>
> http://simile.mit.edu/history/
>
> but for exhibit referrers we do more: we crawl back the referrers,
> understand what views are used, what data is used, create an exhibit of
> all the exhibits (called 'metaexhibit'), we also fetch the data, RDFize
> it with Babel, store in a local triple store and generate an atom feed
> of the new usages of exhibit.

Might be obvious, but "data" above means the data points mentioned. Your
exhibit *content* stays with you and your visitors.

> One could say that since the exhibit users did not protect their pages
> with HTTP authentication, we should not treat this page different from
> any other web page. But since it might be that this data is considered
> private because it's not linked to the general web, we feel that doing
> so would be abusive and for that reason we do not show the metaexhibit
> to the public.

Agreed. Some hosting facilities don't even support setting up HTTP auth.

> Unfortunately, while some exhibits are made to be private, others are
> not and we are sure that such a 'metaexhibit' would be of great use for
> others for inspiration, example, curiosity or data integration.
>
> There are several ways we thought about enabling this:
>
> 1) ask the major search engines if they have the referring URL in their
> databases. If so, it means that this page has been linked from the
> public web and for that reason it is reasonably safe to assume this is
> the case. The pro of this approach is that it can be fully automated and
> with ease, the con is that one could link a page by mistake and then it
> would end up public (but it would be in the google cache anyway).

-1. Spidering URLs that people ask about, but that are not yet in the
database, is one of the first things I would do as a search engine host;
if we ask, we leak data that was dark web matter until we did.

> 2) suggest people to embed machine-readable licensing information in
> their pages so that we can understand what kind of activity we are
> allowed to do. This is the exhibit-equivalent of a robots.txt file for
> web spiders but also has the advantage of adding licensing information
> to the data, so that mixing could be done legally.

+1. This is my favourite idea, and one I'd make use of for my own
exhibits, once we have it.

The way I see it, an exhibit can be shared and reused by others, in at
least two ways: data and presentation. The two are typically not related
much, at least when I create Exhibits. I mostly work with data sets that
are not my own, but I would usually gladly share my Exhibit page template
with someone else who wanted to borrow ideas, code, layout or inspiration
from how I made it look and work the way it does. I occasionally use
graphics shared under a different license, making me unable to place
those too in the public domain.

It might sound a bit hairy, but I would find it useful if we devised a
method for tagging those three properties (data, layout and graphics)
with a license name, and perhaps a single tag for the basic case where
all are the same. To me it makes most sense sticking those tags in the
URL we load Exhibit with, for instance:

<script src="http://simile.mit.edu/exhibit/api/exhibit-api.js?license=data:proprietary,layout=bsd,gfx=cc-by-nc/2.0"></script>

for my exhibit that visualizes the live readings of 192 outdoor
temperature measuring stations spread across Sweden, data c/o
www.temperatur.nu, presentation by me (based on the Simile Presidents
layout, which I hope is BSD or public domain?), background photo by
Weston Renoud under the CC attribution-noncommercial license,
http://creativecommons.org/licenses/by-nc/2.0/

The exhibit itself is available here (and demonstrates our lack of a
numeric range facet ;-):

http://exhibit.ecmanaut.googlepages.com/temperatur.nu.html

A perhaps more typical case might be "license=pd" for a completely
free-for-all exhibit.

Given all of the above, we could easily set up an automated exhibit of
exhibits available for reuse and inspiration, under terms you are
comfortable with. I'd love that, and am certain that it would boost
Exhibit adoption / spread most significantly.
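
To give a feel for how little a crawler would need in order to read such
a tag, here is a minimal sketch in Python. It is only an illustration
under assumptions: the syntax is the one from the example URL above
(comma-separated pairs, with either ':' or '=' between property and
license name, and a bare value such as "pd" covering everything); the
function name parse_license_tags and the property list are made up for
this sketch and are not part of Exhibit.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical property names, taken from the example URL in this mail.
KNOWN_PROPERTIES = ("data", "layout", "gfx")


def parse_license_tags(script_src):
    """Map each property (data, layout, gfx) to its license shorthand.

    Assumed syntax, not an agreed standard:
      ?license=<prop>:<name>,<prop>=<name>,...   one tag per property
      ?license=<name>                            one tag for all properties
    Returns an empty dict when no license parameter is present.
    """
    query = urlsplit(script_src).query
    value = parse_qs(query).get("license", [""])[0]
    tags = {}
    for part in value.split(","):
        match = re.match(r"^(\w+)[:=](.+)$", part)
        if match:
            # Explicit pair such as "layout=bsd" or "data:proprietary".
            tags[match.group(1)] = match.group(2)
        elif part:
            # Bare shorthand such as "pd": applies to every property.
            tags.update({prop: part for prop in KNOWN_PROPERTIES})
    return tags


if __name__ == "__main__":
    src = ("http://simile.mit.edu/exhibit/api/exhibit-api.js"
           "?license=data:proprietary,layout=bsd,gfx=cc-by-nc/2.0")
    print(parse_license_tags(src))
```

A lookup table from shorthand to the full license text or URL would sit
on top of this; the sketch deliberately stops at splitting the tag.
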
Copying and modifying the code of others is a lot easier than reading
partial docs in a wiki, and while the example crop on the Simile site is
a good start, it is very small, and the examples don't say much about how
you are allowed to reuse them.

> 3) let people that want to show their exhibits write a list on our wiki
> and write some script that automatically extract that data and generate
> exhibits out of it. our use of semantic mediawiki helps a lot in this
> regard.

+0; won't hurt, but too much work (for exhibit authors) to gather much
momentum, by my guess. (I haven't added any of mine, anyway.)

> #2 is the more complex and requires us to agree on modeling the
> licensing information in machine-processable form. Of course, Creative
> Commons comes to mind, but that doesn't contain the notion of 'private
> use', so at least it should be extended.
>
> Comments?

A set of human-writable license shorthands, and a table somewhere of what
they mean (the exhibit of exhibits suggested above would be a good place
to find and learn more about them), is my preference. I'm sure
"proprietary" might not be a very good word, but I'd hate to see
long-winded URLs go there, unless made optional.
"data=proprietary;http://www.temperatur.nu/temperatur-1-99_1.html" might
be another legal syntax, I guess, or data=proprietary(url), to get it
nicely contained.

Tossing up ideas,

-- 
 / Johan Sundström, http://ecmanaut.blogspot.com/
