Hi Rufus,

Thanks for your detailed and thoughtful reply.
I think it is worth separating the centralization/decentralization of organizations from that of what they build. The two are reasonably independent. I will limit myself to just what is created.

FreeTable takes a centralized approach to data storage. This is a matter of expediency. A decentralized approach has the advantage of preventing a single point of control. Weighing against this, though, are difficult problems in dealing with data integrity, preventing data loss (what happened to the climate data from 1902?), ensuring system responsiveness, and preventing spam. Most of these are research problems. A centralized approach, on the other hand, can be created relatively easily using well-known techniques.

You are right about the importance of having well-defined goals. Let me try and nail a few things down...

A database in the cloud: FreeTable envisions working with datasets too large and too dynamic to be downloaded and then queried. Instead, the query is sent to FreeTable and FreeTable returns the result.

A programmatic interface: FreeTable seeks to provide a programmatic interface to data, allowing programmers to create a user interface to the data, and the next eBay, Facebook, or Craigslist. (Based on feedback, it seems this goal may be inadequate and FreeTable may need to provide a nice user interface for entering data, but this interface won't be able to compete with the domain-specific interfaces others could build.)

Developer community: FreeTable's community will be software developers seeking to share data. (Again, based on feedback, this community might have to be expanded to people with data to share.)

All but the largest datasets: The notion of what is data spans a huge range, from small classified ad listings to large genomic datasets. FreeTable would like to focus on the lower end of this range. The limits of FreeTable are probably datasets less than 1 Gbyte in size or receiving fewer than 1000 simple queries per second.
This probably covers a majority of datasets.

The Open Database License looks like a good step forward. I have a concern, though, in the context of FreeTable: I don't think it is strong enough. Suppose FreeTable hosts a database of classified ads. A site that displays classified ads could use the FreeTable database and also contribute the ads it receives from users back to FreeTable. Another site, though, could use the FreeTable database but keep the ads it receives to itself. To the user the second site is always better, since it has more ads, and the first site is left with little incentive to contribute ads back to FreeTable, since doing so only helps its competition. In the end the public commons withers away.

I would like to see a license that says that if you use this dataset, then you must contribute back any similar data you gather. I don't know how to word that legally.

thanks,
gordon

_______________________________________________
okfn-discuss mailing list
http://lists.okfn.org/mailman/listinfo/okfn-discuss
