I have now gone one step further and automatically loaded the MetaDean feeds into my Drupal site. I used the following SQL statement:
    INSERT INTO feed ( title, url, refresh, [timestamp], attributes, link, description )
    SELECT metadean_nodes.node_name, metadean_nodes.node_feeds,
           "900" AS Expr1, "0" AS Expr2, "metadean" AS Expr3,
           metadean_nodes.node_url, metadean_nodes.node_description
    FROM metadean_nodes
    WHERE metadean_nodes.node_feeds IS NOT NULL

(Note: I used Access as my SQL front end, so this might require some tweaking to work directly from MySQL -- for instance, the bracketed [timestamp] would become backquoted, and MySQL normally expects single-quoted string literals.)

Of the 42 feeds in the metadean_data.sql file that had been sent, 12 had node_feeds that were non-NULL. One of those was not really an XML feed, and I deleted it ahead of time. Of the remaining 11, two (www.politicalpunk.com and Bayarea) failed when Drupal tried to load them. I have deleted those two, and the remaining nine are now updating regularly in my metadean bundle. You can check this bundle out at http://ahynes1.homeip.net:8180/drupal/index.php?q=import/bundle/9

Hopefully, this provides some useful information that we can look at and think about vis-a-vis feed gathering. Currently, I am subscribed to 46 different feeds on my Drupal site. Several of them are test feeds and not particularly useful. These feeds have retrieved 1241 different items so far. I don't have a good feel for how many items a day I get, since I am still in the process of getting things set up.

From this, I would like to try to address, at least in part, some of the questions that ?!ng has raised: What determines the set of article pointers that each site will cache? How does content "bubble up"? Will parent nodes have to poll all of their child nodes to look for content to promote?

My view is that each node chooses what sites it will cache, and what content it will bubble up, according to whatever parameters and tools are available. Essentially, then, each site becomes an editor or recommender of certain feed items. Other sites may choose to subscribe to any particular site's aggregation. For example, if I were running a small site in Connecticut, I might decide to subscribe to some 'official' aggregation, or I might decide that Neil's method of aggregation is better than the official one and subscribe to that instead. This keeps things much more decentralised, giving power to the individual nodes and creating chances for better aggregation techniques to be discovered.
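To make this a bit more concrete, here is a minimal sketch (Python, with entirely hypothetical item fields and scoring rules -- this is not actual MetaDean or Drupal code) of how one node might decide which cached items to bubble up into its own feed:

    # Minimal sketch only: each node scores its cached feed items by
    # its own editorial rules and promotes the best ones.  The fields
    # (title, link, votes) are hypothetical, not real schema columns.

    def bubble_up(cached_items, max_items=10):
        """Pick the items this node wants to promote to its own feed."""
        def score(item):
            # Each node plugs in whatever editorial policy it likes:
            # reader votes, keyword matches, freshness, trusted sources...
            return item.get("votes", 0) + 2 * ("dean" in item["title"].lower())

        ranked = sorted(cached_items, key=score, reverse=True)
        return ranked[:max_items]

    if __name__ == "__main__":
        cache = [
            {"title": "Dean in Iowa", "link": "http://example.org/1", "votes": 5},
            {"title": "Local meetup notes", "link": "http://example.org/2", "votes": 9},
            {"title": "Dean fundraising record", "link": "http://example.org/3", "votes": 3},
        ]
        for item in bubble_up(cache, max_items=2):
            print(item["title"], item["link"])

The point is that score() is where each node's editorial judgement would live; subscribing to another site's aggregation amounts to trusting that site's scoring.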
These latter parts are, of course, my humble opinion, and I expect that others might have strong opinions on these subjects. I look forward to hearing them.

Aldon

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Joshua Koenig
Sent: Thursday, July 17, 2003 8:11 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; Benjamin Krokower; [EMAIL PROTECTED]
Subject: [developers] MetaDean Development

MetaDean

Thinking about the MetaDean element of this project, I'm realizing that I don't have a lot of time in the next few days/week to get my hands dirty with the code. However, there are other people who've signed on to do some work on this, and who knows who else might be excited by the project. While I don't have time to get in there and code, I do have some time to help facilitate others.

Below you'll find my assessments and recommendations as to how to proceed. This is by no means a definitive list of TODOs, or even anything anyone should feel obliged to do. There may be a better way around all of this, but these are my suggestions as to how we might get started. If there's interest, email back and ask questions on the developers list. We can have an IRC session if need be to work out all the concepts.

Everyone who wants to work on this should read the MetaDean page in the wiki: http://www.hack4dean.org/phpwiki/index.php?MetaDean Also, if you're not already, you should sign up for the developers list. There are instructions on how to do this on the hack4dean.org homepage.

# What we have:

1) An SQL database schema
2) Some data for the nodes table
3) A basic web form for loading more data into the nodes table

All of these are attached to this email as a tarball.

# What I think we want to do:

0) Review my SQL. I'm not a very advanced user. There may be improvements to be made.

1) Create some mechanism to start watching the known DeanSpace:
- A script that loads all Node Feed URLs from metadean_nodes, spiders them, and stores the results in another table (e.g. metadean_feeds, which needs to be created). A rough sketch of this appears at the bottom of this mail.
- A script that can spider a big list of existing Dean sites and see which are currently producing RSS and which are not. For those that are, the RSS URLs should be put into the metadean_nodes table and the feed cached. Those that aren't should be flagged so we can contact them and offer tools. (Also sketched at the bottom of this mail.)
- Hack at Drupal so that the "phone home" feature can also send updates to MetaDean.

2) Start populating the metadean_talent table:
- Hack at Drupal's registration form to add fields and an opt-in box which will also send user data into MetaDean.
- Create a non-Drupal form to do the same.
- Hack at Drupal to build a module which will list all registered users and syndicate this list (or rather, a list of those who've opted in) as RSS, in such a way that MetaDean can periodically spider Drupal Dean sites for more talent.

3) Start trying to do things with cached RSS and known sites/talent:
- Experimentation is probably good for starters.
- Find a way to see what news items, similar links, or keywords are popping up in numerous places.
- Start mapping the Dean Network.

Make sense to anyone? Let me know!

-josh

------------------------
Politics is the art of controlling your environment.
Participate! Elect Howard Dean President in 2004!
http://www.outlandishjosh.com/politics/dean/
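As a starting point for the first script in item 1, here is a rough sketch in Python. The metadean_feeds schema is only a suggestion (that table hasn't been created yet), the MySQL connection details are placeholders, and it leans on the third-party feedparser library for the actual RSS parsing:

    # Sketch only: spider every known Node Feed and cache its items.
    # Assumes a MySQL database named "metadean"; the metadean_feeds
    # schema below is a proposal, not an agreed design.

    import MySQLdb
    import feedparser

    SCHEMA = """
    CREATE TABLE IF NOT EXISTS metadean_feeds (
        feed_url    VARCHAR(255),   -- which node feed the item came from
        item_title  VARCHAR(255),
        item_link   VARCHAR(255),
        item_desc   TEXT,
        fetched_at  TIMESTAMP
    )
    """

    def spider_known_nodes(conn):
        cur = conn.cursor()
        cur.execute(SCHEMA)
        cur.execute("SELECT node_feeds FROM metadean_nodes "
                    "WHERE node_feeds IS NOT NULL")
        for (feed_url,) in cur.fetchall():
            parsed = feedparser.parse(feed_url)
            for entry in parsed.entries:
                cur.execute(
                    "INSERT INTO metadean_feeds "
                    "(feed_url, item_title, item_link, item_desc, fetched_at) "
                    "VALUES (%s, %s, %s, %s, NOW())",
                    (feed_url,
                     entry.get("title", ""),
                     entry.get("link", ""),
                     entry.get("summary", "")))
        conn.commit()

    if __name__ == "__main__":
        # Connection parameters are placeholders.
        conn = MySQLdb.connect(host="localhost", user="metadean",
                               passwd="secret", db="metadean")
        spider_known_nodes(conn)

Running something like this periodically from cron would keep the cache reasonably fresh.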

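And for the second script in item 1, a sketch of the RSS detection pass over a list of candidate sites. The heuristics here are guesses: look for an RSS <link> tag on the homepage, then fall back to a couple of common feed paths ("node/feed" is a guess at the Drupal default):

    # Sketch only: check which candidate sites are producing RSS.
    # Sites that are can go into metadean_nodes; the rest get flagged
    # so we can contact them and offer tools.

    import re
    import urllib.request

    # Feed paths worth guessing when the homepage doesn't advertise one.
    COMMON_FEED_PATHS = ["node/feed", "rss.xml", "index.rdf"]

    # Naive: assumes the type attribute comes before href in the tag.
    RSS_LINK_RE = re.compile(
        r'<link[^>]+type="application/rss\+xml"[^>]+href="([^"]+)"', re.I)

    def fetch(url):
        try:
            with urllib.request.urlopen(url, timeout=15) as resp:
                return resp.read().decode("utf-8", errors="replace")
        except Exception:
            return None

    def find_rss(site_url):
        """Return the site's RSS URL, or None if we can't find one."""
        html = fetch(site_url)
        if html:
            match = RSS_LINK_RE.search(html)
            if match:
                return match.group(1)
        for path in COMMON_FEED_PATHS:
            candidate = site_url.rstrip("/") + "/" + path
            body = fetch(candidate)
            if body and ("<rss" in body or "<rdf" in body.lower()):
                return candidate
        return None

    if __name__ == "__main__":
        for site in ["http://www.example-dean-site.org/"]:  # placeholder list
            feed = find_rss(site)
            if feed:
                print(site, "-> RSS at", feed)   # candidate for metadean_nodes
            else:
                print(site, "-> no RSS; flag for outreach")

Sites where find_rss() returns None are the ones to flag for outreach.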