Hello all,

I'd like to ask for your collective help and input on a project of mine. I have developed a newsbot that collects, categorizes and ranks news stories: http://memigo.com/

Memigo was originally conceived (more than a year ago) as a collector of news (more on that later) and more importantly for this list, as a way to get news I'd like to read on the road on my Palm (I am a road warrior after all).

A few words about memigo before I go on. Memigo was built to mimic the way I browsed: I'd hit Slashdot, Metafilter and a few other meta-sites, click on some links that looked interesting, and ignore others. If a story I had originally ignored came up on more than one meta-site, I'd click on it. I would also click on stories based on their authors (I'd follow a News.com article, for example, more readily than a Register article).

Memigo does that: it tracks who's pointing to each article (much like Blogdex or Daypop), but it also does a couple more things. It points to discussion boards on each link (e.g. Slashdot's), and it asks memigo users to rank each article: if you like the article, you can rate it on a 3-point scale (+, ++, +++). Memigo aggregates rankings for the article, its author site and any sites that point to the article externally. In other words, it builds a "web of trust". Memigo also ranks articles collaboratively (much like Amazon Recommendations), but we'd need a few more users for that to be statistically meaningful. There's more info about the site at memigo.com/help

That's the end of my pitch, I promise :-)

Here's where I need y'all's help. Memigo is pretty smart when it comes to screen-scraping: it can tailor its spiders to each site. One feature I added recently has memigo grab the printer-friendly version of each article. That version should be better suited to PDAs (thereby allowing me to fulfill my original goal), correct? I now serve the top-ranked articles in their "lite" versions (if available) here: http://memigo.com/now

The page above fits nicely on AvantGo and Plucker, but there are a few gotchas:

* It seems to me that some sites switch to their full version if the referrer of their "printer-friendly" version is not the full version. Workarounds?
* New York Times. Need I say more? How can I (never mind whether I should) get around their restrictions?
* I am thinking it may be better if, instead of looking for the printer version of some sites, memigo hit their Palm index versions directly and matched them to the full versions. However, I don't know how to discover those (AvantGo hides the channel URLs, as I found out just recently).
* Memigo can customize pages for each user with a simple GET (not yet; more on this below). What do you think are meaningful customizations for Plucker clients?

At this point I am developing memigo.com/now for Plucker (as I am sick of AvantGo's limitations). It's not quite done yet, as you can see, because I first want to address the above issues. But here's a rough roadmap of sorts:

* Customized feeds. Memigo already generates a strong random number that's tied to each user/password combo. Memigo Now should be able to produce a page customized for each user if passed this number as a GET argument. That's an easy addition, once I have a better idea of what to customize for each user. I also think the same idea can be used for customized XML/RSS feeds.
* Pre-built Plucker DBs. I am not sure about this one. Memigo is 100% Python (yeah! I kept that last one for the end) and could integrate nicely with the Plucker code, when I get around to figuring it out :-) However, I don't know if there's enough added value in this, much less whether it's legally/ethically advisable.
* Since memigo is also Python, I could probably easily restrict the Memigo Now feed to pages that are Plucker-izable. Is there a quick way to determine this?

I apologize for the long-winded post and for butting in on your list. If there's enough interest I will create a forum on memigo (yes, it does comments too) to discuss your ideas and feedback, so as to keep the conversation away from this list. This is my pet project and I'd really appreciate any thoughtful feedback.

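
To make the customized-feed item in the roadmap a bit more concrete, here is a minimal sketch of how a per-user number passed as a GET argument could select a tailored page. Everything in it is an assumption for illustration: the parameter name "key", the in-memory TOKENS and PREFS tables, the "sources" and "limit" preferences, and the use of Python's secrets module to make the number. None of it is taken from memigo's actual code.

```python
import secrets
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory stores; a real site would keep these in its database.
TOKENS = {}  # token -> username
PREFS = {}   # username -> preference dict (assumed keys: "sources", "limit")


def issue_token(username):
    """Create an unguessable token for a user and remember who owns it."""
    token = secrets.token_urlsafe(16)
    TOKENS[token] = username
    return token


def feed_for(url, articles, default_limit=10):
    """Pick the articles for a request URL, customized if it carries a known token."""
    query = parse_qs(urlparse(url).query)
    token = query.get("key", [None])[0]      # "key" is an assumed parameter name
    prefs = PREFS.get(TOKENS.get(token), {})  # unknown or missing token -> defaults
    wanted = prefs.get("sources")             # None means "all sources"
    limit = prefs.get("limit", default_limit)
    picked = [a for a in articles if wanted is None or a["source"] in wanted]
    picked.sort(key=lambda a: a["rank"], reverse=True)  # highest-ranked first
    return picked[:limit]
```

A Plucker client would then simply be pointed at something like http://memigo.com/now?key=<token>, with no login step. A request with no token, or one the site does not recognize, falls back to the default page. The same lookup could sit in front of an XML/RSS feed.
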
Thanks in advance,
Costas

_______________________________________________
plucker-dev mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-dev
