Hello all,

I'd like to ask for your collective help and input on a project of mine.  I have 
developed a newsbot that collects, categorizes and ranks news stories: 
http://memigo.com/  Memigo was originally conceived (more than a year ago) as a 
collector of news (more on that later) and, more importantly for this list, as a way to 
get news I'd like to read on the road on my Palm (I am a road warrior after all).

A few words about memigo before I go on: memigo was built to mimic the way I browsed: 
I'd hit Slashdot, Metafilter and a few other meta-sites, click on some links that 
looked interesting and ignore others.  If a story I had originally ignored came up on 
more than one meta-site, I'd click on it.  I would also click on stories 
based on their authors (I'd follow a News.com article for example, more than I would a 
Register article).  Memigo does that: it tracks who's pointing to each article (much 
like Blogdex or Daypop), but it also does a couple more things: it points to 
discussion boards on each link (e.g. Slashdot's) and it asks memigo users to rank each 
article:  if you like the article, you can rate it on a 3-pt scale (+, ++, +++).  
Memigo aggregates rankings for the article, its author site and any sites that point 
to the article externally.  In other words it builds a "web of trust".  Memigo also 
ranks articles collaboratively (much like Amazon Recommendations), but we'd need a few 
more users for that to be statistically meaningful.  There's more info about the site 
at memigo.com/help
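
To make the "web of trust" idea concrete, here is a minimal sketch of how ratings 
could be rolled up to the article, its author site and the sites pointing to it.  It 
is an illustration only, not memigo's actual code: the function name, the data layout 
and the 1/2/3 point values for +, ++, +++ are all assumptions.

```python
from collections import defaultdict

# Assumed mapping of the 3-pt scale to points; memigo's real weights may differ.
SCALE = {"+": 1, "++": 2, "+++": 3}


def aggregate(ratings, author_site, referrers):
    """Sum ratings per article, then credit the same points to the article's
    author site and to every site that pointed to the article.

    ratings     -- iterable of (article_url, mark) pairs, mark in SCALE
    author_site -- dict mapping article_url to the site that published it
    referrers   -- dict mapping article_url to the sites linking to it
    """
    article_score = defaultdict(int)
    site_score = defaultdict(int)
    for url, mark in ratings:
        points = SCALE[mark]
        article_score[url] += points
        site_score[author_site[url]] += points
        for site in referrers.get(url, ()):
            site_score[site] += points
    return dict(article_score), dict(site_score)
```

With this layout, a site that keeps pointing to well-rated articles accumulates a 
higher score itself, which is one simple way the trust could propagate.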

That's the end of my pitch, I promise :-)  Here's where I need y'all's help: memigo is 
pretty smart when it comes to screen-scraping: it can tailor its spiders to each site.  
One feature I have added recently is to have memigo grab the printer-friendly version 
of each article.  That version should be better suited to PDAs (thereby allowing me to 
fulfill my original goal), correct?  I now serve the top-ranked articles in their 
"lite" versions (if available), here: http://memigo.com/now 

The page above fits nicely on Avantgo and Plucker, but there are a few gotchas:
  * It seems to me that some sites switch to their full version if the referrer of 
their "printer-friendly" version is not the full version.  Workarounds?
  * New York Times.  Need I say more?  How can I (never mind whether I should) get 
around their restrictions?
  * I am thinking it may be better if, instead of looking for the printer version of 
some sites, memigo hit their Palm index versions directly and matched them to the full 
versions.  However, I don't know how to discover those (as AvantGo hides the Channel 
URLs, as I found out just recently).
  * Memigo could customize pages for each user via a simple GET argument (not 
implemented yet; more on this below).  What do you think are meaningful customizations 
for Plucker clients?
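
On the referrer question in the first bullet, one workaround to try is sending the 
full article's URL as the Referer header when fetching the printer-friendly page.  The 
sketch below only builds such a request with the standard urllib.request module; the 
function name and the User-Agent string are made up for illustration, and whether any 
particular site honors the header (or permits this in its terms of use) is untested.

```python
import urllib.request


def build_print_request(print_url, full_url, user_agent="memigo-sketch/0.1"):
    """Build (but do not send) a request for the printer-friendly page that
    names the full article as its referrer."""
    return urllib.request.Request(
        print_url,
        headers={"Referer": full_url, "User-Agent": user_agent},
    )
```

Passing the returned object to urllib.request.urlopen() would perform the actual 
fetch.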

At this point I am developing memigo.com/now for Plucker (as I am sick of AvantGo's 
limitations); it's not quite done yet, as you can see, because I first want to address 
the above issues.  But here's a rough roadmap of sorts:
  * Customized feeds.  Memigo already generates a strong random number that's tied to 
each user/password combo.  Memigo Now should be able to produce a page customized for 
each user if passed this number as a GET argument.  That's an easy addition, once I 
have a better idea what to customize for each user.  I also think the same idea can be 
used for customized XML/RSS feeds.
  * Pre-built Plucker DBs.  I am not sure about this one.  Memigo is 100% Python 
(yeah! I kept that last one for the end) and could integrate nicely with the Plucker 
code, when I get around to figuring it out :-)  However, I don't know whether there's 
enough added value in this, much less whether it's legally/ethically advisable.
  * Since memigo is also Python, I could probably easily restrict the Memigo Now feed 
to pages that are Plucker-izable.  Is there a quick way to determine this?
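
For the customized-feeds item above, here is a rough sketch of the token-in-a-GET- 
argument idea.  Everything named here is hypothetical: the "u" argument name, the 
helper functions and the in-memory token table are placeholders, and the token is 
simply drawn at random with the standard secrets module rather than derived the way 
memigo actually does it.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def new_feed_token():
    """Return an unguessable, URL-safe token to store alongside a user."""
    return secrets.token_urlsafe(16)


def feed_url(base, token):
    """Build the per-user feed URL, passing the token as the 'u' argument."""
    return base + "?" + urlencode({"u": token})


def user_for_request(url, tokens):
    """Return the user whose token appears in the URL's 'u' argument,
    or None when the argument is missing or unknown."""
    values = parse_qs(urlparse(url).query).get("u", [])
    return tokens.get(values[0]) if values else None
```

A request without a recognized token would fall back to the generic page, so the 
same URL scheme could serve both the default and the customized feeds.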

I apologize for the long-winded post and my butting in on your list.  If there's 
enough interest I will create a forum on memigo (yes, it does comments too) to discuss 
your ideas and feedback, so as to keep the conversation away from this list.  This is 
my pet project and I'd really appreciate any thoughtful feedback.

Thanks in advance,


Costas

_______________________________________________
plucker-dev mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-dev
