On 18/08/2006, at 8:31 AM, Geoff Bowers wrote:

> 2) it's not computationally trivial to work out what is a good and not
> so good post

Just FYI, Chip Temm has an interesting comment on our blog about using Bayesian algorithms (often used in spam filters) to automatically categorise content. Here is a link to the article he wrote in CFDJ about this a while back:

http://au.sys-con.com/read/154232.htm

To make this work we need a large sample of posts in various categories. To this end I wrote a CF script this evening that visited the 468 feeds aggregated by the Goog and built a distinct list of dc:subject tag values on the feed items (see my blog comment for the list).

I figure that if we can map the various subjects used on these blogs for CF, Flash, Flex etc. (there typically seem to be about 4-8 variations for each product) to products, we should be able to visit the original articles and assign each to the correct sample (CF, Flash, Trash etc.) based on the subjects allocated by the author at post time.

With this data (basically a word frequency table) we should be able to look at any article or web page and, with some modest number crunching, get a pretty good indication of how relevant it would be to a particular product.

It will be fun to see if it works anyway...

______________
Robin Hilliard

On 18/08/2006, at 8:31 AM, Geoff Bowers wrote:

>
> Dale et al,
>
> Dale Fraser wrote:
>> I recently dropped all my favourite feeds in Google and put in
>> Fullasagoog Coldfusion Blend instead.
>>
>> Wow, am I disappointed. I'm not sure what's going on, but I'm
>> wasting my time here. I think someone at Fullasagoog should do
>> something about it.
>> Here's the current top 9 Coldfusion Blend Entries
>
> First thing to say is generally I agree. I'm not a great fan of "off
> topic" posts myself but they clearly don't annoy me as much as they
> annoy some.
>
> There needs to be a bit of a reality check:
> 1) anecdotally -- about an equal proportion of people *want* to see
> non-technical posts from CF insiders. They feel it humanises the
> community and so on.
> 2) it's not computationally trivial to work out what is a good and not
> so good post
> 3) not everyone has a category that is relevant -- if I only take CF
> posts from a blogger do I miss the posts they might have on JS, Flash,
> Flex, SQL etc? Many bloggers have many technical interests. CF itself
> has many satellite subjects that should be of interest to CF
> developers.
>
> I have plans for the next generation Goog to provide some degree of
> social interaction to widen the scope for users to be editors and hone
> the relevance of posts. I also have a variety of ideas on how to do
> this computationally.
>
> There are some 500 hand-picked blogs on Fullasagoog. And a waiting
> list of about half that. I review each blog before adding it. I even
> remove some blogs I find to be reliably bad. This is a very subjective
> and time-consuming process. Bloggers' tools change, their posting
> habits change, there are a multitude of human variables associated
> with maintaining a good feed.
>
> I will endeavour to find more time to address the concerns you have
> raised. But in the end, Fullasagoog is not cash flow positive and is
> heavily subsidised by Daemon [1]. It's a bit of a hobby that was built
> to scratch an itch of *mine* several years ago and at the moment I've
> got some sort of St. Vitus dance going on trying to reach all the
> other itches.
>
> -- geoff
> http://www.fullasagoog.com/
>
> [1]: http://www.daemon.com.au/
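
For anyone who wants to play with the word-frequency idea from the top of this message, here is a minimal Python sketch of a naive Bayes style scorer. It is only an illustration, not the CF script mentioned above and not the method from the CFDJ article: the function names, the tiny sample posts and the category labels are all made up, and it assumes equal prior weight for every category plus simple add-one smoothing.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(samples):
    """Build one word-frequency table (a Counter) per category.

    `samples` maps a category name to a list of article texts.
    """
    return {
        category: Counter(word for doc in docs for word in tokenize(doc))
        for category, docs in samples.items()
    }


def score(tables, text):
    """Return a log-likelihood per category for `text`.

    Assumes equal priors for all categories and add-one smoothing,
    so unseen words lower a score without zeroing it out.
    """
    vocabulary = set().union(*tables.values())
    words = tokenize(text)
    scores = {}
    for category, table in tables.items():
        total = sum(table.values())
        scores[category] = sum(
            math.log((table[word] + 1) / (total + len(vocabulary)))
            for word in words
        )
    return scores


def classify(tables, text):
    """Pick the category with the highest log-likelihood."""
    scores = score(tables, text)
    return max(scores, key=scores.get)


# Hypothetical training samples, standing in for real posts grouped
# by the dc:subject values their authors assigned.
samples = {
    "CF": ["cfquery and cfcomponent tips for coldfusion",
           "coldfusion mx cfc tuning"],
    "Flash": ["actionscript movieclip timeline tricks",
              "flash player actionscript animation"],
    "Trash": ["my holiday photos and the weather",
              "what i had for lunch today"],
}

tables = train(samples)
print(classify(tables, "new cfc caching trick in coldfusion"))
```

With a realistic sample (hundreds of posts per category) the same tables could be used to rank an incoming feed item by how strongly it leans towards each product, rather than just picking a single winner.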