2011/6/20 Alexandre Patry <a...@nlpfu.com>:
> Maybe you do not need to use NLP for your task. Recipe websites often render
> all recipes using similar HTML structures, so it can be simpler to write one
> small program per website that extracts the recipe title from the HTML
> DOM.
>
> I do not know which websites you want to extract recipes from, but if they
> use the hRecipe micro-format[1], the same extraction code will work on all
> of them.

+1
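
To make the hRecipe suggestion concrete, here is a rough sketch using only
the Python standard library. It relies on two hRecipe conventions: the recipe
container carries class "hrecipe" and the recipe name carries class "fn". The
names HRecipeTitleParser and extract_recipe_titles are made up for this
example, and real pages may need a more forgiving parser (lxml,
BeautifulSoup) if their markup is badly broken.

```python
from html.parser import HTMLParser

# Elements with no closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HRecipeTitleParser(HTMLParser):
    """Collect the text of each class="fn" element nested in class="hrecipe"."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._recipe_depth = 0  # > 0 while inside an hrecipe element
        self._fn_depth = 0      # > 0 while inside an fn element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._fn_depth:
            self._fn_depth += 1
        elif self._recipe_depth and "fn" in classes:
            self._fn_depth = 1
            self._buffer = []
        if self._recipe_depth:
            self._recipe_depth += 1
        elif "hrecipe" in classes:
            self._recipe_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._fn_depth:
            self._fn_depth -= 1
            if self._fn_depth == 0:
                # Join the collected text and collapse runs of whitespace.
                self.titles.append(" ".join("".join(self._buffer).split()))
        if self._recipe_depth:
            self._recipe_depth -= 1

    def handle_data(self, data):
        if self._fn_depth:
            self._buffer.append(data)


def extract_recipe_titles(html):
    """Return the hRecipe titles found in an HTML string (hypothetical helper)."""
    parser = HRecipeTitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles
```

Feed it the page source as a string and you get back a list of titles, one
per hRecipe block. An "fn" class outside an "hrecipe" container (for example
in an hCard) is ignored on purpose. This assumes reasonably well-formed HTML
with balanced tags.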

You should also have a look at http://scraperwiki.com/

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
