Thanks for getting back to me, Jérôme.

Would you suggest I jump into the Tokenizer? Would we need to differentiate between indexing, summaries, and/or anchors (as Google claims to do)? Should I target 0.7.2 or 0.8-dev?

As an aside, perhaps we should add the modified date field (as NutchWax and others do).

Alex
But since there is no specification for this, you should probably support the most widely used markers:
* <!-- robots content="none" -->
* <noindex>
* <!-- googleon ... --> <!-- googleoff ... -->
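
For what it's worth, here is a minimal sketch of how two of those markers might be handled before the text reaches the indexer, assuming a regex pre-pass over the raw HTML is acceptable. The names NoIndexStripper and strip are hypothetical and not part of Nutch. The robots comment is left out because nothing above says how its excluded region ends, and the googleoff/googleon arguments are matched loosely since they are elided above.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a Nutch class: removes regions marked as
// "do not index" from raw HTML before further processing.
final class NoIndexStripper {

    // <noindex> ... </noindex>, case-insensitive, may span lines.
    private static final Pattern NOINDEX_TAG = Pattern.compile(
        "<noindex\\b[^>]*>.*?</noindex\\s*>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // <!-- googleoff ... --> up to the next <!-- googleon ... -->.
    // Whatever follows the keyword inside the comment is accepted as-is.
    private static final Pattern GOOGLE_OFF_ON = Pattern.compile(
        "<!--\\s*googleoff\\b.*?-->.*?<!--\\s*googleon\\b.*?-->",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    private NoIndexStripper() {}

    // Replaces each excluded region with a single space so that the
    // words on either side of it do not run together.
    static String strip(String html) {
        String out = NOINDEX_TAG.matcher(html).replaceAll(" ");
        return GOOGLE_OFF_ON.matcher(out).replaceAll(" ");
    }
}
```

This drops the content for every purpose at once. Treating indexing, summaries, and anchors differently would need the excluded ranges to be recorded rather than deleted, which this sketch does not attempt.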
-- 55.67N 12.588E CCC7 D19D D107 F079 2F3D BF97 8443 DB5A 6DB8 9CE1 --
