Hi everyone,
Would a parser that collects outlinks from CSS stylesheets be useful to anyone? As far as I can tell, Tika doesn't offer this (it looks like Tika 1.12 parses CSS as plain text; correct me if I'm wrong). Modern CSS often contains "url(...)" references to content needed to style pages properly (e.g. fonts, images).

I have a simple, working, tested "parse-css" plugin that uses http://cssparser.sourceforge.net/ and extracts only outlinks. If that's not something that belongs in Nutch, that's fine; otherwise I'll happily open a pull request.

Thanks,
Joe
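
For anyone unfamiliar with the kind of links meant here, below is a rough sketch of the idea. To be clear, this is not the plugin's code: the plugin relies on cssparser, while this stand-in uses a plain regex, and the names CssUrlSketch and extractUrls are made up for illustration. It ignores CSS comments, escape sequences, and the string form of @import, and it does not resolve relative URLs against the stylesheet's location.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: a regex stand-in for a real CSS parser.
class CssUrlSketch {
    // Matches url("..."), url('...'), and unquoted url(...), case-insensitively.
    private static final Pattern URL_FN = Pattern.compile(
        "url\\(\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s)'\"]+))\\s*\\)",
        Pattern.CASE_INSENSITIVE);

    // Returns the raw (unresolved) targets of every url(...) in the stylesheet text.
    static List<String> extractUrls(String css) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_FN.matcher(css);
        while (m.find()) {
            // Exactly one of the three capture groups is non-null per match.
            for (int g = 1; g <= 3; g++) {
                if (m.group(g) != null) {
                    urls.add(m.group(g));
                    break;
                }
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String css = "@font-face { src: url(\"fonts/a.woff2\"); }\n"
                   + "body { background: url(img/bg.png) no-repeat; }";
        System.out.println(extractUrls(css));
    }
}
```

A real parser is the safer choice for production crawling, since a regex like this will also match url(...) text inside comments or string literals.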

