Thanks, I was checking something with the default from the JDK...
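
One way to see the difference is a minimal sketch like the one below. It assumes the JDK's default XML DocumentBuilder rather than the Xerces HTMLDocumentImpl that the Nutch parser uses, so attribute names keep the case they had in the input. MetaAttrDemo, parseAttrs and findAttr are made-up names for illustration, not Nutch code, and findAttr uses equalsIgnoreCase instead of the toLowerCase() comparison in the quoted loop.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Nutch.
class MetaAttrDemo {

    // Case-insensitive attribute lookup, in the spirit of the loop quoted below.
    static Node findAttr(NamedNodeMap attrs, String wanted) {
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            if (attr.getNodeName().equalsIgnoreCase(wanted)) {
                return attr;
            }
        }
        return null; // no attribute with that name, in any case
    }

    // Parse a single element with the JDK's default (XML) DOM parser
    // and return the attributes of the root element.
    static NamedNodeMap parseAttrs(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getAttributes();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        NamedNodeMap attrs = parseAttrs("<META NAME=\"robots\" CONTENT=\"noindex\"/>");
        // Exact-case lookup: the attribute is stored as "NAME", so this finds nothing.
        System.out.println(attrs.getNamedItem("name"));
        // Case-insensitive lookup finds the attribute regardless of case.
        System.out.println(findAttr(attrs, "name").getNodeValue());
    }
}
```

With a case-preserving DOM like this one, getNamedItem("name") finds nothing for an upper-case NAME attribute, while the case-insensitive lookup still returns it.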
On Tue, 2006-01-10 at 11:06 +0100, Jérôme Charron wrote:
> > The following code would fail if the meta tags are in upper case:
> >
> > Node nameNode = attrs.getNamedItem("name");
> > Node equivNode = attrs.getNamedItem("http-equiv");
> > Node contentNode = attrs.getNamedItem("content");
>
> This code works well, because the Nutch HTML parser uses the Xerces
> HTMLDocumentImpl implementation, which lowercases attribute names (whereas
> element names are uppercased).
> For consistency, and to decouple the Nutch HTML parser a little from the
> Xerces implementation, I suggest replacing these lines with something like:
> Node nameNode = null;
> Node equivNode = null;
> Node contentNode = null;
> for (int i = 0; i < attrs.getLength(); i++) {
>   Node attr = attrs.item(i);
>   String attrName = attr.getNodeName().toLowerCase();
>   if (attrName.equals("name")) {
>     nameNode = attr;
>   } else if (attrName.equals("http-equiv")) {
>     equivNode = attr;
>   } else if (attrName.equals("content")) {
>     contentNode = attr;
>   }
> }
>
>
> Jérôme
>
>
> --
> http://motrech.free.fr/
> http://www.frutch.org/