Thanks, I was checking something with the default from the JDK...

On Tue, 2006-01-10 at 11:06 +0100, Jérôme Charron wrote:
> > the following code would fail in case the meta tags are in upper case
> >
> >         Node nameNode = attrs.getNamedItem("name");
> >         Node equivNode = attrs.getNamedItem("http-equiv");
> >         Node contentNode = attrs.getNamedItem("content");
> 
> This code works because the Nutch HTML parser uses the Xerces
> HTMLDocumentImpl implementation, which lowercases attribute names (while
> element names are uppercased).
> For consistency, and to decouple the Nutch HTML parser a little from the
> Xerces implementation, I suggest replacing these lines with something like:
> Node nameNode = null;
> Node equivNode = null;
> Node contentNode = null;
> for (int i=0; i<attrs.getLength(); i++) {
>   Node attr = attrs.item(i);
>   String attrName = attr.getNodeName().toLowerCase();
>   if (attrName.equals("name")) {
>     nameNode = attr;
>   } else if (attrName.equals("http-equiv")) {
>     equivNode = attr;
>   } else if (attrName.equals("content")) {
>     contentNode = attr;
>   }
> }
> 
> 
> Jérôme
> 
> 
> --
> http://motrech.free.fr/
> http://www.frutch.org/
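
For the archive, here is a small standalone sketch of the difference discussed above. It assumes the JDK's default XML DocumentBuilder (not the Xerces HTMLDocumentImpl that Nutch uses), which keeps attribute names exactly as written. The class and method names (MetaAttrLookup, attrValueExact, attrValueIgnoreCase) are made up for illustration and are not part of Nutch.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only (not Nutch code).
class MetaAttrLookup {

  // Parse a single-element snippet with the JDK's default XML parser
  // and return the attributes of the root element.
  static NamedNodeMap rootAttrs(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement()
          .getAttributes();
    } catch (Exception e) {
      throw new IllegalStateException("could not parse snippet", e);
    }
  }

  // Exact-name lookup, as in the original code: case-sensitive.
  static String attrValueExact(String xml, String name) {
    Node attr = rootAttrs(xml).getNamedItem(name);
    return attr == null ? null : attr.getNodeValue();
  }

  // Loop-based lookup, as in the suggested change: ignores case.
  static String attrValueIgnoreCase(String xml, String name) {
    NamedNodeMap attrs = rootAttrs(xml);
    for (int i = 0; i < attrs.getLength(); i++) {
      Node attr = attrs.item(i);
      if (attr.getNodeName().equalsIgnoreCase(name)) {
        return attr.getNodeValue();
      }
    }
    return null;
  }

  public static void main(String[] args) {
    String meta = "<META NAME=\"robots\" CONTENT=\"noindex\"/>";
    System.out.println(attrValueExact(meta, "name"));
    System.out.println(attrValueIgnoreCase(meta, "name"));
  }
}
```

With a parser that preserves case, the exact lookup finds nothing for an upper-case NAME attribute, while the loop still does. The sketch uses equalsIgnoreCase rather than toLowerCase() so the comparison does not depend on the default locale (in a Turkish locale, for example, "I" lowercases to a dotless "ı").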

