RE: [htdig] laola2html.pl

Holmes, Gregory Wed, 11 Jul 2001 12:21:01 -0700

The garbage in document excerpts fixed itself. I have no idea how. But I've done three full site indexes now and it doesn't happen anymore.

It was only in excerpts; I'm displaying both meta descriptions and excerpts, having hacked 3.1.5 (with help) to make both available to the templates separately. So the garbage characters only ever happened in excerpts, not meta descriptions.

Don't know what, if anything, I did to fix it. Seems like I must have tried deleting the databases before, but maybe not. Maybe that's what fixed it.

Anyway, knock on wood, it is working now :) Word documents indexed, meta information (where provided by the authors), and only using freeloading, ah, free, software. Does wp2html provide meta information in its output, by the way? Maybe it would have been worth it to buy that from my own pocket, instead of working all this out ;)

-----Original Message-----
From: David Adams [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 09, 2001 5:32 AM
To: Holmes, Gregory
Cc: [EMAIL PROTECTED]
Subject: Re: [htdig] laola2html.pl

One more wild guess: is ldat the problem?

You are seeing the non-ASCII characters in the excerpt.
Have you got htdig configured to show the META description as the excerpt?
If so, is the META description being created by laola2html.pl using the document summary?
Could ldat be returning garbage for the document summary in some cases,
e.g. when the author of the document hasn't provided a summary?

--
David Adams
Computing Services
Southampton University
