Re: [htdig] Avoiding search on file name

2001-01-18 Thread Loys Masquelier
In fact, it seems that htsearch results include directories and files where the searched word is inside the directory or file name. Ex : /foo/foo.html Searched word : foo Result : /foo /foo/foo.html Is there a way to prevent htsearch from finding those directories and files? Thanks. Loys. Gilles

[htdig] Indexing a given list of file

2001-01-18 Thread Loys Masquelier
Hello, I want to check that it is not possible to index a list of changed files without reindexing all the data. In fact the situation is that I know that that list of files needs to be reindexed and I want to do that as fast as possible. Thanks in advance. Loys. --

Re: [htdig] Merging two databases

2001-01-18 Thread Berthold Cogel
Geoff Hutchison wrote: On Tue, 9 Jan 2001, Peterman, Timothy P wrote: I have a related question. Can I merge more than two databases at a time? Not at the moment. -- -Geoff Hutchison Williams Students Online http://wso.williams.edu/ To

[htdig] Spelling Help

2001-01-18 Thread David Adams
I am trying to do what I can to help those with spelling difficulties perform searches on our web pages. This was triggered by seeing in the htsearch log that attempts to find "accomodation" were finding some pages, but not the important ones (where it is spelt correctly)! Also this University

[htdig] Re: Htdig ?

2001-01-18 Thread K
Thank you Yours Kamel. - Original Message - From: "Geoff Hutchison" [EMAIL PROTECTED] To: "K" [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Thursday, January 18, 2001 2:28 PM Subject: Re: Htdig ? At 3:14 PM + 1/18/01, K wrote: Hi, I've found your e-mail into htdig site. I

[htdig] Re: Htdig ?

2001-01-18 Thread Geoff Hutchison
At 3:14 PM + 1/18/01, K wrote: Hi, I found your e-mail address on the htdig site. I hope you have a solution to my problem. I have a database of URLs. I would like to use htdig for quick searches of my database; how can I index all my URLs? Could I index an access file, for example? Thank u

Re: [htdig] Spelling Help

2001-01-18 Thread Geoff Hutchison
At 1:34 PM + 1/18/01, David Adams wrote: 1)What have other sites done to address this problem? (Spell checking and correcting our own Use good fuzzy methods, including the synonym file. We are working on additional fuzzy matching code, but of course if anyone can come up with sample
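
As a rough illustration of the kind of fuzzy matching discussed in this thread, here is a minimal sketch that suggests correctly spelt index words for a misspelt query. It is not ht://Dig's own fuzzy code or its synonym file mechanism; it uses Python's standard difflib, and the word list and the function name suggest_spellings are made up for the example.

```python
import difflib

# Hypothetical list of words known to appear in the indexed pages.
INDEXED_WORDS = ["accommodation", "admissions", "library", "prospectus"]

def suggest_spellings(query_word, known_words, limit=3, cutoff=0.8):
    """Return known words that closely resemble query_word, best match first."""
    return difflib.get_close_matches(
        query_word.lower(), known_words, n=limit, cutoff=cutoff
    )

if __name__ == "__main__":
    # The misspelling reported in the htsearch log.
    print(suggest_spellings("accomodation", INDEXED_WORDS))
```

A site could use suggestions like these to offer a "did you mean" link, or to decide which misspellings are worth listing as synonyms.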

Re: [htdig] Changing sites list midway

2001-01-18 Thread Geoff Hutchison
At 11:49 AM -0600 1/17/01, htdighelp wrote: Is there some way to restart htdig so that it re-reads both the conf and the url list and just continues on? Not really. If you use the -l flag, you can kill htdig and it will write out its progress to a file and re-read it the next time it's called

Re: [htdig] Merging two databases

2001-01-18 Thread Geoff Hutchison
At 1:28 PM +0100 1/18/01, Berthold Cogel wrote: Is it possible to do the merging step twice? Is it possible to 'cascade' this step? Oh sure. It's probably most effective to work out some sort of "tree" of merges if you want to do it efficiently. You just can't merge more than two in one
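
The "tree" of merges mentioned here can be sketched as a small planning routine: since only two databases can be merged in one step, pair them up round by round until one remains. This is only a planning sketch under that assumption; it does not call htmerge, and the database names and the plan_pairwise_merges helper are hypothetical.

```python
def plan_pairwise_merges(databases):
    """Plan two-at-a-time merges as a tree.

    Returns a list of rounds. Each round is a list of
    (left, right, merged_name) steps that could run independently.
    """
    rounds = []
    current = list(databases)
    counter = 0
    while len(current) > 1:
        steps = []
        next_level = []
        # Pair neighbours: (0, 1), (2, 3), ...
        for i in range(0, len(current) - 1, 2):
            counter += 1
            merged = f"merged{counter}"
            steps.append((current[i], current[i + 1], merged))
            next_level.append(merged)
        # With an odd count, the last database waits for a later round.
        if len(current) % 2 == 1:
            next_level.append(current[-1])
        rounds.append(steps)
        current = next_level
    return rounds

if __name__ == "__main__":
    for number, steps in enumerate(plan_pairwise_merges(["a", "b", "c", "d", "e"]), 1):
        print(f"round {number}: {steps}")
```

For n databases this always takes n - 1 merge steps; the tree shape only reduces how many times the largest intermediate results are rewritten.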

Re: [htdig] Avoiding search on file name

2001-01-18 Thread Gilles Detillieux
According to Loys Masquelier: In fact, it seems that htsearch results include directories and files where the searched word is inside the directory or file name. Ex : /foo/foo.html Searched word : foo Result : /foo /foo/foo.html Is there a way to prevent htsearch from finding those directories

Re: [htdig] Spelling Help

2001-01-18 Thread Gilles Detillieux
According to Geoff Hutchison: At 1:34 PM + 1/18/01, David Adams wrote: 1)What have other sites done to address this problem? (Spell checking and correcting our own Use good fuzzy methods, including the synonym file. We are working on additional fuzzy matching code, but of course

Re: [htdig] Indexing a given list of file

2001-01-18 Thread Gilles Detillieux
According to Loys Masquelier: I want to check that it is not possible to index a list of changed files without reindexing all the data. In fact the situation is that I know that that list of files needs to be reindexed and I want to do that as fast as possible. You may be out of luck with

Re: [htdig] Memory requriements

2001-01-18 Thread Gilles Detillieux
According to Pat Lennon: I have a Linux box with approx 1 gig of html and pdf books. I want to use htdig for the search engine. I don't want to assume too much, but will 1 additional gig of hard disk cover the size of the index database? I figure double may be a safe starting point. Also what

Re: [htdig] Memory requriements

2001-01-18 Thread Ing. Noel Vargas Baltodano
I have a Linux box with approx 1 gig of html and pdf books. I want to use htdig for the search engine. I don't want to assume too much, but will 1 additional gig of hard disk cover the size of the index database? I figure double may be a safe starting point. Also what type of memory

Re: [htdig] Indexing a given list of file

2001-01-18 Thread Geoff Hutchison
On Thu, 18 Jan 2001, Gilles Detillieux wrote: There was talk of adding to the 3.2 code a feature whereby you can tell htdig not to recheck all the indexed documents, but only check a given list of URLs. I don't remember if this feature is already in the current development snapshots. Yes.

[htdig] Using htdig to just collect/gather the pages

2001-01-18 Thread Mark Friedman
Is there a way to configure htdig to be used to just spider and collect pages and documents without doing any of the index/search related stuff? Thanks in advance. -Mark

[htdig] HOWTO? setp-by-step?

2001-01-18 Thread Geordon VanTassle
Does someone have a step-by-step HOWTO for ht://Dig ? I have downloaded and installed (configure;make;make install) everything and it is installed right. I just can't seem to get the bloody thing to WORK! Or, is there an issue using it on a webserver that uses the Microsoft FrontPage

Re: [htdig] HOWTO? setp-by-step?

2001-01-18 Thread Geordon VanTassle
It's version 3.1.5, on SuSE Pro 7.0 kernel 2.2.16. Apache version 1.3.14, PHP 4.0.4, FrontPage extensions 4.0.4.3 Geordon Original Message On 1/18/01, 2:29:14 PM, "Ing. Noel Vargas Baltodano" [EMAIL PROTECTED] wrote regarding Re: [htdig] HOWTO? setp-by-step?: Hi Gordon: First of

Re: [htdig] HOWTO? setp-by-step?

2001-01-18 Thread Ing. Noel Vargas Baltodano
Geordon I suppose that apache is running fine, right? The next question would be, what exactly do you want to do with htdig? Just have a search engine for your site or what? Geordon VanTassle wrote: It's version 3.1.5, on SuSE Pro 7.0 kernel 2.2.16. Apache version 1.3.14, PHP 4.0.4

[htdig] Htdig 3.20b3 -- installation problems.

2001-01-18 Thread Sphboc
I've been able to successfully install (and execute, giving valid results), the 1/14/01 snapshot of this. To get this to work, however, I had to invoke the "--without-zlib" option. I obtained zlib113, uploaded it to the server, decompressed, and attempted to compile. (I do NOT have authority

Re: [htdig] HOWTO? setp-by-step?

2001-01-18 Thread Geordon VanTassle
That is correct: Apache works JUST fine. :) And yes, I just want to have ht://Dig index my server and use it as a basic search. I have it installed, like I said, and I can get to the SEARCH page, as well as enter something in the box and have the CGI run. However, it doesn't seem to FIND

[htdig] Other search tools?

2001-01-18 Thread Mike Paradis
Just wondering, are there any other tools that will search the style of output that htdig results are in? Mike

Re: [htdig] Other search tools?

2001-01-18 Thread Geoff Hutchison
On Thu, 18 Jan 2001, Mike Paradis wrote: Just wondering, are there any other tools that will search the style of output that htdig results are in? I'm not sure what you mean. You make it sound like you want to parse the output. So yes, there are a variety of PHP, Perl, Java, etc. wrappers to

Re: [htdig] HOWTO? setp-by-step?

2001-01-18 Thread Geoff Hutchison
On Thu, 18 Jan 2001, Geordon VanTassle wrote: /opt/www which is fine for me. In /opt/www/db there are the files, and they have "size," so I know that there is something IN them. Who should When you say it doesn't "find anything," does it come up with an error? Have you tried running

[htdig] Strategy for a dynamic site

2001-01-18 Thread Richard Seymour
I'm working on a site where some pages will be dynamically generated using java servlets and many static pages will be linked to only through dynamically generated content, or through javascript. I'm wondering what the best strategy for implementing htdig on a site like this would be. I was

[htdig] Multiple kinds of links?

2001-01-18 Thread Richard Seymour
In a site I'm working on, we have some links that are standard HTML (click on it, current page is replaced in the browser). We also have some links that run a javascript function which opens a popup window, leaving the current browser window open as well. Of course, since the javascript links

Re: [htdig] Spelling Help

2001-01-18 Thread Dave Salisbury
okay, here's one for the gurus. I'd like to be able to preserve user state, which is held in the query string. So my idea is to return just the urls from a search that match the state of the user. Basically, we have a ?lang=en or ?lang=fr, and since many of our pages are not translated yet,
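
One way to picture the idea of returning only URLs that match the user's state is a post-filter over the result URLs, keeping those whose query string carries the wanted lang value. This is a sketch of that filtering step only, not something htsearch does itself; the example.com URLs and the filter_by_lang helper are invented for illustration.

```python
from urllib.parse import parse_qs, urlsplit

def filter_by_lang(urls, lang, param="lang"):
    """Keep only URLs whose query string has param set to lang."""
    kept = []
    for url in urls:
        values = parse_qs(urlsplit(url).query).get(param, [])
        if lang in values:
            kept.append(url)
    return kept

if __name__ == "__main__":
    # Hypothetical search results.
    results = [
        "http://www.example.com/page.html?lang=en",
        "http://www.example.com/page.html?lang=fr",
        "http://www.example.com/other.html",
    ]
    print(filter_by_lang(results, "fr"))
```

URLs with no lang parameter are dropped here; a site with many untranslated pages might prefer to keep them as a fallback.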

Re: [htdig] HOWTO? setp-by-step?

2001-01-18 Thread Geordon VanTassle
Ok, I took a look at it again, and everything can be read by the Webserver. When I run the htsearch from either the command line OR the HTML interface, it comes back and says that there were no matches found. Now, granted, there's not a lot on my site at this point, but when I searched

Re: [htdig] HOWTO? setp-by-step?

2001-01-18 Thread Ing. Noel Vargas Baltodano
Geordon You have to edit the htdig.conf file according to your directory structure (start url, etc.) and your needs. Then you run htdig and htmerge. Try running htdig with the -i option to re-do the database from scratch and the -vvv option to see exactly what it is indexing. -- Noel Vargas
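
The sequence described above (run htdig from scratch with verbose output, then htmerge) could be wrapped in a small script like the one below. It is a sketch under assumptions: the -i and -vvv options come from the message above, while passing the configuration file with -c and the path /path/to/htdig.conf are assumptions for this example, so adjust them to the local installation. By default it only prints the commands.

```python
import subprocess

def build_commands(conf_path, from_scratch=True, verbosity=3):
    """Build the htdig and htmerge command lines for one indexing run."""
    htdig = ["htdig", "-c", conf_path]       # -c: config file (assumed option)
    if from_scratch:
        htdig.append("-i")                   # rebuild the database from scratch
    if verbosity > 0:
        htdig.append("-" + "v" * verbosity)  # e.g. -vvv to see what is indexed
    htmerge = ["htmerge", "-c", conf_path]
    return [htdig, htmerge]

def run_all(commands, dry_run=True):
    """Print the commands, or run them in order and stop on the first failure."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # Placeholder path, not taken from the thread.
    run_all(build_commands("/path/to/htdig.conf"), dry_run=True)
```

Running with dry_run=True first makes it easy to check the command lines before touching an existing database.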

[htdig] append database problem

2001-01-18 Thread Julian C. Dunn
I have an existing, working htdig database, and I wanted to add more data to it that's indexed from another URL, using a different configuration file. The reason I used a different configuration file is because I want to run the two indexing runs at different times, using a cron job. Anyway, what