Hi Gilles,

1. With regard to the different versions and their functionality:

We are currently using ht://Dig 3.2.0b3 (Beta) and would like to know if you can help us with the differences between the newer 3.2.0b3 (Beta) version and the 3.1.6 version you recommended. (You wrote: See http://www.htdig.org/attrs.html#ignore_alt_text (requires version 3.1.6).) What would downgrading do at this point? Is that the solution? We have added the "ignore alt" setting (ignore_alt_text) to our 3.2.0b3 (Beta) configuration and have had no luck getting it to work.

We are trying a test page today with "noindex_start" and "noindex_end" to see whether this might be an alternate way to work around the issue. We should know more tomorrow once the page has been indexed. Any thoughts on this?

2. Also, I saw the following on ht://Dig's website. How would this work for searching our dynamic pages?

    1.18. Can I use ht://Dig to index and search an SQL database?

    You can if your database has a web-based front end that can be "spidered" by ht://Dig. The requirement is that every search result must resolve to a unique URL which can be accessed via HTTP. The htdig program uses these URLs, which you feed it via the start_url attribute, to fetch and index each page of information. The search results will then give a list of URLs for all pages that match the search terms.

    If you don't have such a front end to your database, or the search results must be given as something other than URLs, then ht://Dig is probably not the best way of dealing with this problem: you may be better off using an SQL query engine that works directly on your own database, rather than building a separate ht://Dig database for searching.

    Ted Stresen-Reuter had the following tips: "In my case, because I like htdig's ability to rank results (and that ranking can be modified), I created an index page that simply walks through each record and indexes each record (with next and previous links so the spider can read all the records). And then I do one other thing: I make the <title> tag start with the unique ID of each record. Then, when I'm parsing the search results, I do a lookup on the database using the title tag as the key."

3. Mainly for testing purposes: Do you know how often these pages get indexed? Is there any way to speed this up? Can pages be searched by users while indexing is in progress?

Please advise. Thank you very much in advance for your time and help on these matters.

Sincerely,
Rinat

>According to rinat uzan:
>> 1. Can we ignore content in HTML tags (esp. alt tags)?
>
>See http://www.htdig.org/attrs.html#ignore_alt_text (requires version 3.1.6)
>and http://www.htdig.org/FAQ.html#q4.15
>
>> 2. Restrict directories, exclude file types (or files with certain
>> names eg.header.html)?
>
>See http://www.htdig.org/attrs.html#exclude_urls,
>    http://www.htdig.org/attrs.html#limit_urls_to,
>and http://www.htdig.org/FAQ.html#q4.20
>
>> 3. Can we search secure folders?
>
>See http://www.htdig.org/htdig.html (-u option)
>and http://www.htdig.org/attrs.html#authorization
>
>> 4. How do we have the ability to search PDF's?
>
>See http://www.htdig.org/FAQ.html#q4.9
>
>> 5. What kind of access overall? Because we want to be able to search
>> dynamic results from our database and be able to merge results from
>> our search to results displayed on the same page as HTDig's?
>
>This would require some sort of wrapper script that calls htsearch to
>query the ht://Dig database, plus calls your own database search engine,
>and somehow combines the results. Depending on how "merged" you want the
>results to be, this might be quite easy or a bit more involved.
>
>See http://www.htdig.org/FAQ.html#q4.7
>and http://www.htdig.org/FAQ.html#q4.11
>
>--
>Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
>Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
>Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

--
Rinat Uzan
Producer
Supernova Productions, Inc.
33 W. 17th St., 7th Floor
New York, NY 10011
212-633-2222 ext. 235
http://www.supernovainc.com

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html

