[htdig] perl interface.

2000-12-05 Thread Gary Artim

Folks,
I downloaded the htdig perl interface from sourceforge.net.
I'm wondering if this module gives you any ability to
set up your own calls to htdig and present the results
as you like (i.e., returns a list of URLs and their ratings).
If it doesn't, could someone point me to a perl
module that does, if one exists...
Thanks for any ideas,
perl hack, not c++ hack, newbie to htdig
gary

Gary Artim
[EMAIL PROTECTED]
510.540.5071






To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.
List archives:  http://www.htdig.org/mail/menu.html
FAQ:            http://www.htdig.org/FAQ.html




[htdig] C++

2000-12-05 Thread Bill Vick

How can we allow the user to search for the word
'C++'?

We are stumped. Is this just something htdig is not
capable of doing?

Thanks

=
Bill Vick
972-612-8425





Re: [htdig] C++

2000-12-05 Thread Geoff Hutchison

On Tue, 5 Dec 2000, Bill Vick wrote:

> How can we allow the user to search for the word
> 'C++'?

See http://www.htdig.org/attrs.html#extra_word_characters

e.g.

extra_word_characters: +
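
To see why this attribute matters, here is a minimal Python sketch of a word splitter that treats only letters and digits as word characters unless extra ones are supplied. It is an illustration of the idea only, not ht://Dig's actual parsing code; the function name `split_words` is made up for this example.

```python
import re

def split_words(text, extra_word_characters=""):
    """Split text into lowercase words made of letters, digits and any extra characters."""
    # Character class: alphanumerics plus the (escaped) extra word characters.
    allowed = "A-Za-z0-9" + re.escape(extra_word_characters)
    return [word.lower() for word in re.findall("[" + allowed + "]+", text)]

# Without extra characters, "C++" collapses to "c" and cannot be told apart from "C".
default_words = split_words("Learn C++ and C today")
# With "+" accepted as a word character, "c++" survives as its own word.
plus_words = split_words("Learn C++ and C today", extra_word_characters="+")

print(default_words)
print(plus_words)
```

With "+" accepted as a word character, "c++" and "c" become distinct words, which is what makes a search for 'C++' possible.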

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/







Re: [htdig] Named characters in search output

2000-12-05 Thread Gilles Detillieux

According to Tamas Nagy:
> Hello,
>
> When using the "&rarr;" (right arrow) named character in the first part of HTML
> documents, htdig seems to generate "&amp;rarr; rom&aacute;" in the preview
> of documents. It is a bit strange, maybe a bug, because this string should
> generate a right arrow...
>
> Cheers,
>
> Tamas
>
> PS:
> Config: HtDig 3.0.2b2, RedHat 7

I assume you mean 3.2.0b2.  This is a known problem, which is fixed in
the 3.2.0b3 development snapshots.  See http://www.htdig.org/FAQ.html#q5.22

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] How Can I use htdig to index two or more websites?

2000-12-05 Thread Gilles Detillieux

According to Sean Harris:
> How can I use htdig to index two or more websites?
> Thank you for your help! :-)

Just add all the URLs you want to the start_url attribute, and possibly
adjust limit_urls_to if you want something less limiting than what you've
put in start_url.
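
As a rough illustration of how the two attributes interact, the Python sketch below accepts a URL only if it contains one of the limit_urls_to patterns. It assumes case-insensitive substring matching and that limit_urls_to follows start_url unless changed, as the reply implies; the site names are hypothetical and this is not ht://Dig's own code.

```python
def within_limits(url, limit_urls_to):
    """Return True if the URL contains at least one of the limiting patterns."""
    url = url.lower()
    return any(pattern.lower() in url for pattern in limit_urls_to)

# Hypothetical sites, used only for illustration.
start_url = ["http://www.example.com/", "http://docs.example.org/"]

# Assumption: unless adjusted, the limits are the same as the starting URLs.
limit_urls_to = list(start_url)

print(within_limits("http://docs.example.org/guide.html", limit_urls_to))
print(within_limits("http://other.example.net/page.html", limit_urls_to))

# A less limiting pattern lets the indexer follow links to any example.org host.
wider_limits = limit_urls_to + ["example.org"]
print(within_limits("http://shop.example.org/", wider_limits))
```

Both starting sites are indexed because each URL found must match at least one pattern, not all of them.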

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] Help me to Search using Chinese!!!!!!!

2000-12-05 Thread Gilles Detillieux

According to Sean Harris:
> Help me to search using Chinese!!!

I'm afraid the answer hasn't really changed from 2-1/2 weeks ago.
ht://Dig only supports 8-bit character sets.
See http://www.htdig.org/FAQ.html#q4.10

This topic has been discussed many times on the list, and there are still
no volunteers to take on the huge amount of work it would require to
adapt ht://Dig for full Unicode support, and to add in the word splitting
algorithms needed for many Asian languages.
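
The two obstacles can be seen in a few lines of Python. This is only a sketch of the general problem, using Latin-1 as a stand-in for "an 8-bit character set" and a sample phrase chosen for this example; it says nothing about ht://Dig internals.

```python
text = "搜索引擎"  # "search engine" in Chinese, sample text for this example only

def fits_in_8bit(s, charset="latin-1"):
    """Return True if every character of s can be encoded in the single-byte charset."""
    try:
        s.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

print(fits_in_8bit("recherche"))   # Western European text fits in one byte per character
print(fits_in_8bit(text))          # Chinese characters do not
print(len(text.encode("utf-8")))   # bytes needed for the four characters in UTF-8

# Splitting on whitespace finds no word boundaries: the whole phrase is one "word".
print(text.split())
```

The last line shows why a separate word-splitting algorithm is needed: there are no spaces to split on.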

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] Do me a favor

2000-12-05 Thread Gilles Detillieux

According to ellenliu:
> I have downloaded the program 'htdig-3.2.0b2.tar' from your site,
> but I have trouble running it on my personal computer.
> My computer has 'Red Hat 6.2' installed, with kernel 2.2.14.
> However, when I run './configure', at line 993 it calls 'config.sub', and then
> the program exits with the message 'can't run config.sub'.
> Would you do me a favor and tell me why this happened and, most importantly,
> how I can run it successfully?
> Moreover, when should the embedded database be compiled, and how is it
> compiled?
> Hardware configuration:
>   CPU: Pentium processor 550
>   Hard disk: 20G
>   Memory: 64M

It would probably be helpful to see the full output from the ./configure
program.  This package has been successfully installed before on Red Hat
systems (6.1, 6.2 and others), so I would think that the most likely
problem is a missing component on your system.  You may also want to try
the most recent development snapshot of 3.2.0b3, instead of 3.2.0b2, as
many known bugs in 3.2.0b2 have since been fixed.

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] restrict values and htdig.conf

2000-12-05 Thread Gilles Detillieux

According to [EMAIL PROTECTED]:
> We have 3 htdig searches on our website.
> There are 3 different databases that are indexed:
> ../htdig/db/database1 indexed with ../conf/htdig1.conf
> ../htdig/db/database2 indexed with ../conf/htdig2.conf
> ../htdig/db/database3 indexed with ../conf/htdig3.conf
>
> database3 includes all sites, while the other 2 databases contain only parts
> of the whole.
>
> Now I want to expand the HTML form with a select option as follows:
>
> <select name="restrict">
> <option value="" selected>.. Database1</OPTION>
> <option value="http://www.../">on Database2
> <option value="http://www.../">on Database3
> </OPTION>
> </SELECT>
>
> o.k.!
>
> But how can I use this restrict value in my htdig.conf?
> According to the selection in the HTML form I must call the right htsearch with
> the right database!

You seem to be confusing two alternate methods of restricting search
results.  You use the restrict parameter on htsearch only when searching
a database that contains everything, in order to restrict the results
to a subset of that database, i.e. only the URLs that match a particular
pattern.

If you want the user to select separate databases, then you should leave
the restrict input parameter as an empty string, and have the user select
the value of the "config" input parameter, which should be one of htdig1,
htdig2 or htdig3, i.e. the three configuration files you mentioned above
with the directory and .conf file name extension stripped off.
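
To make the difference concrete, here is a small Python sketch that builds the query string a search form would submit in each case. The `config`, `restrict` and `words` input parameters are the ones discussed above; the `/cgi-bin/htsearch` location, the helper name and the example restrict URL are assumptions made for this illustration.

```python
from urllib.parse import urlencode

HTSEARCH_URL = "/cgi-bin/htsearch"  # assumed CGI location; adjust for your server

def search_url(words, config, restrict=""):
    """Build the URL a form submission with these input parameters would request."""
    params = {"config": config, "restrict": restrict, "words": words}
    return HTSEARCH_URL + "?" + urlencode(params)

# Separate databases: pick the config file by name and leave restrict empty.
by_config = search_url("annual report", config="htdig2")

# One database containing everything: keep one config and narrow by URL pattern.
by_restrict = search_url("annual report", config="htdig3",
                         restrict="http://www.example.com/docs/")

print(by_config)
print(by_restrict)
```

In the form itself, this corresponds to naming the select element "config" with the values htdig1, htdig2 and htdig3, instead of naming it "restrict".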

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






[htdig] Indexing never ends ...

2000-12-05 Thread Zon Hisham Bin Zainal Abidin

I am trying to index a directory containing PHP scripts that generate
dynamic web pages from records in a MySQL database.

The scripts generate HTML pages that contain categories/subcategories
from different states and towns, and are called with something like this:

category.phtml?catcode=ACC&subcatcode=ACC-DEV&statecode=STATE1&TOWN=T1

I started the indexing at 11pm last night and it still had not finished at 8am
this morning. There are only 20 categories in the category table, 120
subcategories in the subcategory table, 15 states in the state table and
152 towns in the town table.

From the console output, htdig seems to be running in an infinite loop. I wonder
why.

As a clue, I suspect that the FOOTER in category.phtml is causing the
problem.
The footer consists of all the states, each with a hyperlink that points to:

category.phtml?catcode=ACC&subcatcode=ACC-DEV&statecode=STATE1 for
STATE1
category.phtml?catcode=ACC&subcatcode=ACC-DEV&statecode=STATE2 for
STATE2, and so on and so forth.


And of course the footer will (dynamically) CHANGE when a different
category/subcategory is chosen:
category.phtml?catcode=BUS&subcatcode=BUS-AUT&statecode=STATE1 for
STATE1
category.phtml?catcode=BUS&subcatcode=BUS-AUT&statecode=STATE2 for
STATE2, and so on.


I have tried referring to the manual, but it's getting me nowhere.

I hope my explanation is clear and precise. I would appreciate your kind
advice.


best regards,
Zon






Re: [htdig] Indexing never ends ...

2000-12-05 Thread Geoff Hutchison

On Wed, 6 Dec 2000, Zon Hisham Bin Zainal Abidin wrote:

> I started the indexing at 11pm last night and it still had not finished at 8am
> this morning. There are only 20 categories in the category table, 120
> subcategories in the subcategory table, 15 states in the state table and
> 152 towns in the town table.

Well, it's not clear if you can match these independently, but if you
could this would be "only"
20*120*15*152 = 5,472,000
pages, which in my mind would take some time. Even just 120*15*152 gives 273,600
pages. To index the latter in 9 hours would require indexing an average of
30,400 pages in an hour or better than 8 pages a second. (!)
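
The estimate above is easy to reproduce. The short Python sketch below redoes the arithmetic with the table sizes from the original message, assuming (as the estimate does) that every combination yields a distinct page; the variable names are chosen for this example.

```python
# Table sizes quoted in the original message.
categories, subcategories, states, towns = 20, 120, 15, 152

# Every combination of all four parameters.
all_combinations = categories * subcategories * states * towns

# Leaving the category out, since each subcategory may already imply its category.
without_categories = subcategories * states * towns

hours = 9  # 11pm to 8am
pages_per_hour = without_categories / hours
pages_per_second = pages_per_hour / 3600

print(all_combinations)
print(without_categories)
print(pages_per_hour)
print(round(pages_per_second, 2))
```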

> And of course the footer will (dynamically) CHANGE when a different
> category/subcategory is chosen:
> category.phtml?catcode=BUS&subcatcode=BUS-AUT&statecode=STATE1 for
> STATE1
> category.phtml?catcode=BUS&subcatcode=BUS-AUT&statecode=STATE2 for
> STATE2, and so on.

OK. But I don't see how this would necessarily lead to an infinite loop.
One thing to look for is whether the indexing is generating two URLs that
lead to the same page, e.g.:

category.phtml?catcode=BUS&subcatcode=BUS-AUT&statecode=STATE2
category.phtml?catcode=BUS&statecode=STATE2&subcatcode=BUS-AUT

To htdig, these are different, but they are probably the same to your
code. But from your description, you haven't given any sense that this is
happening, just that this seems to be taking longer than you expect.
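
One way to check for this is to normalize the URLs from the indexing log and look for duplicates. The Python sketch below sorts the query parameters so that two orderings compare equal; it assumes the parameters are separated by `&` as in a normal query string, and the helper name is hypothetical. It is a diagnostic aid for your own logs, not something htdig does.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def canonical_url(url):
    """Return the URL with its query parameters sorted by name and value."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))

a = "category.phtml?catcode=BUS&subcatcode=BUS-AUT&statecode=STATE2"
b = "category.phtml?catcode=BUS&statecode=STATE2&subcatcode=BUS-AUT"

print(a == b)                                # different strings, so different to an indexer
print(canonical_url(a) == canonical_url(b))  # same page once the order is normalized
```

If many logged URLs collapse to the same canonical form, generating the links in one consistent parameter order in the PHP footer would cut the number of pages to index.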

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/








Re: [htdig] perl interface.

2000-12-05 Thread Geoff Hutchison

On Tue, 5 Dec 2000, Gary Artim wrote:

> set up your own calls to htdig and present the results
> as you like (i.e., returns a list of URLs and their ratings).
> If it doesn't, could someone point me to a perl
> module that does, if one exists...

As things stand now, the easiest way to do this is to write a Perl
"wrapper." See for example contrib/ewswrap.cgi or others in the
contributed section of the website:

http://www.htdig.org/contrib/

Ideally there would be a Perl XS interface to the ht://Dig code, but
unless someone steps forward to do that, it's not likely to be done
anytime soon.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/







[htdig] Re: detailed information

2000-12-05 Thread Geoff Hutchison


Hi there,

I'm assuming you picked my name as the contact for the ht://Dig search
engine package. It is a UNIX search engine, but it is not based on Oracle.
In most cases, if you're looking for a way to search an Oracle database,
it's often better to hire an Oracle consultant to write a custom search
package that is specifically addressed to your database schemas.

If, on the other hand, you're looking for a general-purpose,
open-source* web search package, feel free to browse the information on
ht://Dig at:
http://www.htdig.org/

*Specifically, ht://Dig is covered under the GNU GPL and is free software.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/


On Tue, 5 Dec 2000, Pan, Belinda wrote:

> Hello,
>
> We are looking for a powerful search engine application based on the UNIX
> platform and Oracle. Please send us detailed information about your
> products.
>
> We would appreciate your response at your earliest convenience.
>
> Belinda Pan
> Sr. Webmaster
> [EMAIL PROTECTED]
> 416.538.7538 ext 270
 







Re: [htdig] Re: detailed information

2000-12-05 Thread Geoff Hutchison

On Tue, 5 Dec 2000, Geoff Hutchison wrote:

> If, on the other hand, you're looking for a general-purpose,
> open-source* web search package, feel free to browse the information on
> ht://Dig at:
> http://www.htdig.org/

Sorry, I couldn't resist the urge to throw in some buzzwords. :-)

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/







[htdig] Re:

2000-12-05 Thread Geoff Hutchison

At 4:55 PM +0100 12/5/00, Roberta Minneci wrote:
> How do I restrict a search to word out <script language="JavaScript">
> </script>?

See http://www.htdig.org/attrs.html#noindex_start
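
The idea behind noindex_start and its companion noindex_end is that text between the two markers is skipped during indexing. The Python sketch below imitates that effect on a sample page, using `<script` and `</script>` as hypothetical marker values; the case-insensitive matching and the function name are assumptions of this example, and it is not ht://Dig's own implementation.

```python
import re

def strip_noindex(html, noindex_start, noindex_end):
    """Remove every section running from noindex_start through noindex_end."""
    # Non-greedy match so each section ends at the nearest end marker.
    pattern = re.escape(noindex_start) + r".*?" + re.escape(noindex_end)
    return re.sub(pattern, "", html, flags=re.IGNORECASE | re.DOTALL)

page = '<p>Intro</p><script language="JavaScript">var url="";</script><p>Body</p>'

indexable = strip_noindex(page, "<script", "</script>")
print(indexable)
```

Only the text outside the markers is left to be indexed, so the JavaScript source no longer shows up as searchable words.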

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/






[htdig] Pb indexing HTML with htdig 3.1.5

2000-12-05 Thread André LAGADEC

Hello,

I use htdig 3.1.5 on a Red Hat Linux 5.0, and I want to index a new web
site. But when I run rundig I get only one document.

To see what it is doing, I ran rundig -vvv and got this output:
Header line: HTTP/1.1 200 OK
Header line: Server: Netscape-Enterprise/3.5.1C
Header line: Date: Wed, 06 Dec 2000 07:32:02 GMT
Header line: Content-type: text/html
Header line: Last-modified: Mon, 15 Nov 1999 10:45:01 GMT
Translated Mon, 15 Nov 1999 10:45:01 GMT to 1999-11-15 10:45:01 (99)
And converted to Mon, 15 Nov 1999 10:45:01
Header line: Content-length: 1258
Header line: Accept-ranges: bytes
Header line: Connection: close
Header line: 
returnStatus = 0
Read 1258 from document
Read a total of 1258 bytes
Tag: html, matched -1
head:  
 size = 1258
pick: x.y.z.t, # servers = 1
htdig: Run complete
htdig: 1 server seen:
htdig: x.y.z.t:8000 1 document

I think that htdig doesn't like the HTML code "<!--//" and "//-->": it
seems to see the beginning of the comment but not the end, and ignores the
rest of the page's HTML code.

Am I right? Any other ideas? What can I do?

N.B.: The HTML code of the first page of the site is below this line.
_
<html>

<head>
<title>Accueil DIRECTION</title>
<base target="rtop">
<script language="JavaScript">
<!--//
var url="";
var nom="";
var bName="";

function Ouvrir()
{
bName = navigator.appName
Version = navigator.appVersion
Version = Version.substring(0,1)
browserOK = ((Version >= 2))

if (browserOK) 
{
this.name="home";

msgWindow=window.open("actu/default2.htm","popupdpd","location=no,toolbar=no,status=no,directories=no,scrollbars=yes,width=400,height=450");
bName=navigator.appName;
if (bName=="Netscape") msgWindow.focus();

}
}
Ouvrir()

//-->
</script>
</head>

<frameset framespacing="0" border="false" frameborder="0" cols="155,*">
  <frame name="gauche" scrolling="no" noresize target="haut_droite"
src="defaulta.htm"
  marginwidth="0" marginheight="5">
  <frameset rows="*,45">
<frame name="texte" target="bas_droite" src="defaultb.htm"
scrolling="auto"
marginwidth="0" marginheight="0" noresize>
<frame name="bas" src="basac.htm" scrolling="no" marginwidth="7"
marginheight="15"
noresize>
  </frameset>
  <noframes>
  <body>
  <p>Cette page utilise des cadres, mais votre navigateur ne les prend
pas en charge.</p>
  </body>
  </noframes>
</frameset>
</html>

