RE: htdig: SSL search engine

1998-11-22 Thread Geoff Hutchison

At 9:06 AM -0500 11/19/98, Tiziana Manfroni wrote:
I have a problem with SSL.
The default port of SSL is 443 (not 80). There are no documents in the
database.

Well, since ht://Dig doesn't support SSL at all, it doesn't know that
https URLs should go through port 443.
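
For reference, the scheme-to-port convention at issue can be sketched in a few lines of Python. This only illustrates the convention (http defaults to 80, https to 443); it is not code from ht://Dig, and the names `DEFAULT_PORTS` and `effective_port` are made up for the example.

```python
from urllib.parse import urlsplit

# Well-known default ports; illustration only, not taken from ht://Dig.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a client should connect to for the given URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        # An explicit port in the URL always wins.
        return parts.port
    try:
        return DEFAULT_PORTS[parts.scheme]
    except KeyError:
        raise ValueError("no default port known for scheme %r" % parts.scheme)


if __name__ == "__main__":
    print(effective_port("https://www.mat.uniroma3.it/"))  # 443
    print(effective_port("http://www.mat.uniroma3.it/"))   # 80
```

For the start_url in this thread the sketch yields 443, whereas the trace below shows htdig picking port 80 for the same host.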

I'm not quite sure why the document was indexed and then removed during
htmerge--it seemed to do the "htdig" phase correctly (assuming you don't
have any links off of Welcome.html).


-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/


--
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the body of the message.



RE: htdig: SSL search engine

1998-11-19 Thread Tiziana Manfroni

I have a problem with SSL.

This is the output of "htdig -vvv" :
---
  New server: www.mat.uniroma3.it, 80 
  Retrieval command for http://www.mat.uniroma3.it/robots.txt: GET
  /robots.txt HTTP/1.0
  User-Agent: htdig/3.1.0b2 ([EMAIL PROTECTED])
  Host: www.mat.uniroma3.it

  Header line: HTTP/1.1 404 File Not Found
  Header line: Date: Thu, 19 Nov 1998 14:07:42 GMT
  Header line: Server: Apache/1.2.5 Ben-SSL/1.13
  Header line: Connection: close
  Header line: Content-Type: text/html
  Header line: 
  returnStatus = 1
  pick: www.mat.uniroma3.it:80, # servers = 1
  0:0:-1:https://www.mat.uniroma3.it/: Trying local file
  /www/private/Welcome.html
   not changed
  pick: www.mat.uniroma3.it:80, # servers = 1
---

With the following in the htdig.conf:

  local_urls: https://www.mat.uniroma3.it/=/www/private/
  local_default_doc: Welcome.html
  start_url:  https://www.mat.uniroma3.it/

but when I run the rundig script:
---
  htdig: Run complete
  htdig: 1 server seen:
  htdig: www.mat.uniroma3.it:80 1 document
  htmerge: Total word count: 35
  htmerge: Total documents: 0
  htmerge: Total doc db size (in K): 0
---
The default port of SSL is 443 (not 80). There are no documents in the
database.
How can I resolve this problem?
Thanks, Tiziana




RE: htdig: SSL search engine

1998-11-16 Thread Oliver Smith

Here is the output of attempting to dig via the filesystem with either
an SSL server or no web server running:
---
 ./htdig -vvv

New server: volans.uk.internal, 80
Unable to build connection with volans.uk.internal:80
pick: volans.uk.internal:80, # servers = 1 
---

With the following in the conf file:
---
local_urls: https://volans.uk.internal/=/home/www/docs/
local_default_doc: index.shtml
start_url: https://volans.uk.internal/ 
---
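
As a rough illustration of what these attributes express, here is a Python sketch of translating a URL to a local path using a prefix=directory mapping and a default document. It is an assumption-laden model, not ht://Dig's actual code: the function name `map_local_url` and its arguments are hypothetical, and the real indexer may treat edge cases differently.

```python
import posixpath


def map_local_url(url, local_urls, default_doc):
    """Translate a URL to a local file path, or return None if no prefix matches.

    local_urls is a list of (url_prefix, directory) pairs, mirroring the
    "prefix=directory" form of the local_urls attribute shown above.
    """
    for prefix, directory in local_urls:
        if url.startswith(prefix):
            rest = url[len(prefix):]
            # A URL that ends in "/" (or equals the prefix) names a directory,
            # so fall back to the default document.
            if rest == "" or rest.endswith("/"):
                rest += default_doc
            return posixpath.join(directory, rest)
    return None


if __name__ == "__main__":
    mapping = [("https://volans.uk.internal/", "/home/www/docs/")]
    print(map_local_url("https://volans.uk.internal/", mapping, "index.shtml"))
```

In this model a URL outside every prefix returns None, which is the case where an indexer would have to fall back to the network.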

It does appear to work fine, however, when there is an HTTP server
running that it is able to connect to.

I assume the reason broken links should not be ignored is that there is
no guarantee that a link is broken just because the file cannot be
found via the filesystem. I think a configuration option that could stop
htdig from using HTTP at all would be very useful in our case, though.
Any other suggestions would be gratefully received.

thanks, regards 
Oliver Smith


 -Original Message-
 From: Geoff Hutchison [SMTP:[EMAIL PROTECTED]]
 Sent: 15 November 1998 01:39
 To:   Oliver Smith
 Cc:   Brian K. Justice; [EMAIL PROTECTED]
 Subject:  RE: htdig: SSL search engine
 
 At 12:24 PM -0500 11/13/98, Oliver Smith wrote:
  We also have a site that is going to use SSL.
 
  How would one go about using the local-filesystem indexing
 option you mention below, considering that ht://dig still needs to
 establish an HTTP connection even when indexing via the filesystem?
 
 OK. I could be wrong, but I think the last time this was brought up, I
 asked for a trace from htdig -vvv to see why it's hitting HTTP unless
 it can't find a file.
 
 Now if you want to know why it can't ignore broken links when doing
 filesystem digging, that's a different matter (I'll add a config
 option if this is an issue). But it shouldn't need to hit HTTP.
 
 
 -Geoff Hutchison
 Williams Students Online
 http://wso.williams.edu/
 



Re: htdig: SSL search engine

1998-11-16 Thread rfi from Rich Roth

On Wed, Nov 11, 1998 at 01:27:27PM -0500, Brian K. Justice wrote:

 PS. These sites can't be non-SSL long enough to just index, we're 
 talking about .mil sites, and these folks are *very* headstrong
 about this :-(

I know the type, grew up in an aerospace family.

Anyway, I'm sure you can get an IP that is protected by the firewall and
assign it to a web server on port XXX (not 80) that has an 'allow from
MACHINES-own-IP' protection directive (that is the Apache directive).

-- 
Later ...

Rich Roth --- On-the-Net

Direct:  Box 927, Northampton, MA 01061, Voice: 413-586-9668

Email: [EMAIL PROTECTED] Url: http://www.on-the-net.com
   ~~~   www.i-depth.com lets you Add Instant Depth to your Website~~~
~~~  Adding depths to Web presences and Internet providers  ~




RE: htdig: SSL search engine

1998-11-14 Thread Geoff Hutchison

At 12:24 PM -0500 11/13/98, Oliver Smith wrote:
   We also have a site that is going to use SSL.

   How would one go about using the local-filesystem indexing
option you mention below, considering that ht://dig still needs to
establish an HTTP connection even when indexing via the filesystem?

OK. I could be wrong, but I think the last time this was brought up, I
asked for a trace from htdig -vvv to see why it's hitting HTTP unless it
can't find a file.

Now if you want to know why it can't ignore broken links when doing
filesystem digging, that's a different matter (I'll add a config option if
this is an issue). But it shouldn't need to hit HTTP.


-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/





RE: htdig: SSL search engine

1998-11-13 Thread Oliver Smith

We also have a site that is going to use SSL.

How would one go about using the local-filesystem indexing
option you mention below, considering that ht://dig still needs to
establish an HTTP connection even when indexing via the filesystem?

--Oliver R Smith


 At 1:27 PM -0500 11/11/98, Brian K. Justice wrote:
 
 We've used ht://dig to index a bunch of sites, but these sites
 are now all going SSL, which presents the obvious problems. Does
 anyone know of a search engine that will go through SSL? Or, better
 yet, how feasible is it to add this functionality to ht://dig? I'd
 like to
 
 There are two ways to do this. One is through local-filesystem
 indexing, which they should love--theoretically no network.
 
 The other is more complicated. Due to US laws, I can't put actual SSL
 code into the indexer. Someone could put in hooks to SSL code... Then
 if you want to add SSL you can go get the SSL library legally (but
 separately from ht://Dig) and it would work.
 
 
 -Geoff Hutchison
 Williams Students Online
 http://wso.williams.edu/
 
 



htdig: SSL search engine

1998-11-11 Thread Brian K. Justice

All,

I caught the discussion that went on here a few months ago
about ht://dig and SSL, but I'm going to bring it up again anyway.
We've used ht://dig to index a bunch of sites, but these sites
are now all going SSL, which presents the obvious problems. Does anyone
know of a search engine that will go through SSL? Or, better yet, how
feasible is it to add this functionality to ht://dig? I'd like to 
think we (we = company I work for) could potentially do this, although
I may be dreaming. In any event, all info is appreciated.

Brian

PS. These sites can't be non-SSL long enough to just index; we're
talking about .mil sites, and these folks are *very* headstrong
about this :-(


Brian K. Justice                Software Engineer
Raytheon Systems Company        email: [EMAIL PROTECTED]
7700 Arlington Blvd., M/S N102  phone: (703) 560-5000 x 2840
Falls Church, VA 22046-1572     BGPHES lab: (703) 560-5000 x 4395

#include raytheon/policy/95-1002-110





Re: htdig: SSL search engine

1998-11-11 Thread Geoff Hutchison

At 1:27 PM -0500 11/11/98, Brian K. Justice wrote:

We've used ht://dig to index a bunch of sites, but these sites
are now all going SSL, which presents the obvious problems. Does anyone
know of a search engine that will go through SSL? Or, better yet, how
feasible is it to add this functionality to ht://dig? I'd like to

There are two ways to do this. One is through local-filesystem indexing,
which they should love--theoretically no network.

The other is more complicated. Due to US laws, I can't put actual SSL code
into the indexer. Someone could put in hooks to SSL code... Then if you
want to add SSL you can go get the SSL library legally (but separately from
ht://Dig) and it would work.
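
To make the "hooks" idea concrete, here is a minimal Python sketch of one way such a design could look: the indexer ships only a registry keyed by URL scheme, and a separately obtained module installs the https transport. Everything here (`register_transport`, `fetch`, the registry itself) is hypothetical and is not ht://Dig's design or API.

```python
from urllib.parse import urlsplit

# Registry of scheme -> fetch function. All names here are hypothetical.
_transports = {}


def register_transport(scheme, fetcher):
    """Install a fetch function for a URL scheme (the "hook")."""
    _transports[scheme.lower()] = fetcher


def fetch(url):
    """Fetch a URL through whichever transport is registered for its scheme."""
    scheme = urlsplit(url).scheme
    try:
        fetcher = _transports[scheme]
    except KeyError:
        raise LookupError("no transport registered for %r" % scheme)
    return fetcher(url)


if __name__ == "__main__":
    # A stand-in for the separately distributed SSL code.
    register_transport("https", lambda url: b"fetched " + url.encode("ascii"))
    print(fetch("https://wso.williams.edu/"))
```

The point of the split is that the core contains no SSL code at all; without a registered https transport, `fetch` simply reports that the scheme is unsupported.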


-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

