I added the cookies_input_file and disable_cookies elements to the config file for this site as follows:
# enable cookies support
disable_cookies: false
cookies_input_file: /apps/www/htdig/conf/cookies.txt
I then added a single line to the cookies.txt file as follows:
testintranetportal TRUE /portal FALSE 0 epicentric d97692f64e7b540b0f504e238790307a
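For reference, a minimal sketch of how that line breaks down, assuming the file is expected in Netscape's cookies.txt layout (seven tab-separated fields per line). The helper names below are hypothetical and are not part of ht://Dig:

```python
# Field order of one Netscape-style cookies.txt line (fields are tab-separated).
FIELDS = ("domain", "include_subdomains", "path", "secure",
          "expires", "name", "value")


def parse_cookie_line(line):
    """Split one cookies.txt line into a dict; return None for comments/blank lines."""
    line = line.rstrip("\r\n")
    if not line.strip() or line.lstrip().startswith("#"):
        return None
    parts = line.split("\t")
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} tab-separated fields, got {len(parts)}"
        )
    return dict(zip(FIELDS, parts))


def format_cookie_line(cookie):
    """Join the seven fields back into a single tab-separated line."""
    return "\t".join(cookie[field] for field in FIELDS)


# The cookie from this message, written with explicit tabs.
line = "\t".join([
    "testintranetportal", "TRUE", "/portal", "FALSE", "0",
    "epicentric", "d97692f64e7b540b0f504e238790307a",
])
cookie = parse_cookie_line(line)
print(cookie["name"], cookie["domain"], cookie["path"])
```

This is only a sanity check on the file layout: a line whose fields are separated by spaces instead of tabs would be rejected by the sketch, so it is worth confirming that the real file uses tabs.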
I got this info after clearing my cookies and then logging in to the site. There is also a session id cookie (JSESSIONID) which gets set, but it is only persistent for the session. The "epicentric" cookie seems to be the one which contains persistent login information and the value in the line above is what was stored in my browser for my login. However, this still does not get me past the login redirect.... Here's some info from the htdig verbose output:
rundig: Start time: Wed Jan 26 14:13:43 EST 2005
ht://dig Start Time: Wed Jan 26 14:13:43 2005
Importing Cookies input file /apps/www/htdig/conf/cookies.txt
Cookies that have been correctly imported from: /apps/www/htdig/conf/cookies.txt
1. epicentric: d97692f64e7b540b0f504e238790307a (Domain: testintranetportal)
......This tells me the htdig cookies file is being read correctly.......
Try to get through to host testintranetportal (port 80)
2 - Open of the connection ok
Assigning the server (testintranetportal) to the TCP connection
Assigned the remote host testintranetportal
Assigning the port (80) to the TCP connection
Assigned the port 80
Connecting via TCP to (testintranetportal:80)
New connection open successfully
Header line: HTTP/1.1 302 Found
Header line: Server: Microsoft-IIS/5.0
Header line: Date: Wed, 26 Jan 2005 19:09:14 GMT
Header line: X-Powered-By: ASP.NET
Discarded header line: X-Powered-By: ASP.NET
Header line: Connection: close
Header line: Server: WebSphere Application Server/5.0
Header line: Set-Cookie: JSESSIONID=00003WHZUFZ25GHUYKKT0E5Y5LI:-1;Path=/
........This tells me that a TCP connection can be made to the server...and in fact the server sets JSESSIONID.......
Retrieving document /portal/site/inside-test/index.jsp on host: testintranetportal:80
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 302
Reason : Found
Access Time : Wed, 26 Jan 2005 19:09:14 EST
Modification Time : Wed, 26 Jan 2005 19:13:44 EST
Content-type : text/html; charset=UTF-8
Content-Language : en-US
Connection : close
Persistent connection: not accepted
Body not retrieved
2 - Connection closed (No persistent connection)
Request time: 0 secs
Contents:
Content Type: text/html; charset=UTF-8
Content Length: -1
Modification Time: 2005-01-26 19:13:44 EST
redirect
redirect: http://testintranetportal/portal/site/inside-test/index.jsp?epi-content=LOGIN
resolving 'http://testintranetportal/portal/site/inside-test/index.jsp?epi-content=LOGIN'
pushing http://testintranetportal/portal/site/inside-test/index.jsp?epi-content=LOGIN
......This tells me that htdig is trying to retrieve the correct document (index.jsp)....I'm not sure what the deal is with the persistent connection error??.......
......Then you can see that there is a redirect to the LOGIN page......
Thereafter there are quite a few lines of similar content since each URL I'm trying to dig gets redirected in the same manner as above. At the end, the login page is actually read and indexed....but that's the only page.
Perhaps this info provides some detail that might be helpful in further diagnosing the problem. I still haven't heard back from my colleague who is supposed to be contacting Vignette. Is there any way to tell where/how/if htdig is attempting to set the cookie or pass it to the host/server? I didn't see anything in the log file about that.
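One way to check, independently of htdig, whether the "epicentric" cookie alone is enough to get past the login redirect is to send a single request by hand and look at the status and Location header. The following is a rough Python sketch under that assumption; the function names and the User-Agent string are made up for illustration, and the "epi-content=LOGIN" test is taken from the redirect URL in the log above:

```python
import http.client


def build_headers(cookies, user_agent="cookie-check/0.1"):
    """Build request headers that carry the given cookies in one Cookie header."""
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return {"Cookie": cookie_header, "User-Agent": user_agent}


def is_login_redirect(status, location):
    """True if the response is a redirect whose target looks like the login page."""
    return status in (301, 302, 303, 307) and "epi-content=LOGIN" in (location or "")


def check(host, path, cookies, port=80):
    """Send one GET without following redirects; return (status, Location header)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path, headers=build_headers(cookies))
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Example call (needs network access to the intranet host, so it is left commented out):
# status, location = check(
#     "testintranetportal",
#     "/portal/site/inside-test/index.jsp",
#     {"epicentric": "d97692f64e7b540b0f504e238790307a"},
# )
# print(status, location, is_login_redirect(status, location))
```

If this request is also redirected to the LOGIN page, the problem is probably not in how htdig sends the cookie; the server may want more than this one cookie (for example the JSESSIONID as well, or a particular User-Agent).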
Thanks for your help.
Bruce
Neal Richter <[EMAIL PROTECTED]>
01/24/2005 08:51 PM
On Mon, 24 Jan 2005, Bruce DeYoung wrote:
> OK. Here's the URL at the login page:
>
> http://testintranetportal/portal/site/insideQAD/index.jsp?epi-content=LOGIN
>
> Then, after logging in, here are a couple of URL's of content pages:
>
> http://testintranetportal/portal/site/insideQAD/index.jsp?front_door=true&epi_menuItemID=17b4d03e0ebb0d03c0bc8ed22890307a&epi_menuID=557c013f162725a5c2046e478790307a&epi_baseMenuID=557c013f162725a5c2046e478790307a
>
> and
>
> http://testintranetportal/portal/site/insideQAD/index.jsp?front_door=true&epi_menuItemID=8853e4e036d9d40ecfd048922890307a&epi_menuID=b65bac56c452abf6aeda32202890307a&epi_baseMenuID=557c013f162725a5c2046e478790307a
Ha ha, this is almost as opaque as it gets. I've solved issues like yours
before via the cookies file and by rewriting the URL, but those URLs are
not very informative.
For those URLs you need information on how they tell the CGI what to do
and 'is there a sessionid buried in there'?
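As a first look at what is in those URLs, here is a small Python sketch that splits the two content-page URLs quoted above into their query parameters and reports which ones are identical and which differ. It only uses the URLs from this thread; the helper names are hypothetical:

```python
from urllib.parse import urlsplit, parse_qs

URL_A = ("http://testintranetportal/portal/site/insideQAD/index.jsp"
         "?front_door=true"
         "&epi_menuItemID=17b4d03e0ebb0d03c0bc8ed22890307a"
         "&epi_menuID=557c013f162725a5c2046e478790307a"
         "&epi_baseMenuID=557c013f162725a5c2046e478790307a")
URL_B = ("http://testintranetportal/portal/site/insideQAD/index.jsp"
         "?front_door=true"
         "&epi_menuItemID=8853e4e036d9d40ecfd048922890307a"
         "&epi_menuID=b65bac56c452abf6aeda32202890307a"
         "&epi_baseMenuID=557c013f162725a5c2046e478790307a")


def query_params(url):
    """Return the query string as a dict of parameter name -> first value."""
    return {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}


def compare(url_a, url_b):
    """Return (names with equal values, names whose values differ or are missing)."""
    pa, pb = query_params(url_a), query_params(url_b)
    same = sorted(k for k in pa if pb.get(k) == pa[k])
    differ = sorted(k for k in pa.keys() | pb.keys() if pa.get(k) != pb.get(k))
    return same, differ


same, differ = compare(URL_A, URL_B)
print("same:  ", same)
print("differ:", differ)
```

This comparison alone cannot say whether any value is a session id. Repeating it on the same page's URL captured after a second, separate login would show whether any parameter changes from one login to the next.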
Can you get this from Vignette or the people that connected Vignette
to whatever CGI/ASP/JSP software that produces the website?
The main question you want to answer is this:
Do I need to do anything to those URLs so that after a user clicks on a
search result they are able to view that page without screwing up my
reporting?
A simple test would be to log into the site with one browser and copy a
link URL. Then open a second, different browser (not just a second IE
window), such as Firefox, with all cookies cleared, and paste the URL
into it. What happens?
Will the search box be 'behind' the login screen? I.e., will the users
already be logged in before they do their first search?
Anyway, things to think about.
Thanks
> Thanks again,
>
> Bruce
>
>
>
>
> Neal Richter <[EMAIL PROTECTED]>
> 01/24/2005 12:59 PM
>
> To
> Bruce DeYoung <[EMAIL PROTECTED]>
> cc
> [EMAIL PROTECTED]
> Subject
> Re: [htdig-dev] htDig and Vignette??
>
>
>
>
>
>
> On Sun, 23 Jan 2005, Bruce DeYoung wrote:
>
>> Thanks Neal for the reply. Unfortunately, I cannot provide a link to
> the
>> site since it is an intranet site only....at this time.
>
> Post it anyway so I can take a look at its structure. Post the login
> URL then the first URL you see after a successful login.
>
>> My suspicion about this is that Vignette security is handled differently
>> than, say, standard Apache security. Using the -u option with htdig
> and
>> supplying an authenticated user for our Apache-based sites works fine.
> I'm
>> not sure how Vignette authentication works, but I do know that when you
>> attempt to access the site, if your login cookie is not set, it will
>> redirect to a login page and request authentication information.
>
> Open your cookies file in the browser and clear anything associated
> with
> this website, then log back in to the website and check the cookies.
>
>> I've asked our Vignette developer to request some assistance from
> Vignette
>> support as well.
>>
>> When you say "make sure cookie support is enabled", are you referring to
>> something in Vignette or in htDig?
>
> I assume you are using HtDig 3.2B6
>
> Look at the cookies_input_file & disable_cookies settings in HtDig.
> The disable_cookies is 'true' by default.
>
> My gut feeling is that it's setting a cookie. You can take the
> contents of the cookie that the browser stores and load it in to the
> HtDig indexer via the cookies_input_file.
>
> It may also be that the software checks the 'user_agent' string
> supplied
> by the browser/indexer and may disallow access if you aren't running a
> certain version of browser.
>
> You can fake this by setting the user_agent in HtDig to be the string
> supplied by IE. Get it from your apache server weblogs.
>
> I've seen both of these problems and worked around them this way.
>
>> And, I understand what you're saying about using the rewrite rules...and
> I
>> think you're right about that one. So, once I'm able to dig the site,
> I
>> will look at the URL references and create a url_rewrite rule to remove
>> the session information.
>
> Thanks.
>
>
--
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
