[ 
https://issues.apache.org/jira/browse/CONNECTORS-275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162233#comment-13162233
 ] 

Mark Bennett commented on CONNECTORS-275:
-----------------------------------------

Hi Karl,

I went to take a look at what you had checked in, thanks.  Looks like you added 
content to the source XML doc at 
http://svn.apache.org/repos/asf/incubator/lcf/trunk/site/src/documentation/content/xdocs/end-user-documentation.xml,
 but when I go to look at it on 
http://incubator.apache.org/connectors/end-user-documentation.html#webrepository
 I'm not seeing the change.

I'm assuming there's some cron XML->HTML job that runs, but I don't know much 
about the logistics.

If I had suggested updates to the doc, I imagine I could attach a patch for the 
XML file to this bug report.  But is there some tool that you use to edit it 
that'll let you switch back and forth between WYSIWYG/HTML/XML?

And I'm guessing the doc is in this format so that folks get a local copy?  And 
so that it's also version controlled under SVN, instead of having separate 
versions in Confluence?

I've had a chat with some cohorts here and I think I might be able to help with 
the doc.

Thanks again,
Mark 
                
> Clarify documentation as to how to set up session login for web connector
> -------------------------------------------------------------------------
>
>                 Key: CONNECTORS-275
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-275
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Documentation, Web connector
>    Affects Versions: ManifoldCF 0.4
>            Reporter: Karl Wright
>
> A book reader has this comment, which basically implies that we need to 
> improve the documentation for the web connector:
> "I was excited to get the full version of the online book, but then 
> disappointed when it referred back to the online doc for setting up logins 
> for a Web spidering. The online doc is very vague and only gives one example. 
> I've used Ultraseek's and Google's spider, but I still find the Session login 
> sequences non-obvious.
> I've got a subscription request into the user mailing list, but here's the 
> parts that are not clear.
> I generally understand about using regexes to define sites and sorting out 
> content pages from login pages.
> But it's not clear why there are TWO regexes per entry. There's a "Login URL" 
> regex, and also a "Form name/link target" regex.
> It's also not clear about the "page type" radio button choices.
> For "redirection", am I saying "look for a redirect event", or am I saying 
> "then DO a redirect to this page"?
> And for "form name", what if my login page doesn't have a named form? In the 
> case of the site I'm trying to spider, when your session expires, you 
> manually go back to an https page and supply your username and password as 
> CGI parameters. I know this sounds odd, but it's apparently how a number of 
> the sites we're trying to spider work, some proprietary software.
> Karl, I really think the book or Wiki or doc needs 3 or 4 different examples 
> of login scenarios.
> Here's the scenario I'm trying, if you'd like to use it:
> Try to fetch: http://site.com/product?id=1234
> If you get a redirect to: http://site.com/Main.asp
> Note that there's no login form nor link on this page.
> Then invoke this login URL: 
> https://site.com/validate?username=me&password=that&otherArg=something
> Note that you can't just visit this page and fill in a form; that gives an 
> error. The arguments have to be passed in on the URL (I think as a GET).
> Then record the session cookie and try for /product?id=1234 again.
> I realize this is odd, I didn't design it. "
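
For reference, the fetch / login / retry sequence in the quoted scenario can be 
sketched roughly as below. This is only an illustration of the intended control 
flow, not ManifoldCF connector code or configuration. The names 
`is_login_redirect`, `fetch_with_session_login`, and the caller-supplied 
`fetch(url, cookies)` function are hypothetical, and the URLs are the 
placeholder ones from the scenario.

```python
from urllib.parse import urlsplit

# Placeholder values copied from the scenario; none of this is ManifoldCF API.
TARGET_URL = "http://site.com/product?id=1234"
LOGIN_REDIRECT_PATH = "/Main.asp"
LOGIN_URL = "https://site.com/validate?username=me&password=that&otherArg=something"

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def is_login_redirect(status, location):
    """Return True when a response redirects to the session-expired page."""
    if status not in REDIRECT_STATUSES or location is None:
        return False
    return urlsplit(location).path == LOGIN_REDIRECT_PATH


def fetch_with_session_login(fetch, target_url=TARGET_URL, login_url=LOGIN_URL):
    """Run the fetch / login / retry sequence and return (status, body).

    `fetch(url, cookies)` is an assumption of this sketch: a function that
    performs one GET without following redirects and returns the tuple
    (status, location, new_cookies, body).
    """
    cookies = {}

    # Step 1: try the content page.
    status, location, new_cookies, body = fetch(target_url, cookies)
    cookies.update(new_cookies)
    if not is_login_redirect(status, location):
        return status, body  # session is still valid

    # Step 2: invoke the login URL directly as a GET; there is no form to fill in.
    _, _, new_cookies, _ = fetch(login_url, cookies)

    # Step 3: record the session cookie set by the login response.
    cookies.update(new_cookies)

    # Step 4: retry the original page with the session cookie attached.
    status, location, new_cookies, body = fetch(target_url, cookies)
    if is_login_redirect(status, location):
        raise RuntimeError("login did not establish a session")
    return status, body
```

The point of the sketch is that the "login page" here is identified by the 
redirect target alone, and the login step is a plain GET rather than a form 
submission, which is the case the current documentation does not cover.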

