[
https://issues.apache.org/jira/browse/CONNECTORS-275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162233#comment-13162233
]
Mark Bennett commented on CONNECTORS-275:
-----------------------------------------
Hi Karl,
I went to take a look at what you had checked in, thanks. Looks like you added
content to the source XML doc at
http://svn.apache.org/repos/asf/incubator/lcf/trunk/site/src/documentation/content/xdocs/end-user-documentation.xml,
but when I go to look at it on
http://incubator.apache.org/connectors/end-user-documentation.html#webrepository
I'm not seeing the change.
I'm assuming there's some cron XML->HTML job that runs, I don't know much about
the logistics.
If I had suggested updates to the doc, I imagine I could attach a patch for the
XML file to this bug report. But is there some tool that you use to edit it
that'll let you switch back and forth between WYSIWYG/HTML/XML?
And I'm guessing the doc is in this format so that folks get a local copy? And
so that it's also version controlled under SVN, instead of having separate
versions in Confluence?
I've had a chat with some cohorts here and I think I might be able to help with
the doc.
Thanks again,
Mark
> Clarify documentation as to how to set up session login for web connector
> -------------------------------------------------------------------------
>
> Key: CONNECTORS-275
> URL: https://issues.apache.org/jira/browse/CONNECTORS-275
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Documentation, Web connector
> Affects Versions: ManifoldCF 0.4
> Reporter: Karl Wright
>
> A book reader has this comment, which basically implies that we need to
> improve the documentation for the web connector:
> "I was excited to get the full version of the online book, but then
> disappointed when it referred back to the online doc for setting up logins
> for a Web spidering. The online doc is very vague and only gives one example.
> I've used Ultraseek's and Google's spider, but I still find the Session login
> sequences non-obvious.
> I've got a subscription request into the user mailing list, but here's the
> parts that are not clear.
> I generally understand about using regexes to define sites and sorting out
> content pages from login pages.
> But it's not clear why there are TWO regexes per entry. There's a "Login URL"
> regex, and also a "Form name/link target" regex.
> It's also not clear about the "page type" radio button choices.
> For "redirection", am I saying "look for a redirect event", or am I saying
> "then DO a redirect to this page"?
> And for "form name", what if my login page doesn't have a named form? In the
> case of the site I'm trying to spider, when your session expires, you
> manually go back to an https page and supply your username and password as
> CGI parameters. I know this sounds odd, but it's apparently how a number of
> the sites we're trying to spider work, some proprietary software.
> Karl, I really think the book or Wiki or doc needs 3 or 4 different examples
> of login scenarios.
> Here's the scenario I'm trying, if you'd like to use it:
> Try to fetch: http://site.com/product?id=1234
> If you get a redirect to: http://site.com/Main.asp
> Note that there's no login form nor link on this page.
> Then invoke this login URL:
> https://site.com/validate?username=me&password=that&otherArg=something
> Note that you can't just visit this page and fill in a form; that gives an
> error. The credentials have to be passed in the URL (I think as a GET).
> Then record the session cookie and try for /product?id=1234 again.
> I realize this is odd, I didn't design it. "
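For reference, the scenario quoted above can be sketched as a plain HTTP client
flow. This is only an illustration in Python's standard library of what the
site seems to expect, not how the ManifoldCF web connector implements or
configures session login. The helper names (is_login_redirect, build_login_url,
fetch_with_session, make_opener) are made up for this sketch, and it assumes
that landing on /Main.asp after redirects is the only sign of a missing or
expired session.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Assumption: being redirected to this path means the session is missing or expired.
MAIN_PAGE_PATH = "/Main.asp"


def is_login_redirect(final_url):
    """Return True if the final URL (after redirects) is the main page."""
    return urllib.parse.urlsplit(final_url).path == MAIN_PAGE_PATH


def build_login_url(base, username, password, extra):
    """Build the GET login URL with the credentials as query parameters."""
    params = {"username": username, "password": password}
    params.update(extra)
    return base + "/validate?" + urllib.parse.urlencode(params)


def make_opener():
    """Create an opener that follows redirects and keeps cookies in memory."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def fetch_with_session(opener, content_url, login_url):
    """Fetch content_url, logging in via login_url if we get bounced to the main page."""
    with opener.open(content_url) as resp:
        if not is_login_redirect(resp.geturl()):
            return resp.read()

    # No valid session: hit the login URL so the opener can record the session cookie.
    with opener.open(login_url) as resp:
        resp.read()

    # Retry the original page, now with the session cookie attached.
    with opener.open(content_url) as resp:
        if is_login_redirect(resp.geturl()):
            raise RuntimeError("still redirected to the main page after login")
        return resp.read()
```

With the placeholder values from the description, the calls would look like
build_login_url("https://site.com", "me", "that", {"otherArg": "something"})
followed by fetch_with_session(make_opener(), "http://site.com/product?id=1234",
login_url). The open question for the documentation is how to express these
same three steps with the connector's "Login URL" regex, "Form name/link
target" regex, and page type settings.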