Hi Karl,
Thank you for the very prompt feedback!

> 1) Have you made sure to include the redirection back to the content?
This is the step I don't quite understand. Could you please clarify
how that should be done? I thought that once the auth sequence is done
(login mode is exited), the redirect to the original page happens
automatically. That is what happens here, but somehow the content is
still "public".

> 2) your check for *entering* the login sequence is too broad and fires again 
> even though the private sitemap page is being returned.
Totally agree. That's why the first step looks into the content of the
page to check whether it contains a pattern that appears ONLY in the
public version.
This is the only solution I can think of so far, but any ideas are very welcome!
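
To illustrate the kind of check meant here, a minimal Python sketch follows. It is
not MCF code. The marker string, the two sample pages and the function name
`looks_public` are all made up for illustration; the real pattern would be
whatever text appears only in the logged-out sitemap.

```python
import re

# Hypothetical marker assumed to appear only in the logged-out ("public") sitemap.
PUBLIC_ONLY_PATTERN = re.compile(r"<!--\s*anonymous\s*-->")


def looks_public(body: str) -> bool:
    """Return True if the page body contains the public-only marker."""
    return PUBLIC_ONLY_PATTERN.search(body) is not None


# Made-up sample bodies standing in for the two versions of sitemap.xml.
public_page = (
    "<urlset><!-- anonymous -->"
    "<url><loc>http://wikisite/Main</loc></url></urlset>"
)
private_page = "<urlset><url><loc>http://wikisite/Internal</loc></url></urlset>"

print(looks_public(public_page))
print(looks_public(private_page))
```

If the pattern also matched the authenticated page, the check would fire again
after a successful login, which is the "too broad" case from point 2.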

The simple history shows basically the same thing: the process never leaves
the login stage.

If I remove the 3rd step, I can see that the login stage ends
(logon end), but since the content of sitemap.xml is still "public",
the login process kicks in again.
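
A side observation from the step-3 log quoted further down: the PHPSESSID sent
with the sitemap.xml request differs from the one retrieved afterwards, while
authtoken stays the same. Whether that is related to the problem is only a
guess. The Python sketch below shows one way to spot such changes in log text.
It is not part of MCF; the function names `cookie_values` and `changed_cookies`
are hypothetical, and the sample strings are the cookie entries from the log
with the line wrapping removed and the trailing fields trimmed.

```python
import re

# Matches the "[name: X][value: Y]" part of a cookie entry as printed in the log.
COOKIE_RE = re.compile(r"\[name: ([^\]]+)\]\[value:\s*([^\]]+)\]")


def cookie_values(log_text: str) -> dict:
    """Map cookie name to value for every cookie entry found in the text."""
    return {name: value.strip() for name, value in COOKIE_RE.findall(log_text)}


def changed_cookies(sent: dict, received: dict) -> list:
    """Names of cookies present on both sides whose values differ."""
    return sorted(
        name for name in sent if name in received and sent[name] != received[name]
    )


# Cookies added to the request ("Adding 2 cookies for '/sitemap.xml'").
sent_log = (
    "[version: 0][name: PHPSESSID][value: 1vnhgi0f84dc9pi6eaoj0nau45][domain: wikisite]\n"
    "[version: 0][name: authtoken]"
    "[value: 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain: wikisite]\n"
)
# Cookies read back after the response ("Retrieving cookies...").
received_log = (
    "[version: 0][name: PHPSESSID][value: vqfpr88pqa6d62nl6h4lp03nu1][domain: wikisite]\n"
    "[version: 0][name: authtoken]"
    "[value: 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain: wikisite]\n"
)

print(changed_cookies(cookie_values(sent_log), cookie_values(received_log)))
```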

Thanks!
Konstantin

2016-07-07 11:07 GMT+02:00 Karl Wright <[email protected]>:
> Hi Konstantin,
>
> There are two possibilities:
>
> (1) You have missed one stage when specifying the login sequence.  The
> cookies are getting set, but not during a step that's part of the login
> sequence.  Have you made sure to include the redirection back to the
> content?
> (2) You really are logging in but your check for *entering* the login
> sequence is too broad and fires again even though the private sitemap page
> is being returned.
>
> You can also look at the simple history to get an idea of what MCF is
> doing for your job for session handling.
>
> Thanks,
> Karl
>
>
> On Thu, Jul 7, 2016 at 4:35 AM, jetnet <[email protected]> wrote:
>>
>> Hi All,
>>
>> I've been trying to set up a session-based auth sequence for a forked
>> MediaWiki site (the Wiki connector does not work with this version), but
>> somehow got stuck with the configuration.
>> The idea is to index the site using its sitemap.xml with hops=1. The
>> "public" version (user not logged in) of the sitemap.xml contains a
>> different set of links than the "authenticated" one (user logged in).
>> The current auth sequence looks like this (the job's seeding
>> URL=http://wikisite/sitemap.xml):
>>
>> 1) the first call to the seeding URL should be redirected to the login
>> page
>> Login URL regexp: sitemap.xml
>> Page type: content
>> Identification regular expression: <some content from the "public"
>> version>
>> Override target URL: /Special:UserLogin
>>
>> 2) enter user's credentials on the login page
>> Login URL regexp: Special:UserLogin
>> Page type: form
>> Override form parameters: username=someuser, password=******,
>> returntourl=http://wikisite/sitemap.xml
>>
>> 3) the login page ***should*** redirect back to the seeding URL with
>> the authorized content
>> Login URL regexp: /Special:UserLogin
>> Page type: redirection
>> Identification regular expression: /sitemap.xml
>>
>> From the log file I can see that the first 2 steps work fine: the public
>> content gets recognized, the form data gets sent, and the session cookies
>> get set. But the 3rd step returns the "public" version of the
>> sitemap.xml again, and the login process gets stuck in a loop.
>> Am I on the right track, or did I miss something?
>>
>> Here is the log for the 3rd step:
>>
>>  INFO 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: FETCH
>> LOGIN|http://wikisite/Special:UserLogin|1467838347082+203|302|153|
>> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Tried to
>> match raw url 'http://wikisite/sitemap.xml'
>> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Tried to
>> match cooked url 'http://wikisite/sitemap.xml'
>> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Redirection
>> link lookup matched 'http://wikisite/sitemap.xml'
>> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Document
>> 'http://wikisite/Special:UserLogin' matches preferred redirection, so
>> determined to be login page for sequence 'wikisite'
>> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Waiting for
>> an HttpClient object
>> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: For
>> http://wikisite/sitemap.xml, setting virtual host to wikisite
>> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Got an
>> HttpClient object after 0 ms.
>> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Get method
>> for '/sitemap.xml'
>> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Adding 2
>> cookies for '/sitemap.xml'
>> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB:  Cookie
>> '[version: 0][name: PHPSESSID][value:
>> 1vnhgi0f84dc9pi6eaoj0nau45][domain: wikisite][path: /][expiry: null]'
>> added
>> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB:  Cookie
>> '[version: 0][name: authtoken][value:
>> 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain:
>> wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]' added
>> DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB: Retrieving
>> cookies...
>> DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB:   Cookie
>> '[version: 0][name: PHPSESSID][value:
>> vqfpr88pqa6d62nl6h4lp03nu1][domain: wikisite][path: /][expiry: null]'
>> DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB:   Cookie
>> '[version: 0][name: authtoken][value:
>> 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain:
>> wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]'
>>  INFO 2016-07-06 22:52:37,004 (Worker thread '43') - WEB: FETCH
>> LOGIN|http://wikisite/sitemap.xml|1467838347394+9610|200|683773|
>> DEBUG 2016-07-06 22:52:37,004 (Worker thread '43') - WEB: Document
>> 'http://wikisite/sitemap.xml' is text, with encoding 'utf-8'; link
>> extraction starting
>> DEBUG 2016-07-06 22:52:37,019 (Worker thread '43') - WEB: Document
>> 'http://wikisite/sitemap.xml' matches content, so determined to be
>> login page for sequence 'wikisite'
>>
>>
>> Thank you!
>> regards, Konstantin
>
>
