Hi Karl,

 

I noticed that the cookies used by the web connector are stored both in
memory and in the cookiedata table of the manifold database. The cookiedata
table still keeps a connector's cookies even after that connector is removed
from the MCF admin UI. This can lead to problematic behavior. Let me explain:

 

I have configured a login sequence to cover the following use case:

URL = test.com
Step1 = form a (set cookie)
Step2 = redirect b (set cookie)
Step3 = redirect c (set cookie)

 

All 3 cookies are required to crawl the target website, but I made a
mistake in the configuration and the login sequence was interrupted at
Step2. So the connector retrieved 2 cookies and then ended up in an infinite
loop. I corrected the configuration, but when I restarted the job it still
did not work. By checking the logs I noticed that the job was sending the 2
previously retrieved cookies at Step1. With those cookies present, the form
behaves differently: it does not redirect to 'b' (Step2) but returns a
200 OK response, which prematurely ends the login sequence. As a
consequence, the third required cookie was never retrieved.
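
To make the failure mode concrete, here is a minimal toy model of it. It is not ManifoldCF code: the functions fetch and run_login and the cookie names a, b and c are hypothetical stand-ins, and the server behavior is only what I inferred from the logs.

```python
# Toy model of the login sequence. Nothing here is ManifoldCF code;
# fetch() and run_login() are hypothetical helpers for illustration.

def fetch(page, cookies):
    """Return (status, redirect_target, cookie_set_by_page) for a toy server."""
    if page == "a":
        if cookies:
            # Assumed behavior: with cookies already present, the form
            # answers 200 OK and does not redirect to 'b'.
            return 200, None, "a"
        return 302, "b", "a"
    if page == "b":
        return 302, "c", "b"
    return 200, None, "c"


def run_login(initial_cookies):
    """Follow redirects from page 'a' and return the cookies collected."""
    cookies = set(initial_cookies)
    page = "a"
    while page is not None:
        status, target, cookie = fetch(page, cookies)
        cookies.add(cookie)
        # A non-redirect response ends the login sequence.
        page = target if status == 302 else None
    return cookies


fresh = run_login(set())        # empty cookie cache: all 3 steps run
stale = run_login({"a", "b"})   # leftover cookies: sequence stops at Step1
```

Starting from an empty set, the sequence collects all three cookies. Starting from the two leftover cookies, it stops at Step1 and never reaches 'c'.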

The solution was simple: I needed to remove the cookies so that the job
restarts with an empty cookie cache for the website. That worked, but
to do it I had to:

1. remove the cookies from the cookiedata table
2. restart the MCF agent so that its in-memory cache was emptied

 

Without those manual steps, the job kept using the stale cookies (even
deleting and then recreating the job and the connector did not help).
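
The sketch below shows why both steps were needed. It assumes a two-tier design where reads hit the in-memory cache first and fall back to the table. The CookieStore class, its methods and the per-site key are all hypothetical and do not reflect the actual MCF implementation or the cookiedata schema.

```python
class CookieStore:
    """Hypothetical two-tier store: persistent table plus in-memory cache."""

    def __init__(self):
        self.table = {}  # stands in for the cookiedata table
        self.cache = {}  # stands in for the agent's in-memory cache

    def save(self, key, cookies):
        # Write to both tiers.
        self.table[key] = set(cookies)
        self.cache[key] = set(cookies)

    def load(self, key):
        # Assumed read order: cache first, then the table.
        if key not in self.cache:
            self.cache[key] = set(self.table.get(key, ()))
        return self.cache[key]

    def clear(self, key):
        # A "remove cookies" action has to purge both tiers.
        self.table.pop(key, None)
        self.cache.pop(key, None)


store = CookieStore()
store.save("test.com", {"a", "b"})

# Deleting only the table rows leaves the cached copy in use.
store.table.pop("test.com")
still_cached = store.load("test.com")

# Clearing both tiers gives the job an empty cookie set again.
store.clear("test.com")
after_clear = store.load("test.com")
```

In this model, clearing only the cache would not help either, since the next read would reload the cookies from the table.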

 

Would it be possible to add a button to the connector's view that removes
the cookies from both the cookiedata table and the in-memory cache, in
order to avoid these manual steps?

 

Julien 

 
