Hi, the Python equivalent of hex_sha1 is most likely:

    import hashlib

    def hex_sha1(value):
        # value must be a byte string; the digest comes back as hex text
        return hashlib.sha1(value).hexdigest()

and .php_to8bit() is probably equivalent to .encode("utf8").
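
To show how the nested call in hashLoginPassword might translate, here is a minimal sketch. The function name smf_hash_password is made up for illustration, and it assumes that php_strtolower() behaves like Python's str.lower() (likely true for ASCII user names, untested for anything else).

```python
import hashlib


def hex_sha1(value: bytes) -> str:
    """Return the SHA-1 digest of ``value`` as a lowercase hex string."""
    return hashlib.sha1(value).hexdigest()


def smf_hash_password(user: str, password: str, session_id: str) -> str:
    """Hypothetical helper mirroring the JavaScript in hashLoginPassword."""
    # Inner hash: lowercased user name followed by the plain password,
    # both encoded as UTF-8 (the assumed equivalent of php_to8bit()).
    inner = hex_sha1(user.lower().encode("utf8") + password.encode("utf8"))
    # Outer hash: hex digest of the inner hash followed by the session id.
    return hex_sha1((inner + session_id).encode("utf8"))
```

Comparing the result against the hash_passwrd value captured from a real browser login is the only way to confirm the assumptions hold.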
Using Firebug or Chrome's developer tools, you should be able to capture the
hashed password sent to the server and then try to reproduce that value in
Python.
If you are not sure where or how to retrieve the session value, you can
extract it from the form's onsubmit attribute with:

    response.css('#guest_form::attr(onsubmit)').re("'(.+)'")[0]
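
To see what that regular expression picks out, here is a small sketch using only the standard re module on the onsubmit string from your form. The variable names are just for illustration.

```python
import re

# The onsubmit attribute value as it appears in the quoted form.
onsubmit = "hashLoginPassword(this, '52c8d7f4468b42ca44664e302e2a5865');"

# Capture everything between the single quotes, as the selector's .re() does.
session_id = re.findall(r"'(.+)'", onsubmit)[0]
print(session_id)
```

The pattern is greedy, so it relies on the attribute containing exactly one quoted value, which is the case in the form you posted.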
Assuming no other form value is set with JavaScript, it seems very
straightforward to reproduce this POST request with the method
FormRequest.from_response.
Regards,
Rolando
On Sun, Jul 27, 2014 at 5:03 PM, Benjamin Schollnick <
[email protected]> wrote:
> Folks,
>
> I'm looking to use Scrapy to simplify some web scrapers that I have
> previously written.
>
> But I'm curious if anyone else has had any luck in using Scrapy with
> forums based off of SMF forums? (e.g.
> http://www.simplemachines.org/community/).
>
> The main kicker, is that they require the password to be hashed when sent
> back to them.
>
> <form id="guest_form" action="
> http://www.simplemachines.org/community/index.php?action=login2"
> method="post" accept-charset="UTF-8" *onsubmit="hashLoginPassword(this,
> '52c8d7f4468b42ca44664e302e2a5865');">*
> <div class="info">Please <a href="
> http://www.simplemachines.org/community/index.php?action=login">login</a>
> or <a href="
> http://www.simplemachines.org/community/index.php?action=register
> ">register</a>.</div>
> <input type="text" name="user" size="10" class="input_text">
> <input type="password" name="passwrd" size="10" class="input_password">
> <select name="cookielength">
> <option value="60">1 Hour</option>
> <option value="1440">1 Day</option>
> <option value="10080">1 Week</option>
> <option value="43200">1 Month</option>
> <option value="-1" selected="selected">Forever</option>
> </select>
> <input type="submit" value="Login" class="button_submit"><br>
> <div class="info">Login with username, password and session length</div>
> * <input type="hidden" value="52c8d7f4468b42ca44664e302e2a5865"
> name="b259f60">*
> * <input type="hidden" name="hash_passwrd" value="">*
> </form>
>
> Which uses the following Javascript,
>
> function hashLoginPassword(doForm, cur_session_id)
> {
> // Compatibility.
> if (cur_session_id == null)
> cur_session_id = smf_session_id;
>
> if (typeof(hex_sha1) == 'undefined')
> return;
> // Are they using an email address?
> if (doForm.user.value.indexOf('@') != -1)
> return;
>
> // Unless the browser is Opera, the password will not save properly.
> if (!('opera' in window))
> doForm.passwrd.autocomplete = 'off';
>
> doForm.hash_passwrd.value =
> hex_sha1(hex_sha1(doForm.user.value.php_to8bit().php_strtolower() +
> doForm.passwrd.value.php_to8bit()) + cur_session_id);
>
> // It looks nicer to fill it with asterisks, but Firefox will try to
> // save that.
> if (is_ff != -1)
> doForm.passwrd.value = '';
> else
> doForm.passwrd.value = doForm.passwrd.value.replace(/./g, '*');
> }
>
>
> Has anyone had success with SMF Forums? Any example or plugin code that I
> could use for this?
>
> I'm sure there is a Python equivalent to hex_sha1 and php_to8bit,
> but I haven't had the opportunity to dig it up.
>
> - Benjamin
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
>