Re: Using Netsurf cookies with wget
On 2022-01-21 13:17, Harriet Bazley wrote:
> On 21 Jan 2022 as I do recall,
>           Jeremy Nicoll - ml netsurf wrote:
>
> > Also, the form when used on a webpage, sets variable "user_remember_me"
> > and (I'm not completely sure) maybe also the submit button part sets
> > something - I don't know why it defines a name and a value - the latter
> > is the text on the button but what's "name" for?
>
> That's just a 'Remember Me' (and don't ask for log-in again but redirect
> to the stats page) button.

It's not just that. The form code has id="user_remember_me"/> (odd that there's two such definitions, but maybe JS on the page hides one of them completely).

Nevertheless one would expect the POST request sent to the server to contain "_remember_me=0" (or =1), and maybe the server ignores requests that arrive without all the required parameters.

> I'm afraid I don't know enough about HTML forms to understand exactly
> what the Submit button is doing, but the page only prompts for two user
> inputs in this form.

Yes, but type=hidden means there's entries in the form that do do things even if you don't see them. You should maybe read this, and the pages around it that discuss various aspects of forms:

https://www.w3schools.com/tags/att_input_type_hidden.asp

I see from one of your other posts that you've found that pages have unique content. I'd guess that the server is using (e.g.) PHP session ids, and that it's sending the id of the session it's established for you to the page so the page can send that back on successive requests. The id (and maybe other values) would likely be used as a key in a database on the server that's storing a whole set of user- and session-specific information.

--
Jeremy Nicoll - my opinions are my own
___
netsurf-users mailing list -- netsurf-users@netsurf-browser.org
To unsubscribe send an email to netsurf-users-le...@netsurf-browser.org
Re: Using Netsurf cookies with wget
>> there might be hidden input fields, [...]

> Ah - I think I may have spotted something.  The actual tag at
> the start contains an 'authenticity token':

[reformatted for readability]

>accept-charset="UTF-8" method="post">
>
> value="VfGGu3jwjsf6xNQmlmuu3Qkgc1BsZzgu0ikhluwqmVHU9RFVQQUUANuaza9HFgXr_c71SiKwBLz8XA8bQ4hSOA"/>
>
> [...]
>

There's also that "utf8" field. Amusingly, U+2713, from the Dingbats range, is CHECK MARK. Of course, who knows what the server would do if that field weren't there or had a different value, such as maybe U+2718 (an X mark, called HEAVY BALLOT X) or U+00AC (NOT SIGN).

> And this value is different for every copy of the page served, which
> presumably means that it is, by design, impossible for anyone to log
> in 'blind' with user name and password alone

Likely. Quite possibly done as a defense against automated password-guessing bots. Unfortunately, with the current state of Internet governance, such defenses are close to essential.

The token looks like URL-safe base64. Decoding it under that assumption produces random-looking binary data, so I suspect it is (as should be) being done with proper crypto.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
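For anyone wanting to repeat the decoding check described above, here is a minimal Python sketch. It assumes, as the message does, that the token is URL-safe base64 with its trailing '=' padding stripped; the function name decode_urlsafe is only illustrative. It says nothing about what the bytes mean, only that they decode cleanly.

```python
import base64

# Token value exactly as quoted in the message above.
TOKEN = ("VfGGu3jwjsf6xNQmlmuu3Qkgc1BsZzgu0ikhluwqmVHU9RFVQQUUANuaza9H"
         "FgXr_c71SiKwBLz8XA8bQ4hSOA")


def decode_urlsafe(token: str) -> bytes:
    """Decode URL-safe base64, restoring any '=' padding that was stripped."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding)


if __name__ == "__main__":
    raw = decode_urlsafe(TOKEN)
    # Print the length and a hex dump; the bytes are not expected to be text.
    print(len(raw), "bytes:", raw.hex())
```

If the hex dump shows no obvious structure or readable text, that is consistent with the guess that the value is random or encrypted rather than a simple encoding of the user name or a timestamp.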
Re: Using Netsurf cookies with wget
On 21 Jan 2022 as I do recall,
          Michael Drake wrote:

> On 21/01/2022 13:17, Harriet Bazley wrote:
>
> > Then the browser history getting updated with the new page and a
> > FETCH_REDIRECT from the login page to the user home page.  No record of
> > what data was sent to the server, that I can see.
>
> In your Choices file, try setting:
>
>     suppress_curl_debug:0
>

That doesn't seem to make any difference, despite quitting and restarting (so far as I can tell by eye, the two log files are identical around that area in terms of what gets logged following the 'keydown' events).

I tried it using the search box form, in order to simplify the amount of logging-in, quitting, and logging-out I had to do, and you just get a fetch_curl_setup with the URL of the page being fetched - which in the case of the search form, contains the 'work_search?query=' data encoded as part of the URL itself. Unfortunately in the case of the login form the variables submitted are not present in the URL of the resulting page!

--
Harriet Bazley                     ==  Loyaulte me lie ==

Radioactive cats have 18 half-lives.
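The difference observed above (search terms visible in the URL, login fields not) is what one would expect if the search form submits with GET and the login form with POST. A small Python sketch of the two request shapes follows; the host name and the "/login" path are placeholders, not taken from the real site, and only the 'work_search?query=' part comes from the log described in the message.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://example.org"  # placeholder host, not the real site


def search_request(query: str) -> Request:
    """GET-style form: the field is appended to the URL as a query string."""
    return Request(f"{BASE}/work_search?{urlencode({'query': query})}")


def login_request(fields: dict) -> Request:
    """POST-style form: the fields travel in the request body, not the URL."""
    body = urlencode(fields).encode("utf-8")
    return Request(f"{BASE}/login", data=body)  # "/login" is hypothetical


if __name__ == "__main__":
    for req in (search_request("some words"),
                login_request({"user_login": "USER"})):
        # Nothing is sent over the network here; the requests are only built.
        print(req.get_method(), req.full_url, req.data)
```

A log that records only the URL of each fetch will therefore show the search terms but never the login fields, which matches what the log contained.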
Re: Using Netsurf cookies with wget
On 21 Jan 2022 as I do recall,
          Mouse wrote:

> > I'm afraid I don't know enough about HTML forms to understand exactly
> > what the Submit button is doing,
>
> HTML forms, I think, just generate a POST when submitting.  But just
> prompting for two visible inputs doesn't mean there are only two
> input fields in the POST; there might be hidden input fields, fields
> which aren't displayed, being there just to pass values through from
> page generation to form submission.  Read the HTML source for the form
> if you want to check that possibility.
>

Ah - I think I may have spotted something. The actual tag at the start contains an 'authenticity token':

User name or email: Password: Remember me Submit

And this value is different for every copy of the page served, which presumably means that it is, by design, impossible for anyone to log in 'blind' with user name and password alone

--
Harriet Bazley                     ==  Loyaulte me lie ==

The best laid schemes o' mice and men gang oft a-gley.
Re: Using Netsurf cookies with wget
On 21/01/2022 13:17, Harriet Bazley wrote:

> Then the browser history getting updated with the new page and a
> FETCH_REDIRECT from the login page to the user home page.  No record of
> what data was sent to the server, that I can see.

In your Choices file, try setting:

    suppress_curl_debug:0

--
Michael Drake                 https://www.codethink.co.uk/
Re: Using Netsurf cookies with wget
> I'm afraid I don't know enough about HTML forms to understand exactly
> what the Submit button is doing,

HTML forms, I think, just generate a POST when submitting. But just prompting for two visible inputs doesn't mean there are only two input fields in the POST; there might be hidden input fields, fields which aren't displayed, being there just to pass values through from page generation to form submission. Read the HTML source for the form if you want to check that possibility.

Mouse
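Reading the form source by eye works, but the check suggested above can also be scripted. The following Python sketch lists every input element in a piece of HTML, hidden or not, using only the standard library. The SAMPLE markup is invented for illustration: "utf8", "user_login" and "user_remember_me" are names mentioned in this thread, while "authenticity_token", "user_password", "commit" and the "/login" action are assumptions about what the real page might contain.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect (type, name, value) for every named <input> element."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if attr.get("name"):
            self.fields.append(
                (attr.get("type", "text"), attr["name"], attr.get("value") or "")
            )


def collect_fields(html: str) -> list:
    parser = FormFieldCollector()
    parser.feed(html)
    parser.close()
    return parser.fields


# Invented example markup; the real login page will differ.
SAMPLE = """
<form action="/login" accept-charset="UTF-8" method="post">
  <input name="utf8" type="hidden" value="&#x2713;"/>
  <input type="hidden" name="authenticity_token" value="TOKEN-FROM-THIS-PAGE"/>
  <input type="text" name="user_login"/>
  <input type="password" name="user_password"/>
  <input type="checkbox" name="user_remember_me" value="1"/>
  <input type="submit" name="commit" value="Submit"/>
</form>
"""

if __name__ == "__main__":
    for kind, name, value in collect_fields(SAMPLE):
        print(f"{kind:10} {name:20} {value!r}")
```

Running this against a saved copy of the real login page would show at a glance which fields are submitted beyond the two that are visibly prompted for.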
Re: Using Netsurf cookies with wget
On 21 Jan 2022 as I do recall,
          Jeremy Nicoll - ml netsurf wrote:

> On 2022-01-21 00:55, Harriet Bazley wrote:
>
> > I also tried using --post-data 'user-Login=USER_password=PASSWORD'
> > with no result,
>
> That /may/ be because you weren't careful enough coding that.  According
> to the form code, the login variable isn't called "user-Login" but
> instead (2 differences) "user_login".

No - unfortunately it looks as if that was just sloppy transcription. The actual data in my test script is "user_login". :-(

>
> Also, the form when used on a webpage, sets variable "user_remember_me"
> and (I'm not completely sure) maybe also the submit button part sets
> something - I don't know why it defines a name and a value - the latter
> is the text on the button but what's "name" for?

That's just a 'Remember Me' (and don't ask for log-in again but redirect to the stats page) button.

I'm afraid I don't know enough about HTML forms to understand exactly what the Submit button is doing, but the page only prompts for two user inputs in this form.

(There are other forms on that page, e.g. a search box to search the site, but I'm assuming that provided I supply the correct id/input pairs the server will act on the data from the correct form - the trouble is that I understand very little of what I am actually doing here and am basically poking around at random.)

>
>
> I haven't used Netsurf for ages, but in Firefox one can see in its logs
> what URLs are built and sent back to a server - indeed in developer
> tools one can see the "curl" equivalent command to each communication
> with a server.   Really the simplest way to recreate something is to
> look at what the browser actually does.

Yes, that was the recommendation on the Web page that suggested using the --post-data method: simply 'turn on development tools'. Unfortunately so far as I know Netsurf doesn't provide access to sniff around at that level.

You can't read the log in !Scrap while the browser is actually running, which makes looking at that a little tricky, and it gives a *lot* of data that is nothing to do with curl fetches. But all I'm seeing is HTTP status codes, e.g.

(16.33) [INFO netsurf] content/fetchers/curl.c:1200 fetch_curl_process_headers: HTTP status code 200

And then way, way down the page

(17.46) [INFO netsurf] content/handlers/html/html.c:126 fire_generic_dom_event: Dispatching 'click' against 0x55ef7118
(18.69) [INFO netsurf] content/handlers/html/html.c:203 fire_dom_keyboard_event: Dispatching 'keydown' against 0x55d2d690
(18.96) [INFO netsurf] content/handlers/html/html.c:203 fire_dom_keyboard_event: Dispatching 'keydown' against 0x55d2d690

etc., which is me typing passwords.

Lots of things being removed on the submission of the form:

(20.03) [INFO netsurf] content/handlers/html/html.c:1192 html_destroy: content 0x55a955c0
(20.03) [INFO netsurf] content/handlers/html/form.c:1460 form_free_control: Control:0x55e20698 name:0x55e1ca48 value:0x55b1e9f0 initial:0x0
(20.03) [INFO netsurf] content/handlers/html/form.c:1460 form_free_control: Control:0x55e6f2e0 name:0x55e6f2c8 value:0x55e6f348 initial:0x55e6ea88
(20.03) [INFO netsurf] content/handlers/html/form.c:1460 form_free_control: Control:0x55e42780 name:0x0 value:0x55e42000 initial:0x0
(20.03) [INFO netsurf] content/content.c:695 content_remove_user: content file:///NetSurf:/Resources/CSS (0x55b62098), user 0x16c498 0x55a92c28

Then the browser history getting updated with the new page and a FETCH_REDIRECT from the login page to the user home page. No record of what data was sent to the server, that I can see.

But as I said, I understand very little of what is going on.

--
Harriet Bazley                     ==  Loyaulte me lie ==

Death is nature's way of telling you to slow down.
Re: Using Netsurf cookies with wget
On 2022-01-21 00:55, Harriet Bazley wrote:

> I also tried using --post-data 'user-Login=USER_password=PASSWORD'
> with no result,

That /may/ be because you weren't careful enough coding that. According to the form code, the login variable isn't called "user-Login" but instead (2 differences) "user_login".

Also, the form when used on a webpage, sets variable "user_remember_me" and (I'm not completely sure) maybe also the submit button part sets something - I don't know why it defines a name and a value - the latter is the text on the button but what's "name" for?

I haven't used Netsurf for ages, but in Firefox one can see in its logs what URLs are built and sent back to a server - indeed in developer tools one can see the "curl" equivalent command to each communication with a server. Really the simplest way to recreate something is to look at what the browser actually does.

--
Jeremy Nicoll - my opinions are my own
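Since exact field names and separators matter here, it may be safer to generate the --post-data string than to type it by hand. The Python sketch below builds a form-encoded body from name/value pairs; "user_login" is the name given in the message above, whereas "user_password" and the example values are placeholders that would need checking against the real form.

```python
from urllib.parse import urlencode


def build_post_data(fields: dict) -> str:
    """Join name=value pairs with '&', percent-encoding anything unsafe."""
    return urlencode(fields)


if __name__ == "__main__":
    body = build_post_data({
        "user_login": "USER",
        "user_password": "p&ss=word",  # hypothetical field name and value
    })
    # Prints a string suitable for passing to wget's --post-data option.
    print(body)
```

The point of the encoding step is that a password containing '&' or '=' would otherwise be split into extra, bogus fields when the server parses the body.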
Re: Using Netsurf cookies with wget
On 20 Jan 2022 as I do recall,
          Harriet Bazley wrote:

> On 20 Jan 2022 as I do recall,
>           simon_sm...@zen.co.uk wrote:
>
> [snip]
>
> >
> > The cookie part is probably a red herring. The conventional approach
> > would be to use the wget tools to fetch the login page and send
> > username and password using the features wget has built-in. Hey
> > presto, now you're logged in, via wget, and you should be able to get
> > the rest of the stuff you're after.
>
> I've tried "wget --ask-password URL", which prompts me for the password
> but then redirects to fetch the login page as if I were not logged in,
> and "wget --user=USER --password=PASSWORD URL", which also redirects to
> the login page instead of retrieving the one I asked for.
>
> I've tried fetching the login page directly using --user and --pass, but
> it just fetches the 'please log in' prompt instead of the 'you are
> already signed in' prompt.
>

I also tried using --post-data 'user-Login=USER_password=PASSWORD' with no result, where the form in the log-in page is as follows:

User name or email: Password: Remember me Submit

--
Harriet Bazley                     ==  Loyaulte me lie ==

The fact that you're paranoid doesn't mean they're NOT out to get you.
Re: Using Netsurf cookies with wget
On 20 Jan 2022 as I do recall,
          simon_sm...@zen.co.uk wrote:

[snip]

>
> The cookie part is probably a red herring. The conventional approach
> would be to use the wget tools to fetch the login page and send
> username and password using the features wget has built-in. Hey
> presto, now you're logged in, via wget, and you should be able to get
> the rest of the stuff you're after.

I've tried "wget --ask-password URL", which prompts me for the password but then redirects to fetch the login page as if I were not logged in, and "wget --user=USER --password=PASSWORD URL", which also redirects to the login page instead of retrieving the one I asked for.

I've tried fetching the login page directly using --user and --pass, but it just fetches the 'please log in' prompt instead of the 'you are already signed in' prompt.

I've tried using the --save-cookies option to save any cookies generated by 'logging in', and it just saves a blank "generated by Wget" file with no data in it.

I've tried --load-cookies=SCSI::SSD.$.!BOOT.Choices.WWW.NetSurf.Cookies on the off chance, but that didn't work either.

--
Harriet Bazley                     ==  Loyaulte me lie ==

My opinions may have changed, but not the fact that I am right.
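One possible reason the --load-cookies attempt fails is file format: the wget manual says --load-cookies expects the textual format originally used by Netscape's cookies.txt, and nothing in this thread establishes that NetSurf's Cookies file uses that layout. As a hedged sketch, the Python below writes a single cookie in Netscape format using the standard http.cookiejar module. The domain, cookie name and value are placeholders; the real ones would have to be read out of the browser, and whether the site would accept a session cookie reused this way is untested.

```python
import time
from http.cookiejar import Cookie, MozillaCookieJar


def write_cookie_file(path: str, domain: str, name: str, value: str,
                      days: int = 1) -> None:
    """Save one cookie to `path` in Netscape cookies.txt format."""
    jar = MozillaCookieJar(path)
    jar.set_cookie(Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=True,
        # An expiry time is given so the cookie is not treated as a
        # session-only cookie, which save() would skip by default.
        expires=int(time.time()) + days * 86400,
        discard=False, comment=None, comment_url=None, rest={},
    ))
    jar.save()


if __name__ == "__main__":
    # Placeholder values; substitute the real domain, name and value.
    write_cookie_file("cookies.txt", "example.org", "_session_id", "abc123")
    print(open("cookies.txt").read())
```

The resulting file could then be tried with "wget --load-cookies=cookies.txt URL" in place of the NetSurf file.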
Re: Using Netsurf cookies with wget
On 2022-01-20 06:46 PM, "Harriet Bazley" wrote:

> On 20 Jan 2022 as I do recall,
>           cj wrote:
>
> > In article <88ea7bad59.harr...@bazleyfamily.co.uk>,
> >    Harriet Bazley wrote:
> > > How can I use Netsurf's cookie file with wget to retrieve a web page
> > > that is only accessible to logged-in users
> >
> > wget --help shows there are commands to send user names and passwords
> > when downloading, ftping etc. Have you tried that?
> >
> I'm not sure how that would work, unless I fetched the log-in page
> first?   The stats page doesn't request a password - so far as I can
> tell it just doesn't let you fetch it if the relevant browser cookie
> isn't present.

The cookie part is probably a red herring. The conventional approach would be to use the wget tools to fetch the login page and send username and password using the features wget has built-in. Hey presto, now you're logged in, via wget, and you should be able to get the rest of the stuff you're after.

If this web site really was using a cookie for 'authentication', then in theory you should be able to send the cookie to someone else, they could put it on their computer, and then access the 'protected' area without logging in at all. That makes no sense.

Try the tools wget already provides - unless it genuinely is a highly eccentric website, wget should be sufficient.

--
Simon Smith [via webmail]
Re: Using Netsurf cookies with wget
On 20 Jan 2022 as I do recall,
          cj wrote:

> In article <88ea7bad59.harr...@bazleyfamily.co.uk>,
>    Harriet Bazley wrote:
> > How can I use Netsurf's cookie file with wget to retrieve a web page
> > that is only accessible to logged-in users
>
> wget --help shows there are commands to send user names and passwords
> when downloading, ftping etc. Have you tried that?
>

I'm not sure how that would work, unless I fetched the log-in page first? The stats page doesn't request a password - so far as I can tell it just doesn't let you fetch it if the relevant browser cookie isn't present.

--
Harriet Bazley                     ==  Loyaulte me lie ==

Cole's Law: Thinly sliced cabbage.
Re: Using Netsurf cookies with wget
In article <88ea7bad59.harr...@bazleyfamily.co.uk>,
   Harriet Bazley wrote:
> How can I use Netsurf's cookie file with wget to retrieve a web page
> that is only accessible to logged-in users

wget --help shows there are commands to send user names and passwords when downloading, ftping etc. Have you tried that?

--
Chris Johnson
Edinburgh
Using Netsurf cookies with wget
How can I use Netsurf's cookie file with wget to retrieve a web page that is only accessible to logged-in users - so that it's visible when I visit that URL with the browser, but blocked if I try to fetch it with wget for local processing?

Of course I can simply save the displayed page out of Netsurf manually, which is what I've been doing for test purposes, but it would be nice to be able to automate the script slightly further.

--
Harriet Bazley                     ==  Loyaulte me lie ==

I like work; it fascinates me; I can sit and look at it for hours.