Re: Using Netsurf cookies with wget

2022-01-21 Thread Jeremy Nicoll - ml netsurf

On 2022-01-21 13:17, Harriet Bazley wrote:

> On 21 Jan 2022 as I do recall,
>   Jeremy Nicoll - ml netsurf  wrote:
>
> > Also, the form when used on a webpage, sets variable "user_remember_me"
> > and (I'm not completely sure) maybe also the submit button part sets
> > something - I don't know why it defines a name and a value - the latter
> > is the text on the button but what's "name" for?
>
> That's just a 'Remember Me' (and don't ask for log-in again but redirect
> to the stats page) button.


It's not just that.  The form code has

 
 id="user_remember_me"/>


(odd that there's two such definitions, but maybe JS on the page hides
one of them completely).  Nevertheless one would expect the POST request
sent to the server to contain "_remember_me=0"  (or =1)  and maybe the
server ignores requests that arrive without all the required parameters.





> I'm afraid I don't know enough about HTML forms to understand exactly
> what the Submit button is doing, but the page only prompts for two user
> inputs in this form.


Yes, but type=hidden means there's entries in the form that do do things
even if you don't see them.  You should maybe read this, and the pages
around it that discuss various aspects of forms

 https://www.w3schools.com/tags/att_input_type_hidden.asp


I see from one of your other posts that you've found that pages have
unique content.  I'd guess that the server is using (eg) php session ids
& it's sending the id of the session it's established for you to the
page so the page can send that back on successive requests.  The id (and
maybe other values) would likely be used as a key in a database on the
server that's storing a whole set of user- & session- specific information.


--
Jeremy Nicoll - my opinions are my own
___
netsurf-users mailing list -- netsurf-users@netsurf-browser.org
To unsubscribe send an email to netsurf-users-le...@netsurf-browser.org


Re: Using Netsurf cookies with wget

2022-01-21 Thread Mouse
>> there might be hidden input fields, [...]
> Ah - I think I may have spotted something.  The actual <form> tag at
> the start contains an 'authenticity token':

[reformatted for readability]
>accept-charset="UTF-8" method="post">
> 
>  value="VfGGu3jwjsf6xNQmlmuu3Qkgc1BsZzgu0ikhluwqmVHU9RFVQQUUANuaza9HFgXr_c71SiKwBLz8XA8bQ4hSOA"/>
>  
> [...]
>   

There's also that "utf8" field.  Amusingly, U+2713, from the Dingbats
range, is CHECK MARK.  Of course, who knows what the server would do if
that field weren't there or had a different value, such as maybe U+2718
(an X mark, called HEAVY BALLOT X) or U+00AC (NOT SIGN)
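
[Editorial sketch, not part of the original message.]  The character names
above can be checked with Python's unicodedata module, and the same few
lines show how such a character would look once percent-encoded in a form
body.  The field name "utf8" comes from the message; treating it as an
ordinary urlencoded form field is an assumption.

```python
import unicodedata
from urllib.parse import urlencode

# The three code points mentioned in the message above.
CANDIDATES = ["\u2713", "\u2718", "\u00ac"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def encoded_utf8_field(ch: str) -> str:
    """Percent-encode an assumed utf8=<ch> form field (UTF-8 bytes)."""
    return urlencode({"utf8": ch})


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(describe(ch), "->", encoded_utf8_field(ch))
```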

> And this value is different for every copy of the page served, which
> presumably means that it is, by design, impossible for anyone to log
> in 'blind' with user name and password alone

Likely.  Quite possibly done as a defense against automated
password-guessing bots.  Unfortunately, with the current state of
Internet governance, such defenses are close to essential.

The token looks like URL-safe base64.  Decoding it under that
assumption produces random-looking binary data, so I suspect it is (as
should be) being done with proper crypto.
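
[Editorial sketch, not part of the original message.]  A rough illustration
of that decoding step in Python.  The token is the one quoted above; the
only assumption is that the trailing "=" padding was stripped, which is
common for URL-safe base64, so the helper puts it back before decoding.

```python
import base64

# Token value exactly as quoted earlier in this message.
TOKEN = (
    "VfGGu3jwjsf6xNQmlmuu3Qkgc1BsZzgu0ikhluwqmVHU9RFVQQUUANuaza9HFgXr"
    "_c71SiKwBLz8XA8bQ4hSOA"
)


def decode_urlsafe(token: str) -> bytes:
    """Decode URL-safe base64, restoring any '=' padding that was dropped."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding)


if __name__ == "__main__":
    raw = decode_urlsafe(TOKEN)
    # Print the length and a hex dump; the bytes have no obvious structure.
    print(len(raw), "bytes:", raw.hex())
```

Seeing nothing readable in the hex dump is consistent with, but does not
prove, the guess that the value is random or encrypted.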

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Using Netsurf cookies with wget

2022-01-21 Thread Harriet Bazley
On 21 Jan 2022 as I do recall,
  Michael Drake  wrote:

> 
> On 21/01/2022 13:17, Harriet Bazley wrote:
> 
> > Then the browser history getting updated with the new page and a
> > FETCH_REDIRECT from the login page to the user home page.   No record of
> > what data was sent to the server, that I can see.
> 
> In your Choices file, try setting:
> 
> suppress_curl_debug:0
> 
That doesn't seem to make any difference, despite quitting and
restarting (so far as I can tell by eye, the two log files are identical
around that area in terms of what gets logged following the 'keydown'
events).

I tried it using the search box form, in order to simplify the amount of
logging-in, quitting, and logging-out I had to do, and you just get a
fetch_curl_setup with the URL of the page being fetched - which in the
case of the search form, contains the 'work_search?query=' data encoded
as part of the URL itself.   Unfortunately in the case of the login form
the variables submitted are not present in the URL of the resulting
page!
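
[Editorial sketch, not part of the original message.]  The difference
described above is the one between a GET form, whose fields travel in the
URL, and a POST form, whose fields travel in the request body and so never
show up in a logged URL.  In this Python illustration the host and the
"/login" path are made up; only 'work_search?query=' is from the message.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://example.invalid"  # hypothetical host, for illustration only


def search_request(query: str) -> Request:
    """GET form: the field is appended to the URL as a query string."""
    return Request(f"{BASE}/work_search?" + urlencode({"query": query}))


def login_request(fields: dict) -> Request:
    """POST form: the fields go in the body, leaving the URL unchanged."""
    body = urlencode(fields).encode("ascii")
    return Request(f"{BASE}/login", data=body)  # "/login" is an assumed path


if __name__ == "__main__":
    get_req = search_request("some words")
    post_req = login_request({"user_login": "USER"})
    print(get_req.get_method(), get_req.full_url)
    print(post_req.get_method(), post_req.full_url, post_req.data)
```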


-- 
Harriet Bazley ==  Loyaulte me lie ==

Radioactive cats have 18 half-lives.


Re: Using Netsurf cookies with wget

2022-01-21 Thread Harriet Bazley
On 21 Jan 2022 as I do recall,
  Mouse  wrote:

> > I'm afraid I don't know enough about HTML forms to understand exactly
> > what the Submit button is doing,
> 
> HTML forms, I think, just generate a POST when submitting.  But just
> prompting for two visible inputs doesn't mean there are only two
> input fields in the POST; there might be hidden input fields, fields
> which aren't displayed, being there just to pass values through from
> page generation to form submission.  Read the HTML source for the form
> if you want to check that possibility.
> 

Ah - I think I may have spotted something.  The actual <form> tag at the
start contains an 'authenticity token':



  
  
User name or email:

Password:

Remember me

Submit

  

  



And this value is different for every copy of the page served, which
presumably means that it is, by design, impossible for anyone to log in
'blind' with user name and password alone
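
[Editorial sketch, not part of the original message.]  If the token really
is per-page, a script would have to fetch the login page first, pull the
token out, and send it back together with the credentials while keeping
any cookies from the first fetch.  The Python below is only an outline
under several assumptions: the field names "authenticity_token" and
"user_password", the attribute order matched by the regular expression,
and posting back to the same URL are all guesses, not taken from the
site.  Only "user_login" appears elsewhere in this thread.

```python
import re
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, build_opener

# Assumed markup: name="authenticity_token" directly followed by value="...".
TOKEN_RE = re.compile(r'name="authenticity_token"\s+value="([^"]*)"')


def extract_token(html: str) -> str:
    """Pull the per-page token out of the login page source."""
    match = TOKEN_RE.search(html)
    if match is None:
        raise ValueError("no authenticity_token field found")
    return match.group(1)


def build_login_body(token: str, user: str, password: str) -> bytes:
    """Encode the POST body; field names other than user_login are guesses."""
    return urlencode({
        "authenticity_token": token,
        "user_login": user,
        "user_password": password,
    }).encode("ascii")


def log_in(login_url: str, user: str, password: str):
    """Fetch the page, then post the form with the same cookie jar."""
    opener = build_opener(HTTPCookieProcessor(CookieJar()))
    with opener.open(login_url) as page:
        token = extract_token(page.read().decode("utf-8", "replace"))
    # Assumes the form posts back to the page's own URL.
    opener.open(login_url, data=build_login_body(token, user, password)).close()
    return opener  # later fetches through this opener reuse its cookies
```

The same two-step shape should carry over to wget, using its cookie and
--post-data options, once the real field names are known.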

-- 
Harriet Bazley ==  Loyaulte me lie ==

The best laid schemes o' mice and men gang oft a-gley.


Re: Using Netsurf cookies with wget

2022-01-21 Thread Michael Drake


On 21/01/2022 13:17, Harriet Bazley wrote:

> Then the browser history getting updated with the new page and a
> FETCH_REDIRECT from the login page to the user home page.   No record of
> what data was sent to the server, that I can see.

In your Choices file, try setting:

suppress_curl_debug:0

-- 
Michael Drake https://www.codethink.co.uk/


Re: Using Netsurf cookies with wget

2022-01-21 Thread Mouse
> I'm afraid I don't know enough about HTML forms to understand exactly
> what the Submit button is doing,

HTML forms, I think, just generate a POST when submitting.  But just
prompting for two visible inputs doesn't mean there are only two
input fields in the POST; there might be hidden input fields, fields
which aren't displayed, being there just to pass values through from
page generation to form submission.  Read the HTML source for the form
if you want to check that possibility.
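
[Editorial sketch, not part of the original message.]  One way to check for
hidden fields without reading the source by eye is to let Python's standard
HTML parser list every input element.  The sample markup below is invented
for illustration and is not the real login page.

```python
from html.parser import HTMLParser

# Invented example markup; the real page will differ.
SAMPLE = """
<form action="/login" method="post">
  <input type="hidden" name="token" value="xyz"/>
  <input type="text" name="user_login"/>
  <input type="password" name="user_password"/>
  <input type="submit" name="commit" value="Submit"/>
</form>
"""


class InputLister(HTMLParser):
    """Collect (type, name, value) for every <input> element seen."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            a = dict(attrs)
            self.fields.append(
                (a.get("type", "text"), a.get("name"), a.get("value"))
            )


def list_inputs(html: str):
    parser = InputLister()
    parser.feed(html)
    parser.close()
    return parser.fields


if __name__ == "__main__":
    for kind, name, value in list_inputs(SAMPLE):
        print(f"{kind:10} {name!s:15} {value!r}")
```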

Mouse


Re: Using Netsurf cookies with wget

2022-01-21 Thread Harriet Bazley
On 21 Jan 2022 as I do recall,
  Jeremy Nicoll - ml netsurf  wrote:

> On 2022-01-21 00:55, Harriet Bazley wrote:
> 
> > I also tried using --post-data 'user-Login=USER_password=PASSWORD'
> > with no result,
> 
> That /may/ be because you weren't careful enough coding that. According
> to the form code, the login variable isn't called "user-Login" but
> instead (2 differences) "user_login".

No - unfortunately it looks as if that was just sloppy transcription.
The actual data in my test script is "user_login".  :-(


> 
> Also, the form when used on a webpage, sets variable "user_remember_me"
> and (I'm not completely sure) maybe also the submit button part sets
> something - I don't know why it defines a name and a value - the latter
> is the text on the button but what's "name" for?

That's just a 'Remember Me' (and don't ask for log-in again but redirect
to the stats page) button.
I'm afraid I don't know enough about HTML forms to understand exactly
what the Submit button is doing, but the page only prompts for two user
inputs in this form.

(There are other forms on that page, e.g. a search box to search the
site, but I'm assuming that provided I supply the correct id/input pairs
the server will act on the data from the correct form - the trouble is
that I understand very little of what I am actually doing here and am
basically poking around at random.)

> 
> 
> I haven't used Netsurf for ages, but in Firefox one can see in its logs
> what URLs are built and sent back to a server - indeed in developer
> tools one can see the "curl" equivalent command to each communication
> with a server.  Really the simplest way to recreate something is to
> look at what the browser actually does.

Yes, that was the recommendation on the Web page that suggested using
the --post-data method:  simply 'turn on development tools'.
Unfortunately so far as I know Netsurf doesn't provide access to sniff
around at that level.

You can't read the log in !Scrap while the browser is actually running,
which makes looking at that a little tricky, and it gives a *lot* of
data that is nothing to do with curl fetches.   But all I'm seeing is
HTTP status codes, e.g.

(16.33) [INFO netsurf] content/fetchers/curl.c:1200 fetch_curl_process_headers: HTTP status code 200

And then way, way down the page

(17.46) [INFO netsurf] content/handlers/html/html.c:126 fire_generic_dom_event: Dispatching 'click' against 0x55ef7118
(18.69) [INFO netsurf] content/handlers/html/html.c:203 fire_dom_keyboard_event: Dispatching 'keydown' against 0x55d2d690
(18.96) [INFO netsurf] content/handlers/html/html.c:203 fire_dom_keyboard_event: Dispatching 'keydown' against 0x55d2d690

etc., which is me typing passwords.

Lots of things being removed on the submission of the form:

(20.03) [INFO netsurf] content/handlers/html/html.c:1192 html_destroy: content 0x55a955c0
(20.03) [INFO netsurf] content/handlers/html/form.c:1460 form_free_control: Control:0x55e20698 name:0x55e1ca48 value:0x55b1e9f0 initial:0x0
(20.03) [INFO netsurf] content/handlers/html/form.c:1460 form_free_control: Control:0x55e6f2e0 name:0x55e6f2c8 value:0x55e6f348 initial:0x55e6ea88
(20.03) [INFO netsurf] content/handlers/html/form.c:1460 form_free_control: Control:0x55e42780 name:0x0 value:0x55e42000 initial:0x0
(20.03) [INFO netsurf] content/content.c:695 content_remove_user: content file:///NetSurf:/Resources/CSS (0x55b62098), user 0x16c498 0x55a92c28

Then the browser history getting updated with the new page and a
FETCH_REDIRECT from the login page to the user home page.   No record of
what data was sent to the server, that I can see.
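
[Editorial sketch, not part of the original message.]  Since the log holds
a lot that has nothing to do with fetching, a small filter can cut it down
to the fetcher lines.  This Python sketch assumes every record sits on one
line in the shape shown above, "(time) [LEVEL category] file:line function:
message"; the pattern is inferred from these samples only, not from any
NetSurf documentation.

```python
import re

# Record layout inferred from the sample lines quoted in this message.
LOG_LINE = re.compile(
    r"^\((?P<time>\d+\.\d+)\) \[(?P<level>\w+) (?P<category>\w+)\] "
    r"(?P<source>\S+):(?P<line>\d+) (?P<function>\w+): (?P<message>.*)$"
)


def fetcher_records(lines):
    """Yield (time, function, message) for records from content/fetchers/."""
    for text in lines:
        match = LOG_LINE.match(text.rstrip("\n"))
        if match and match.group("source").startswith("content/fetchers/"):
            yield (
                float(match.group("time")),
                match.group("function"),
                match.group("message"),
            )


if __name__ == "__main__":
    sample = [
        "(16.33) [INFO netsurf] content/fetchers/curl.c:1200 "
        "fetch_curl_process_headers: HTTP status code 200",
        "(17.46) [INFO netsurf] content/handlers/html/html.c:126 "
        "fire_generic_dom_event: Dispatching 'click' against 0x55ef7118",
    ]
    for record in fetcher_records(sample):
        print(record)
```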

But as I said, I understand very little of what is going on.

-- 
Harriet Bazley ==  Loyaulte me lie ==

Death is nature's way of telling you to slow down.


Re: Using Netsurf cookies with wget

2022-01-21 Thread Jeremy Nicoll - ml netsurf

On 2022-01-21 00:55, Harriet Bazley wrote:


> I also tried using --post-data 'user-Login=USER_password=PASSWORD'
> with no result,


That /may/ be because you weren't careful enough coding that. According
to the form code, the login variable isn't called "user-Login" but
instead (2 differences) "user_login".

Also, the form when used on a webpage, sets variable "user_remember_me"
and (I'm not completely sure) maybe also the submit button part sets
something - I don't know why it defines a name and a value - the latter
is the text on the button but what's "name" for?


I haven't used Netsurf for ages, but in Firefox one can see in its logs
what URLs are built and sent back to a server - indeed in developer
tools one can see the "curl" equivalent command to each communication
with a server.  Really the simplest way to recreate something is to
look at what the browser actually does.


--
Jeremy Nicoll - my opinions are my own


Re: Using Netsurf cookies with wget

2022-01-20 Thread Harriet Bazley
On 20 Jan 2022 as I do recall,
  Harriet Bazley  wrote:

> On 20 Jan 2022 as I do recall,
>   simon_sm...@zen.co.uk wrote:
> 
> [snip]
> 
> >
> > The cookie part is probably a red herring. The conventional approach
> > would be to use the wget tools to fetch the login page and send
> > username and password using the features wget has built-in. Hey
> > presto, now you're logged in, via wget, and you should be able to get
> > the rest of the stuff you're after.
> 
> I've tried "wget --ask-password URL", which prompts me for the password
> but then redirects to fetch the login page as if I were not logged in,
> and "wget --user=USER --password=PASSWORD URL", which also redirects to
> the login page instead of retrieving the one I asked for.
> 
> I've tried fetching the login page directly using --user and --pass, but
> it just fetches the 'please log in' prompt instead of the 'you are
> already signed in' prompt.
> 

I also tried using --post-data 'user-Login=USER_password=PASSWORD'
with no result, where the form in the log-in page is as follows:


User name or email:

Password:

Remember me

Submit

  



-- 
Harriet Bazley ==  Loyaulte me lie ==

The fact that you're paranoid doesn't mean they're NOT out to get you.


Re: Using Netsurf cookies with wget

2022-01-20 Thread Harriet Bazley
On 20 Jan 2022 as I do recall,
  simon_sm...@zen.co.uk wrote:

[snip]

>
> The cookie part is probably a red herring. The conventional approach
> would be to use the wget tools to fetch the login page and send
> username and password using the features wget has built-in. Hey
> presto, now you're logged in, via wget, and you should be able to get
> the rest of the stuff you're after.

I've tried "wget --ask-password URL", which prompts me for the password
but then redirects to fetch the login page as if I were not logged in,
and "wget --user=USER --password=PASSWORD URL", which also redirects to
the login page instead of retrieving the one I asked for.

I've tried fetching the login page directly using --user and --pass, but
it just fetches the 'please log in' prompt instead of the 'you are
already signed in' prompt.
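
[Editorial sketch, not part of the original message.]  A likely reason
these attempts fall through to the login page is that wget's user and
password options supply HTTP authentication credentials, which travel in a
request header, whereas a login page built from an HTML form expects the
values as form fields.  The Python below only shows what a Basic
authentication header looks like, for comparison; whether the site ever
looks at such a header is not known.

```python
import base64


def basic_auth_header(user: str, password: str):
    """Build the (name, value) pair of an HTTP Basic Authorization header."""
    credentials = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return "Authorization", "Basic " + token


if __name__ == "__main__":
    name, value = basic_auth_header("USER", "PASSWORD")
    # Nothing here resembles a form field such as user_login=USER.
    print(f"{name}: {value}")
```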

I've tried using the --save-cookies option to save any cookies generated
by 'logging in', and it just saves a blank "generated by Wget" file with
no data in it.

I've tried --load-cookies=SCSI::SSD.$.!BOOT.Choices.WWW.NetSurf.Cookies
on the offchance, but that didn't work either
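
[Editorial sketch, not part of the original message.]  wget's --load-cookies
option expects a text file in the Netscape cookies.txt layout.  Assuming
NetSurf's own Cookies file is laid out differently, one workaround is to
copy the session cookie's name and value by hand into a file of the right
shape.  Python's MozillaCookieJar writes that layout; the cookie name,
value and domain in the example are placeholders, not the site's real ones.

```python
import time
from http.cookiejar import Cookie, MozillaCookieJar


def write_cookie_file(path, domain, name, value, lifetime=3600):
    """Write one cookie to 'path' in Netscape cookies.txt format."""
    jar = MozillaCookieJar(path)
    jar.set_cookie(Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=True,
        # A future expiry and discard=False, so save() keeps the cookie.
        expires=int(time.time()) + lifetime,
        discard=False, comment=None, comment_url=None, rest={},
    ))
    jar.save()


if __name__ == "__main__":
    # Placeholder values; substitute those shown in the browser's cookie list.
    write_cookie_file("cookies.txt", "example.invalid", "_session_id", "VALUE")
```

The resulting file could then be tried with --load-cookies in place of the
NetSurf one.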


-- 
Harriet Bazley ==  Loyaulte me lie ==

My opinions may have changed, but not the fact that I am right.


Re: Using Netsurf cookies with wget

2022-01-20 Thread simon_smith
On 2022-01-20 06:46 PM, "Harriet Bazley"  wrote:
> On 20 Jan 2022 as I do recall,
>   cj  wrote:
> 
> > In article <88ea7bad59.harr...@bazleyfamily.co.uk>,
> >Harriet Bazley  wrote:
> > > How can I use Netsurf's cookie file with wget to retrieve a web page
> > > that is only accessible to logged-in users
> > 
> > wget --help shows there are commands to send user names and passwords
> > when downloading, ftping etc. Have you tried that?
> > 
> I'm not sure how that would work, unless I fetched the log-in page
> first?  The stats page doesn't request a password - so far as I can
> tell it just doesn't let you fetch it if the relevant browser cookie
> isn't present.

The cookie part is probably a red herring. The conventional approach would
be to use the wget tools to fetch the login page and send username and
password using the features wget has built-in. Hey presto, now you're logged
in, via wget, and you should be able to get the rest of the stuff you're after.

If this web site really was using a cookie for 'authentication', then in theory 
you should be able to send the cookie to someone else, they could put it on
their computer, and then access the 'protected' area without logging in at all.
That makes no sense.

Try the tools wget already provides - unless it genuinely is a highly eccentric
website, wget should be sufficient.

-- 
Simon Smith [via webmail]






Re: Using Netsurf cookies with wget

2022-01-20 Thread Harriet Bazley
On 20 Jan 2022 as I do recall,
  cj  wrote:

> In article <88ea7bad59.harr...@bazleyfamily.co.uk>,
>Harriet Bazley  wrote:
> > How can I use Netsurf's cookie file with wget to retrieve a web page
> > that is only accessible to logged-in users
> 
> wget --help shows there are commands to send user names and passwords
> when downloading, ftping etc. Have you tried that?
> 
I'm not sure how that would work, unless I fetched the log-in page
first?  The stats page doesn't request a password - so far as I can
tell it just doesn't let you fetch it if the relevant browser cookie
isn't present.

-- 
Harriet Bazley ==  Loyaulte me lie ==

Cole's Law:  Thinly sliced cabbage.


Re: Using Netsurf cookies with wget

2022-01-20 Thread cj
In article <88ea7bad59.harr...@bazleyfamily.co.uk>,
   Harriet Bazley  wrote:
> How can I use Netsurf's cookie file with wget to retrieve a web page
> that is only accessible to logged-in users

wget --help shows there are commands to send user names and passwords
when downloading, ftping etc. Have you tried that?

-- 
Chris Johnson
Edinburgh


Using Netsurf cookies with wget

2022-01-19 Thread Harriet Bazley
How can I use Netsurf's cookie file with wget to retrieve a web page
that is only accessible to logged-in users - so that it's visible when I
visit that URL with the browser, but blocked if I try to fetch it with
wget for local processing?

Of course I can simply save the displayed page out of Netsurf manually,
which is what I've been doing for test purposes, but it would be nice to
be able to automate the script slightly further.

-- 
Harriet Bazley ==  Loyaulte me lie ==

I like work; it fascinates me; I can sit and look at it for hours.