Rodrigo Arias wrote:
> On Mon, Dec 30, 2024 at 05:35:50PM +0100,
> a1ex-j7k0xvabl0ielga04la...@public.gmane.org wrote:
>> There was an interesting post[1] on HN today about 'curl-impersonate',
>> which is a patch[2] to curl which allows it to act like various big
>> browsers, bypassing various fingerprinting techniques which would
>> otherwise prevent the client from accessing the page.
>>
>> Looking at the patch, maybe there could be some useful ideas here for
>> Dillo to use to load more sites. The SSL library also obviously plays a
>> large role, maybe that's something we will need to consider as well.
>
> I experienced problems with the user-agent being banned, and having to
> impersonate Firefox to load some sites. I haven't found yet examples of
> this deep fingerprinting for TLS or similar, you?
>
> In any case, it would be trivial to discern Dillo as we don't support
> JS, so it can be banned if they decide so.

I've found that sometimes I go to a webpage and see one of the "enable
JavaScript to continue" pages in Dillo, then I load the same page in
Firefox with NoScript blocking all its scripts, and it comes up fine
without running any such JavaScript. That could just be the User-Agent
header though, because I don't try faking that.

Rather than add Chrome-faking features to Dillo, maybe this would be
an extra application of the Rule-based content manipulation RFC:

https://github.com/dillo-browser/rfc/blob/rfc-002/rfc-002-rule-based-content-manipulation.md

Make a rule for some sites (or Web server responses?) that has Dillo
call curl-impersonate to retrieve a Web page instead of doing it in
Dillo?

By the way, being a Git failure, I really can't see where that MD
document lives. I look at the "rfc" repo via the GitHub website in
Dillo and there's just a readme. I clone the repo and I just get a
readme. I had to look back to your RFC repo announcement to find that
link. I guess they're in separate branches or something, but I forget
things about Git faster than I learn them and can't be bothered
learning how to use branches yet again today. I really think it would
be better to list them together somewhere obvious, e.g. a new
Developer Documentation webpage.

I can see from mangling this URL that there are probably only two RFCs
so far:

https://github.com/dillo-browser/rfc/tree/rfc-001/
  (rfc-001-dillo-rfc-documents.md)
https://github.com/dillo-browser/rfc/tree/rfc-002/
  (rfc-002-rule-based-content-manipulation.md)
https://github.com/dillo-browser/rfc/tree/rfc-003/
  (404)

> In my experiences, it is generally not worth reading the website
> that performs this type of discrimination.

That's often my approach, but then big offenders are things like
government websites, which one is obliged to read sometimes.
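
Going back to the rule idea above, here is a rough Python sketch of what "hand some sites to an external fetcher" could look like. It is only an illustration, not the RFC's rule syntax and not Dillo code: the rule table, the function names, the example hostname pattern and the command name `curl-impersonate-wrapper` are all made up here, so substitute whatever wrapper script your curl-impersonate install actually provides. The `--silent` and `--location` options are ordinary curl options.

```python
import subprocess
from fnmatch import fnmatch
from urllib.parse import urlsplit

# Hypothetical rule table: hostname glob -> external fetch command.
# "curl-impersonate-wrapper" is a placeholder name, not a real binary.
RULES = [
    ("*.example.gov", ["curl-impersonate-wrapper", "--silent", "--location"]),
]


def external_fetch_command(url, rules=RULES):
    """Return the argv for an external fetcher, or None if no rule matches."""
    host = urlsplit(url).hostname or ""
    for pattern, command in rules:
        if fnmatch(host, pattern):
            return [*command, url]
    return None


def fetch(url, rules=RULES):
    """Fetch via the external command if a rule matches, else return None."""
    argv = external_fetch_command(url, rules)
    if argv is None:
        return None  # caller falls back to the browser's own HTTP code
    result = subprocess.run(argv, capture_output=True, check=True, timeout=30)
    return result.stdout
```

Matching on the hostname keeps the rule cheap to evaluate before any request is made. Matching on Web server responses instead would mean making the normal request first and retrying through the external command only when it fails.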