[Edbrowse-dev] hash tags

Karl Dahlke Sat, 24 Jan 2015 05:13:32 -0800

I pushed a change that definitely fixes www.net10wireless.com,
and I hope doesn't break anything else.
This has to do with the format of a url,
which I thought I understood, and seemed to work properly for 10 years,
but obviously I don't understand it completely.
The Wikipedia article on URLs doesn't give the complete format either, though it does
reference the RFC, and I suppose I need to read that some day.
Meantime, net10wireless has urls with an unencoded #
in the middle, which I thought was illegal, but I guess it's not.
www.net10wireless.com/#/advantages/#/plans/unlimited-monthly-plans
Guess there's nothing wrong with that, and # does not mean a tag
somewhere inside the document.
It's just part of the file name or resource locator.
But it does mean a tag if # appears after the last slash.
http://this.that.com/file.html#main_content


This url regrocking affects at least 4 files.
I was *assuming* the url format all over the place.
So I had to make small changes all over the place.
One thing I did this time around that may help in the future
is to write findHash() in url.c.
This isolates the logic of finding the # sign and determining
if it does indeed represent a hash tag, and should thus not be
sent to the http server, but should instead be used after the fact
to find a location within the fetched document.
If my logic is still wrong, or still too simplistic,
which it may well be,
then at least I only need change it in one place.
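
A minimal sketch of the rule described above, for illustration only.
It is not the actual findHash() from url.c: the name find_hash_sketch
and its signature are made up here, and it assumes the whole rule is
"a # counts as a hash tag only if no slash follows it".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not edbrowse code.
 * Returns a pointer to the '#' that starts a hash tag,
 * or NULL if the url has no hash tag under the rule above.
 */
const char *find_hash_sketch(const char *url)
{
    /* Look at the last '#' in the url, if there is one. */
    const char *hash = strrchr(url, '#');
    if (hash == NULL)
        return NULL;

    /* A '/' after the last '#' means the '#' is part of the path. */
    if (strchr(hash, '/') != NULL)
        return NULL;

    /* Otherwise everything from here on is the hash tag. */
    return hash;
}
```

With this sketch, http://this.that.com/file.html#main_content yields a
pointer to "#main_content", while the net10wireless url above yields NULL,
so the whole string would be sent to the server. Note that this follows
the heuristic in this message; the generic syntax in RFC 3986 instead
treats the first # as the start of the fragment.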

This change affects resolveURL(), but only slightly.
Sorry Chris if you were working on the same routine,
for the data_uri stuff, and if our work collides.
I didn't expect our seemingly distinct efforts to intersect,
but they might.

Karl Dahlke