This is the module I'm speaking about: https://arsd-official.dpldocs.info/arsd.dom.html

So I have this HTML that not even parseGarbage() can deal with:

<a href = "https://hostname.com/?file=foo.png&foo=baa";>G!</a>

There are spaces between "href" and "=", and between "=" and "https...", which make the code below fail:


        string html = get(page, client).text;
        auto document = new Document();
        document.parseGarbage(html);
        Element attEle = document.querySelector("span[id=link2]");
        Element aEle = attEle.querySelector("a");
        string link = aEle.href; // <-- if the href attribute has spaces around "=", this returns "href" rather than the link



Let's say the page HTML looks like this:

<body bgcolor="#000000">
<font color="yellow">
<h2>
        Hello, dear world!
        <span id="link2">
<a href = "https://hostname.com/?file=foo.png&foo=baa";>G!</a>
        </span>
</h2>
</font>
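
A possible stopgap until there is a proper fix, sketched under assumptions: strip the whitespace around "=" in attributes before handing the text to parseGarbage(). The helper name normalizeAttributes is hypothetical and not part of arsd.dom; it uses only std.regex from Phobos. Whether this alone is enough for parseGarbage() to read the attribute correctly is not confirmed (the stray ";" after the closing quote is left untouched), and the pattern would also rewrite matching text outside of tags.

```d
import std.regex : regex, replaceAll;
import std.stdio : writeln;

/// Hypothetical helper, not part of arsd.dom: removes whitespace around
/// an '=' that follows a word character and is followed by a quote,
/// e.g. `href = "x"` becomes `href="x"`.
string normalizeAttributes(string html)
{
    // Group 1: last character of the attribute name.
    // Group 2: the opening quote of the attribute value.
    auto re = regex(`(\w)\s*=\s*(["'])`);
    return replaceAll(html, re, "$1=$2");
}

void main()
{
    string raw = `<a href = "https://hostname.com/?file=foo.png&foo=baa";>G!</a>`;
    // The cleaned string is what would then be passed to document.parseGarbage().
    writeln(normalizeAttributes(raw));
}
```

The "=" signs inside the URL are left alone because they are not followed by a quote.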

I know the library author posts on this forum often; I hope he sees this and can help somehow to make it work. But if anyone else knows how to fix this, that would be very welcome too!
