[racket-users] html parsing library does not handle 'article' tags -- any solutions?

David Storrs Thu, 07 Jan 2016 12:13:49 -0800

Hi folks,

I'm using the html and xml libraries to parse a page that includes the
following HTML:


<div class="messageInfo primaryContent">
<div class="messageContent">
<article>
<blockquote class="messageText SelectQuoteContainer ugc baseHtml">
Message text here <br>
</blockquote>
</article>
</div>

When I parse this, the 'article' tag isn't parsed at all: the contents of
the messageContent div come back as just a series of PCDATA entries
containing "\n".

Is there a way to extend the library, or do I need to switch to a different
parser?

Dave

