TL;DR: Are there any HTML parsing libraries for Racket? Do they support
"selecting" elements based on a set of conditions (e.g. CSS selectors or
XPath selectors)?

Suppose I need to find the HTML elements that satisfy certain
conditions on the page at a given URL (i.e. scraping).
I can easily fetch the contents using the following:

<pre><code>

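    ;; Assumed requires: the `url:` prefix implies net/url was imported this way.
    (require (prefix-in url: net/url)
             racket/port)               ; provides port->string

    ;; Fetch the page at SOME-URL and read the whole response body as a string.
    (define html-as-string
      (url:call/input-url
       (url:string->url SOME-URL)
       url:get-pure-port
       (lambda (in-port)
         (port->string in-port))))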

</code></pre>

However, I'm stuck at the "selection" step. Normally, one would use CSS
selectors (jQuery style) or XPath, but I can't find such a library in
Racket's ecosystem. I tried
`(se-path*/list SELECTOR (string->xexpr html-as-string))`, but this is
not what I was looking for: I can't "select" tags (i.e. retrieve the
matched tags themselves) or even put conditions on `SELECTOR`.
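
To illustrate the limitation, here is a minimal sketch that uses a small
hand-written x-expression instead of a fetched page (`some-page` is made
up for this example; `se-path*/list` and `se-path*` come from `xml/path`):

<pre><code>

    (require xml/path)

    ;; Hypothetical stand-in for the parsed page.
    (define some-page
      '(html (body (p ([class "awesome"]) "Hey") (p "Bar"))))

    ;; Returns the contents of every `p`, not the `p` elements themselves.
    (se-path*/list '(p) some-page)     ; => '("Hey" "Bar")

    ;; A trailing keyword reads an attribute value, but I see no way to use
    ;; it as a filter such as "only `p` elements whose class is awesome".
    (se-path* '(p #:class) some-page)  ; => "awesome"

</code></pre>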

So I'm looking for a library that can parse a given string as HTML
and let me "select" elements out of the resulting document, for example
by using CSS selector syntax. Any syntax would do as long as I can
actually select what I want :-)

I'd appreciate any help/hint.

-- 
Bahman Movaqar

http://BahmanM.com - https://twitter.com/bahman__m
https://github.com/bahmanm - https://gist.github.com/bahmanm
PGP Key ID: 0x6AB5BD68 (keyserver2.pgp.com)

