TL;DR: Are there any HTML parsing libraries for Racket? Do they support "selecting" elements based on a set of conditions (e.g. CSS selectors or XPath selectors)?
Suppose I need to find a set of HTML elements that satisfy certain conditions in the page at a given URL (i.e. scraping). I can fetch the contents easily with the following:

<pre><code>;; Assumes (require (prefix-in url: net/url)), as the url: prefix suggests;
;; port->string comes from racket/port (already included in #lang racket).
(define html-as-string
  (url:call/input-url (url:string->url SOME-URL)
                      url:get-pure-port
                      port->string))
</code></pre>

However, I'm stuck at the "selection" step. Normally one would use CSS selectors (i.e. jQuery style) or XPath, but I can't find such a library in Racket's ecosystem.

I tried `(se-path*/list SELECTOR (string->xexpr html-as-string))`, but this is not what I was looking for: I can't "select" tags (i.e. retrieve the matched tags themselves) or put conditions on `SELECTOR`.

So I'm looking for a library that can parse a given string as HTML and "select" elements out of the resulting document, for example by using CSS selector syntax. Any syntax would do as long as I can actually select what I want :-)

I'd appreciate any help or hints.

--
Bahman Movaqar

http://BahmanM.com - https://twitter.com/bahman__m
https://github.com/bahmanm - https://gist.github.com/bahmanm
PGP Key ID: 0x6AB5BD68 (keyserver2.pgp.com)