Thank you, everyone.

On Wednesday, February 2, 2022 at 3:22:53 PM UTC-5 

> Assume I've been cursed to scrape HTML. If I convert the pages to Hickory 
> I end up with a big mass of data which, sadly, lacks many "class" or "id"s 
> that would let me easily pick out the data I need. However, for the most 
> part, the only thing I really need off this page is the CVEs, which look 
> like this:
> CVE-2021-40539
> I'm thinking I might write regex against the plain text of the page, but 
> I'm also curious, is it common to take something like Hiccup or Hickory or 
> a zipper and run regex through it? If yes, how is that done? 
> A small part of the data looks like this:
>                 :content
>                 [{:type :element,
>                   :attrs
>                   {:class "tip-intro", :style "font-size: 15px;"},
>                   :tag :p,
>                   :content
>                   [{:type :element,
>                     :attrs nil,
>                     :tag :em,
>                     :content
>                     ["This Joint Cybersecurity Advisory uses the MITRE 
> Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK®) framework, 
> Version 8. See the "
>                      {:type :element,
>                       :attrs
>                       {:href
>                        "
>                       :tag :a,
>                       :content ["ATT&CK for Enterprise"]}
>                      " for  referenced threat actor tactics and for 
> techniques."]}]}
>                  "\n\n"
>                  {:type :element,
>                   :attrs nil,
>                   :tag :p,
>                   :content
>                   ["This joint advisory is the result of analytic efforts 
> between the Federal Bureau of Investigation (FBI), United States Coast 
> Guard Cyber Command (CGCYBER), and the Cybersecurity and Infrastructure 
> Security Agency (CISA) to highlight the cyber threat associated with active 
> exploitation of a newly identified vulnerability (CVE-2021-40539) in 
> ManageEngine ADSelfService Plus—a self-service password management and 
> single sign-on solution."]}
>                  "\n\n"
>                  {:type :element,
>                   :attrs nil,
>                   :tag :p,
>                   :content
>                   ["CVE-2021-40539, rated critical by the Common 
> Vulnerability Scoring System (CVSS), is an authentication bypass 
> vulnerability affecting representational state transfer (REST) application 
> programming interface (API) URLs that could enable remote code execution. 
> The FBI, CISA, and CGCYBER assess that advanced persistent threat (APT) 
> cyber actors are likely among those exploiting the vulnerability. The 
> exploitation of ManageEngine ADSelfService Plus poses a serious risk to 
> critical infrastructure companies, U.S.-cleared defense contractors, 
> academic institutions, and other entities that use the software. Successful 
> exploitation of the vulnerability allows an attacker to place webshells, 
> which enable the adversary to conduct post-exploitation activities, such as 
> compromising administrator credentials, conducting lateral movement, and 
> exfiltrating registry hives and Active Directory files."]}
>                  "\n\n"
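
On the question above about running a regex through Hickory data: one possible approach, sketched here under some assumptions, is to walk the tree with tree-seq, keep only the string leaves, and apply re-seq to each one. The names cve-pattern, text-nodes, cves and sample are made up for illustration, and sample is a trimmed-down stand-in for the data quoted above. The pattern assumes the usual CVE-YYYY-NNNN form with four or more digits in the last part.

```clojure
;; Hypothetical helper names; only clojure.core is used.
(def cve-pattern #"CVE-\d{4}-\d{4,}")

(defn text-nodes
  "Returns every string leaf in a Hickory-style tree of maps with :content."
  [tree]
  (->> (tree-seq map? :content tree)   ; maps are branches, :content holds children
       (filter string?)))              ; leaves that are plain text

(defn cves
  "Returns the distinct CVE ids found in the tree's text, in order of first appearance."
  [tree]
  (->> (text-nodes tree)
       (mapcat #(re-seq cve-pattern %))
       distinct))

;; Trimmed-down stand-in for the quoted Hickory data (an assumption, not the full page).
(def sample
  {:type :element, :attrs nil, :tag :div,
   :content
   [{:type :element, :attrs nil, :tag :p,
     :content ["a newly identified vulnerability (CVE-2021-40539) in ManageEngine ADSelfService Plus"]}
    "\n\n"
    {:type :element, :attrs nil, :tag :p,
     :content ["CVE-2021-40539, rated critical by the Common Vulnerability Scoring System (CVSS)"]}]})

(cves sample)
;; => ("CVE-2021-40539")
```

One caveat with matching per text node: an id that the page splits across elements (for example, partly inside a link) would be missed. If that turns out to matter, joining the strings from text-nodes into one string first, or running the regex against the plain text of the page as originally suggested, may be the simpler route.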

You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
For more options, visit this group at