On Friday, July 22, 2016 at 4:28:07 PM UTC+2, Scott Wiersdorf wrote:
>
> You can use map() to do that:
>
> $dom->find('div')->map(sub { state $i = 0; say $i++ . " $_" });
>

Right, that would give me the proper sequence for all <div>s. And then I would have another sequence for all <h1>s, another for all <td>s, another for all <p>s, and so on. What I need is one sequence that gives me the right order of all the tags I am looking at.

Cheers,
Ekki

>
> Scott
>
> On Friday, July 22, 2016 at 1:44:45 AM UTC-6, Ekki Plicht wrote:
>>
>> I use Mojo::DOM for various web scraping and analysis, very easy, very
>> fast, nice.
>>
>> Usually I am interested in only a few tags, not the entire DOM. So I use
>> ->find() to select the interesting nodes, check some facts on the found
>> nodes and store the results in a database for later viewing.
>>
>> For this later viewing I would love to retain the sequence in which the
>> nodes are in the source. Unfortunately all information about the sequence
>> of tags is lost when I use ->find().
>>
>> The parser I used to use before (HTML::HTML5::Parser) does provide a
>> line-number function for each element. This is enough for me to retain the
>> sequence of nodes, the absolute position is not important.
>>
>> Do you think it would be possible to extend Mojo::DOM to provide a line
>> number for each element? I understand that this might be insufficient for
>> the situation where many tags are on the same line, but that's too bad
>> then...
>>
>> TIA,
>> Ekki

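One possible way to get a single sequence across several tags is to pass a selector group to ->find(). This is only a minimal sketch: it assumes that Mojo::DOM returns the matches for a group such as 'div, h1, td, p' in document order, which is worth verifying against the installed version. The sample markup and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Hypothetical sample markup, only for illustration.
my $html = <<'HTML';
<div id="a"><h1>Title</h1><p>First</p></div>
<table><tr><td>Cell</td></tr></table>
<p>Last</p>
HTML

my $dom = Mojo::DOM->new($html);

# One find() call with a selector group instead of one call per tag.
# Assumption: the resulting collection follows the order of the source.
my @tags = $dom->find('div, h1, td, p')->map('tag')->each;

# Number the nodes; the index could be stored in the database with each result.
my $i = 0;
say $i++, " $_" for @tags;
```

The running index then plays the role the line number played with the previous parser: it records relative order only, not an absolute position in the source.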