On Friday, July 22, 2016 at 4:28:07 PM UTC+2, Scott Wiersdorf wrote:
>
> You can use map() to do that:
>
> $dom->find('div')->map(sub { state $i = 0; say $i++ . " $_" });
>

Right, that would give me the proper sequence for all <div>s. 
And then I would have another sequence for all <h1>s, and another for all 
<td>s and another for all <p>s, and so on.

What I need is one sequence which gives me the right order of all tags I am 
looking at.
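
One thing that might already give me this, sketched here under the assumption that ->find() returns the matches of a comma-separated selector group in source order (I have not checked this against every Mojolicious version). The sample markup and the names @rows and $i are made up for illustration:

```perl
use Mojo::Base -strict;
use Mojo::DOM;

# Made-up sample document containing the tags I care about
my $dom = Mojo::DOM->new(
  '<h1>Title</h1><div><p>One</p></div>'
  . '<table><tr><td>Cell</td></tr></table><p>Two</p>'
);

# One selector group for all interesting tags, so there is a single
# collection and a single running counter instead of one per tag
my $i    = 0;
my @rows = $dom->find('div, h1, td, p')
  ->map(sub { [$i++, $_->tag, $_->all_text] })
  ->each;

# Each row is [sequence number, tag name, text]; the sequence number
# is what I would store in the database
printf "%d %s %s\n", @$_ for @rows;
```

If that assumption holds, the counter doubles as the ordering key and no line numbers are needed.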

Cheers,
Ekki

>
>
> Scott
>
> On Friday, July 22, 2016 at 1:44:45 AM UTC-6, Ekki Plicht wrote:
>>
>> I use Mojo::DOM for various web scraping and analysis, very easy, very 
>> fast, nice.
>>
>> Usually I am interested in only a few tags, not the entire dom. So I use 
>> ->find() to select the interesting nodes, check some facts on the found 
>> nodes and store the results in a database for later viewing.
>>
>> For this later viewing I would love to retain the sequence in which the 
>> nodes are in the source. Unfortunately all information about the sequence 
>> of tags is lost when I use ->find(). 
>>
>> The parser I used before (HTML::HTML5::Parser) provides a line-number 
>> function for each element. This is enough for me to retain the sequence 
>> of nodes; the absolute position is not important.
>>
>> Do you think it would be possible to extend Mojo::DOM to provide a line 
>> number for each element? I understand that this might be insufficient 
>> when many tags are on the same line, but I could live with that.
>>
>> TIA,
>> Ekki
>>
