[Mojolicious] Re: Scraper , simple script

mimosinnet Fri, 20 Jan 2017 13:03:42 -0800

Oops! You're right, thanks! This would be the corrected script:

#!/usr/bin/env perl
use Modern::Perl;
use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new;

# ->res->dom already returns a Mojo::DOM object, so no re-parsing is needed
my $dom = $ua->get('http://mojolicious.org/')->res->dom;

# Print every <a> element that has an href attribute
foreach my $a_href ( $dom->find('a[href]')->each ) {
  say $a_href;
}
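
Coming back to the original question (saving both the URL and the title of each link), a minimal sketch could look like the one below. It is only an illustration under assumptions: the HTML snippet is made up, the '.tdcont td a' selector is taken from the question, and the "url;title;" layout is the one shown there. It parses an inline string instead of fetching a page, so it runs without network access.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Made-up HTML standing in for the real page (an assumption, not the actual site)
my $html = <<'HTML';
<table class="tdcont">
  <tr><td><a href="/one">First title</a></td></tr>
  <tr><td><a href="/two">Second title</a></td></tr>
</table>
HTML

my $dom = Mojo::DOM->new($html);

# Build one [url, title] pair per link and collect them in an array reference
my $rows = $dom->find('.tdcont td a')
               ->map(sub { [ $_->attr('href'), $_->text ] })
               ->to_array;

# Print each pair in the "url;title;" layout from the question
say "$_->[0];$_->[1];" for @$rows;
```

With a real page, $dom would come from $ua->get(...)->res->dom as in the script above. For a proper CSV file with quoting, a dedicated CSV module would probably be safer than joining fields by hand.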

Cheers!

On Friday, 20 January 2017 16:23:04 UTC+1, Joel Berger wrote:
>
> You don't need to do $dom = Mojo::DOM->new($page); at all. ->res->dom 
> returns an instance of Mojo::DOM, so your $page was already a 
> Mojo::DOM. What you did next was re-serialize the DOM object back to 
> an HTML string and then reparse it! Of course it works, but it is, let's 
> say, not very efficient :-P
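
Joel's point can be checked offline with a small sketch. It is an illustration only: the response is built by hand with a made-up body rather than fetched, so the HTML and the '/docs' link are assumptions.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;
use Mojo::Message::Response;

# Build a response by hand instead of fetching one (made-up body)
my $res = Mojo::Message::Response->new;
$res->body('<p><a href="/docs">Docs</a></p>');

# ->dom already hands back a Mojo::DOM object
my $dom = $res->dom;
say ref $dom;

# Wrapping it again turns $dom into a string and parses that string anew
my $again = Mojo::DOM->new($dom);

# Both objects answer the same queries, so the second parse adds nothing
say $dom->at('a')->attr('href');
say $again->at('a')->attr('href');
```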
>
> On Thursday, January 19, 2017 at 4:18:34 PM UTC-6, mimosinnet wrote:
>>
>> Your question inspired this code, based on Joel Berger's post 
>> <http://blogs.perl.org/users/joel_berger/2012/05/using-mojodom.html>, 
>> which finds the *a href* tags in a page (I am also a beginner ;-) ). 
>>
>> #!/usr/bin/env perl
>> use Modern::Perl;
>> use Mojo::UserAgent;
>> use Mojo::DOM;
>>
>> my $ua = Mojo::UserAgent->new;
>> my $page = $ua->get('http://mojolicious.org/')->res->dom;
>> my $dom = Mojo::DOM->new($page);
>>
>> foreach my $a_href ( $dom->find('a[href]')->each ) {
>>  say $a_href;
>> }
>>
>> I am sorry it does not answer your question, but I hope it helps. 
>>
>> Cheers! 
>>
>> On Thursday, 19 January 2017 9:58:42 UTC+1, Tin Woodman wrote:
>>>
>>> Hi, I'm a beginner in Mojo. I have some experience with PHP, so sorry if I 
>>> use it to explain what I mean.
>>> I'm developing a simple scraper script for myself.
>>> Part 1: Scrape root elements
>>> I have a page with URLs and titles in an HTML table.
>>> my $res=
>>>   Mojo::UserAgent->new->get('http://example.com')->res->dom;
>>>
>>> I need to grab all the elements from this page, 
>>> but on the internet I found only this example, which extracts only the 
>>> text of each element:
>>> my $texts =
>>>   $res->find('.tdcont td a')->map(sub { $_->text });
>>>
>>> I need to create an array or something similar, maybe a CSV file. 
>>> How can I write the statement when I need to save two or more pieces of 
>>> data from one element? For example:
>>> url;title;
>>> url2;title2;
>>> or in PHP (sorry):
>>> array(array('url','title'),array('url2','title2'))
>>>
>>> Part 2: Scrape child elements
>>> Once that array (or other data structure) exists, I need to run another 
>>> scrape in a loop.
>>> For example (PHP):
>>> foreach($data as $item) {
>>>   $url = $item[0];
>>>   $title = $item[1];
>>>   // here I need to parse the elements
>>>   // go to the URL
>>>   doParseChild();
>>>   // here I need an example of how to check whether an element exists on the page
>>>   if (pagination exists) {
>>>     foreach ($pages as $page) {
>>>       doParseChild();
>>>     }
>>>   }
>>> }
>>>
>>> When the first iteration of the loop ends, go on to the second iteration, 
>>> and so on.
>>>
>>> Please help me, at least with a general understanding. Sorry for my bad 
>>> English and the PHP.
>>>
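
For Part 2, the loop and the "does this element exist" check could be sketched roughly as follows. Everything here is an assumption made for illustration: the pages are made-up strings in a hash instead of real HTTP responses, parse_child is a hypothetical helper, and the '.item' and '.pagination' selectors are invented. The part that carries over to real code is that at() returns undef when nothing matches, so it can serve as the existence test.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Made-up child pages keyed by URL, standing in for real HTTP responses
my %pages = (
  '/one'     => '<div class="item">A</div>'
              . '<div class="pagination"><a href="/one?p=2">2</a></div>',
  '/one?p=2' => '<div class="item">B</div>',
  '/two'     => '<div class="item">C</div>',
);

# Hypothetical helper: parse one page and return its DOM plus the item texts
sub parse_child {
  my ($url) = @_;
  my $dom   = Mojo::DOM->new($pages{$url});
  my @items = $dom->find('.item')->map('text')->each;
  return ($dom, @items);
}

# [url, title] pairs as they might come out of Part 1
my @data = (['/one', 'First title'], ['/two', 'Second title']);
my %found;

for my $item (@data) {
  my ($url, $title) = @$item;
  my ($dom, @items) = parse_child($url);

  # at() returns undef when nothing matches, so it doubles as an existence check
  if (my $pagination = $dom->at('.pagination')) {
    for my $link ($pagination->find('a[href]')->each) {
      my (undef, @more) = parse_child($link->attr('href'));
      push @items, @more;
    }
  }

  $found{$title} = \@items;
  say "$title: @items";
}
```

In a real scraper, parse_child would fetch the page with something like $ua->get($url)->res->dom, and relative links would likely need to be resolved against the page URL first.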
>>
