Re: Maintain relationships in scrapy

Rajeswari Rajkumar Wed, 26 Apr 2017 06:57:59 -0700

Hi Palash,

Thanks for the inputs. I will try this and reach out to you if I have 
further queries.




On Tuesday, April 18, 2017 at 3:49:23 PM UTC+5:30, Palash Kulshrestha wrote:
>
> Hi Rajeswari
>
> One way is to pass the details you want to carry forward from the shows 
> page to the seasons page, and from the seasons page to the episodes page.
>
> For example, you won't be crawling every URL on the shows page, only URLs 
> of a certain type. For these you will yield new requests for the seasons 
> pages from the shows page. If rows holds all the season URLs, you can do 
> something like:
>
>         for row in rows:
>             # row[0] is the season URL, row[2] is the show's name
>             yield Request(url=row[0], meta={'show': row[2]}, callback=self.parse)
>
>     def parse(self, response):
>         print(response.meta['show'])  # prints the show's name
>
> This way you pass the name of the show along with the request for each 
> season URL. You can do the same when crawling the episodes from the 
> seasons page.
>
> I hope this helps.
>             
> in this way all the seasons
> On Tuesday, April 18, 2017 at 11:32:43 AM UTC+5:30, Rajeswari Rajkumar 
> wrote:
>>
>> Is there a way to maintain the relationship between pages? For example, we 
>> need to crawl a page of shows, then seasons and episodes, but need to 
>> keep track of which show, season and episode belong together at every stage. 
>>
>> Thanks, 
>> Rajeshwari
>>
>

