Hi Rajeswari

One way is to pass the details you want to carry over from the shows 
page to the seasons page, and from the seasons page to the episodes page.

For example, you won't be crawling every URL on the shows page, only 
URLs of a certain type. For those, you yield new requests for the 
seasons pages from the shows page. If rows holds all the season URLs, 
you can do something like:

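    from scrapy import Request

        # inside the callback that handles the shows page
        for row in rows:
            # row[0] is the season URL, row[2] is the show's name
            yield Request(url=row[0], meta={'show': row[2]}, callback=self.parse)

    def parse(self, response):
        print(response.meta['show'])  # prints the show's name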

This way you pass the name of the show along with the request for each 
season URL. You can do the same when crawling episodes from the 
seasons page.

I hope this helps.
            
in this way all the seasons
On Tuesday, April 18, 2017 at 11:32:43 AM UTC+5:30, Rajeswari Rajkumar 
wrote:
>
> Is there way the relationship between pages can be maintained. For eg: we 
> need to crawl page having shows then Seasons and episodes, but need to 
> maintain which show and season and episode relation in all the stages. 
>
> Thanks, 
> Rajeshwari
>


Reply via email to