Hi Palash,
Thanks for the inputs. Let me try this, and I will reach out to you if I have
further queries.
On Tuesday, April 18, 2017 at 3:49:23 PM UTC+5:30, Palash Kulshrestha wrote:
>
> Hi Rajeswari
>
> One way is to pass the details you want to carry over from the shows
> page to the seasons page, and from the seasons page to the episodes page.
>
> For example, you won't be crawling all URLs on the shows page, only URLs
> of a certain type. For these you will be yielding new requests for the
> seasons pages from the shows page.
> If rows holds all the URLs for the seasons, then
> you can do something like
>
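> from scrapy import Request
>
> # inside the callback that parses the shows page
> for row in rows:
>     # row[0] is the season URL, row[2] the show's name
>     yield Request(url=row[0], meta={'show': row[2]}, callback=self.parse)
>
> def parse(self, response):
>     print(response.meta['show'])  # prints the show's name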
>
> In this way you pass the name of the show along with the request
> for each season URL. You can do the same when crawling episodes from the
> seasons page.
>
> I hope this helps.
>
> in this way all the seasons
> On Tuesday, April 18, 2017 at 11:32:43 AM UTC+5:30, Rajeswari Rajkumar
> wrote:
>>
>> Is there a way to maintain the relationship between pages? For example, we
>> need to crawl a page listing shows, then seasons and episodes, but need to
>> keep track of which show, season, and episode belong together at every stage.
>>
>> Thanks,
>> Rajeshwari
>>
>
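For reference, here is a minimal sketch of the approach Palash describes above,
carried through all three levels (shows, seasons, episodes). It is only an
illustration under assumptions: Request and Response below are small stand-ins
so the sketch runs without Scrapy installed, and the PAGES data, the URLs, the
callback names and the crawl() driver are all hypothetical. In a real spider
you would import Request from scrapy, make the callbacks spider methods, and
let Scrapy do the scheduling.

```python
from dataclasses import dataclass, field
from typing import Callable


# Stand-ins (assumption) mirroring the url/meta/callback shape of a Scrapy
# request and the url/meta attributes of the response passed to a callback.
@dataclass
class Request:
    url: str
    callback: Callable
    meta: dict = field(default_factory=dict)


@dataclass
class Response:
    url: str
    meta: dict


# Hypothetical site content, keyed by URL.
PAGES = {
    # shows page: rows of (season_url, show_name, season_name)
    "/shows": [("/show-a/s1", "Show A", "Season 1"),
               ("/show-a/s2", "Show A", "Season 2")],
    # season pages: lists of episode URLs
    "/show-a/s1": ["/show-a/s1/e1"],
    "/show-a/s2": ["/show-a/s2/e1"],
    # episode pages: the episode title
    "/show-a/s1/e1": "Pilot",
    "/show-a/s2/e1": "Return",
}


def parse_shows(response):
    # Attach show and season details to each season request.
    for season_url, show, season in PAGES[response.url]:
        yield Request(url=season_url,
                      meta={"show": show, "season": season},
                      callback=parse_season)


def parse_season(response):
    # Forward the details received from the shows page to each episode request.
    for episode_url in PAGES[response.url]:
        yield Request(url=episode_url,
                      meta={"show": response.meta["show"],
                            "season": response.meta["season"]},
                      callback=parse_episode)


def parse_episode(response):
    # The final item carries the full show / season / episode relation.
    yield {"show": response.meta["show"],
           "season": response.meta["season"],
           "episode": PAGES[response.url]}


def crawl(start_url, callback):
    """Tiny driver: follow yielded requests in order and collect items."""
    pending = [Request(url=start_url, callback=callback)]
    items = []
    while pending:
        request = pending.pop(0)
        response = Response(url=request.url, meta=request.meta)
        for result in request.callback(response):
            if isinstance(result, Request):
                pending.append(result)
            else:
                items.append(result)
    return items


items = crawl("/shows", parse_shows)
for item in items:
    print(item)
```

Each callback only copies the keys it needs from response.meta instead of
passing the whole dict along, since Scrapy also stores its own bookkeeping
keys in meta.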