In Windows Task Scheduler, did you specify the optional "Start in" folder 
setting?

Maybe set it to C:\Users\xyz\Google Drive\cineplex\
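
If you'd rather not rely on the "Start in" setting, you could also make 
start.py independent of the working directory. Your PARENT_DIR is the 
relative path './data/', and a scheduled task starts in 
C:\Windows\system32 by default (which is exactly what your prompt shows), 
so './data/' likely resolves to the wrong place and the crawl finds 
nothing to do. A minimal sketch of the idea (SCRIPT_DIR is just an 
illustrative name):

import os

# Resolve the data directory against this file's location rather than
# the process working directory, so the task's start folder no longer
# matters.
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.join(SCRIPT_DIR, 'data')

Everything downstream (utils.create_dir_for_today(PARENT_DIR) and so on) 
then receives an absolute path however the script is launched.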


On Monday, 8 May 2017 at 14:38:32 UTC+2, alvin. zing wrote:
>
> I am trying to run the code below from a Windows Scheduled Task.
> The weird thing is that when I run it by hand in a command prompt, it 
> works.
> However, when the Scheduled Task runs it, it only prints the following 
> and then the program ends.
> I am struggling with this; could the community please help me?
>
> Thank you in advance.
>
>
>
> [The Command Prompt output when run from the Scheduled Task]
> C:\Windows\system32>python "C:\Users\xyz\Google Drive\cineplex\start.py" seatings
> 2017-05-06 21:47:03 [scrapy.utils.log] INFO: Scrapy 1.3.3 started (bot: scrapybot)
> 2017-05-06 21:47:03 [scrapy.utils.log] INFO: Overridden settings: {}
> C:\Windows\system32>pause
> Press any key to continue . . .
>
>
>
> [The Python script I am running]
> from cineplex import utils
> from cineplex.spiders import showtimes_spider as st
> from cineplex.spiders import seatings_spider as seat
> import scrapy
> from scrapy.crawler import CrawlerProcess
> from scrapy.utils.log import configure_logging
> from scrapy.utils.project import get_project_settings
> import sys
> import time
> from twisted.internet import reactor, defer
>
> # Constant for the parent directory.
> # Subfolders will contain all movie times and seatings for the day.
> PARENT_DIR = r'./data/'
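> # NOTE: this is a relative path, resolved against the process's current
> # working directory at run time, not against this file's location.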
>
>
> def crawl_all_seatings():
>     '''Crawls all seatings per cinema.'''
>     # Create a CrawlerProcess instance to run multiple spiders simultaneously.
>     # Read more here: https://doc.scrapy.org/en/latest/topics/practices.html
>     process = CrawlerProcess()
>
>     # Check folder for today.
>     directory_for_today = utils.create_dir_for_today(PARENT_DIR)
>
>     # Get all showtimes files' filepaths.
>     filepaths = utils.get_all_showtimes_filepaths(directory_for_today)
>
>     # Every filepath points to a file with all the movie session ids.
>     for filepath in filepaths:
>         sessions = utils.get_all_sessions(filepath)
>         # Only start crawling if there are sessions.
>         if len(sessions) > 0:
>             # Add spiders to the crawler process.
>             for session_id in sessions:
>                 process.crawl(seat.SeatingsSpider, session_id=session_id,
>                               output_dir=directory_for_today)
>
>     # Start crawling.
>     process.start()
>
>
> def crawl_all_showtimes():
>     '''Crawls all cinemas' movies' showtimes.'''
>     # Create a CrawlerProcess instance to run spiders simultaneously.
>     # Read more here: https://doc.scrapy.org/en/latest/topics/practices.html
>     process = CrawlerProcess()
>
>     # Check folder for today.
>     directory_for_today = utils.create_dir_for_today(PARENT_DIR)
>
>     # Get all cinema ids and names first.
>     cinema_dict = utils.get_all_cinemas()
>
>     # Iterate through all cinemas to get show timings.
>     # Add spiders to the crawler process.
>     for cinema_id, cinema_name in cinema_dict.iteritems():
>         process.crawl(st.ShowTimesSpider, cinema_id=cinema_id,
>                       cinema_name=cinema_name,
>                       output_dir=directory_for_today)
>
>     # Start crawling.
>     process.start()
>
>
> def main(argv):
>     '''Main program: runs the spiders.'''
>     # Turns on Scrapy logging.
>     # configure_logging()
>
>     crawl_type = argv[1]
>     if crawl_type == 'showtimes':
>         # Collect all showtimes.
>         crawl_all_showtimes()
>
>     elif crawl_type == 'seatings':
>         # Collect all seatings.
>         crawl_all_seatings()
>
>     else:
>         print ('usage: pass "showtimes" to crawl show timings '
>                'or "seatings" to crawl seat occupancy')
>
>
>
> if __name__ == "__main__":
>     # main(sys.argv)
>     main(['', 'seatings'])
>
>     # Exit the program.
>     sys.exit()
>
