For your first problem: you've deployed the project to scrapyd, but deploying does not run anything by itself. You still need to schedule a spider run through the schedule.json endpoint, with something like:

curl http://localhost:6800/schedule.json -d project=sole24ore -d spider=yourspidername

For your second problem, your settings.py is misconfigured. It is an ordinary Python module, so the values have to be quoted strings, and --set is a command-line option of the scrapy command, not something that belongs in the file (that is what produces the SyntaxError when you deploy). Your feed settings should look like:

FEED_URI = 'file:///home/marco/crawlscrape/sole24ore/output.json'
FEED_FORMAT = 'json'

Note the three slashes in the URI: file:// followed by the absolute path /home/marco/... With only two slashes, "home" is read as a host name rather than as part of the path.

Hope that helps.

On Tuesday, January 20, 2015 at 4:23:04 AM UTC-8, Marco Ippolito wrote:
>
> Hi,
> I've got 2 situations to solve.
>
> Seems that everything is ok:
>
> (SCREEN)marco@pc:~/crawlscrape/sole24ore$ scrapyd-deploy sole24ore -p sole24ore
> Packing version 1421755479
> Deploying to project "sole24ore" in http://localhost:6800/addversion.json
> Server response (200):
> {"status": "ok", "project": "sole24ore", "version": "1421755479", "spiders": 1}
>
>
> marco@pc:/var/lib/scrapyd/dbs$ ls -lah
> totale 12K
> drwxr-xr-x 2 scrapy nogroup 4,0K gen 20 13:04 .
> drwxr-xr-x 5 scrapy nogroup 4,0K gen 20 06:55 ..
> -rw-r--r-- 1 root root 2,0K gen 20 13:04 sole24ore.db
>
>
> marco@pc:/var/lib/scrapyd/eggs/sole24ore$ ls -lah
> totale 16K
> drwxr-xr-x 2 scrapy nogroup 4,0K gen 20 13:04 .
> drwxr-xr-x 3 scrapy nogroup 4,0K gen 20 12:47 ..
> -rw-r--r-- 1 scrapy nogroup 5,5K gen 20 13:04 1421755479.egg
>
>
> , but nothing is executed
>
> marco@pc:/var/lib/scrapyd/items/sole24ore/sole$ ls -a
> . ..
>
> [detached from 2515.pts-4.pc]
> marco@pc:~/crawlscrape/sole24ore$ curl http://localhost:6800/listjobs.json?project=sole24ore
> {"status": "ok", "running": [], "finished": [], "pending": []}
>
>
>
> The second aspect regards how to save the output into a json file.
> What is the correct form to put into settings.py?
>
> # Scrapy settings for sole24ore project
> #
> # For simplicity, this file contains only the most important settings by
> # default.
> # All the other settings are documented here:
> #
> # http://doc.scrapy.org/en/latest/topics/settings.html
> #
>
> BOT_NAME = 'sole24ore'
>
> SPIDER_MODULES = ['sole24ore.spiders']
> NEWSPIDER_MODULE = 'sole24ore.spiders'
>
> FEED_URI=file://home/marco/crawlscrape/sole24ore/output.json --set
> FEED_FORMAT=json
>
>
> (SCREEN)marco@pc:~/crawlscrape/sole24ore$ scrapyd-deploy sole24ore -p sole24ore
> Packing version 1421756389
> Deploying to project "sole24ore" in http://localhost:6800/addversion.json
> Server response (200):
> {"status": "error", "message": "SyntaxError: invalid syntax"}
>
>
> # Crawl responsibly by identifying yourself (and your website) on the user-agent
> #USER_AGENT = 'sole24ore (+http://www.yourdomain.com)'
>
> Looking forward to your kind help.
> Kind regards.
> Marco
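
P.S. If you would rather schedule the run from Python than from curl, here is a minimal sketch of the same schedule.json request using only the standard library. The function names (build_schedule_request, schedule_spider) are my own and not part of scrapyd, yourspidername is still a placeholder for your actual spider name, and I'm assuming scrapyd is listening on localhost:6800 as in your output.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: scrapyd runs locally on its default port, as in the thread.
SCRAPYD_URL = "http://localhost:6800"


def build_schedule_request(project, spider, base_url=SCRAPYD_URL):
    """Build the form-encoded POST that `curl -d project=... -d spider=...` sends."""
    body = urlencode({"project": project, "spider": spider}).encode("ascii")
    return Request(base_url + "/schedule.json", data=body, method="POST")


def schedule_spider(project, spider, base_url=SCRAPYD_URL):
    """Send the request and return scrapyd's JSON reply as a dict."""
    request = build_schedule_request(project, spider, base_url)
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (needs a running scrapyd, so it is left commented out here):
#   reply = schedule_spider("sole24ore", "yourspidername")
#   print(reply)
```

If the call goes through, the reply should carry a status and a job id, and the job should then show up under pending or running in listjobs.json instead of the empty lists you are seeing now.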
