Guys, I tried brute force to get the mess under control.
Here is what I did:
$ sudo rm -rf bomnegocioSpider.py
$ scrapy genspider -d crawl
Then I used the scaffold spider as a starting point.
The results:
bomnegocioSpider.py:
import scrapy
from scrapy.contrib.linkextractors import LinkExtractor
from scrapy.contrib.spiders import CrawlSpider, Rule
from bomnegocio.items import BomnegocioItem

class bomnegocio(CrawlSpider):
    name = 'bomnegocio'
    allowed_domains = ['sp.bomnegocio.com/regiao-de-bauru-e-marilia/eletrodomesticos/fogao-industrial-itajobi-4-bocas-c-forno-54183713']
    start_urls = ['http://sp.bomnegocio.com/regiao-de-bauru-e-marilia/eletrodomesticos/fogao-industrial-itajobi-4-bocas-c-forno-54183713']
    rules = (
        Rule(LinkExtractor(allow=r'Items/'), callback='parse_item', follow=True),
    )

    def parse_item(self, response):
        i = BomnegocioItem()
        #i['domain_id'] = response.xpath('//input[@id="sid"]/@value').extract()
        i['tile'] = response.xpath('//div[@id="ad_title"]').extract()
        #i['description'] = response.xpath('//div[@id="description"]').extract()
        return i
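Two things in the spider above can be checked outside Scrapy. This is only a sketch using the Python 3 standard library, not a confirmed diagnosis: `allowed_domains` normally holds bare hostnames rather than full URL paths, and a `CrawlSpider` only calls the rule callback for extracted links whose URL matches the `allow` pattern. The helper names `hostname_of` and `rule_matches` are made up for illustration, and the start URL stands in for the links found on the page, which is an assumption.

```python
import re
from urllib.parse import urlparse

# Start URL copied from the spider above.
START_URL = (
    "http://sp.bomnegocio.com/regiao-de-bauru-e-marilia/eletrodomesticos/"
    "fogao-industrial-itajobi-4-bocas-c-forno-54183713"
)


def hostname_of(url):
    """Return only the host part of a URL (hypothetical helper)."""
    return urlparse(url).hostname


def rule_matches(pattern, url):
    """Return True if the regex pattern is found anywhere in the URL
    (hypothetical helper; assumes the allow pattern is a regex search)."""
    return re.search(pattern, url) is not None


print(hostname_of(START_URL))             # sp.bomnegocio.com
print(rule_matches(r"Items/", START_URL))  # False
```

If the ad links on the site never contain `Items/`, `parse_item` would never run, which would be consistent with a single request and an empty CSV.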
items.py :
# -*- coding: utf-8 -*-
# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html
import scrapy

class BomnegocioItem(scrapy.Item):
    title = scrapy.Field()
    pass
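Another thing worth checking is the field name: the spider assigns `i['tile']`, while the item declares `title`. As far as I know, `scrapy.Item` rejects keys that were not declared as fields with a `KeyError`. The sketch below imitates that check with a hypothetical stand-in class (`DeclaredFieldsItem` and `try_set` are not part of Scrapy), so it runs without Scrapy installed.

```python
class DeclaredFieldsItem(dict):
    """Hypothetical stand-in for scrapy.Item: only declared fields may be set."""

    fields = ("title",)

    def __setitem__(self, key, value):
        if key not in self.fields:
            raise KeyError(
                "%s does not support field: %s" % (type(self).__name__, key)
            )
        super().__setitem__(key, value)


def try_set(item, key, value):
    """Return True if the assignment is accepted, False if it is rejected."""
    try:
        item[key] = value
        return True
    except KeyError:
        return False


print(try_set(DeclaredFieldsItem(), "title", ["Fogao industrial"]))  # True
print(try_set(DeclaredFieldsItem(), "tile", ["Fogao industrial"]))   # False
```

Renaming the key in `parse_item` to `title` would make it agree with the field declared in items.py.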
$ scrapy crawl bomnegocio -o results.csv -t csv
The CSV stays empty.
$ nano crawl.log
2014-12-11 16:51:23-0200 [scrapy] INFO: Scrapy 0.24.4 started (bot: bomnegocio)
2014-12-11 16:51:23-0200 [scrapy] INFO: Optional features available: ssl, http11
2014-12-11 16:51:23-0200 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'bomnegocio.spiders', 'SPIDER_MODULES': ['bomnegocio.spiders'], 'LOG_FILE': 'crawl.lo$
2014-12-11 16:51:23-0200 [scrapy] INFO: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-12-11 16:51:23-0200 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHea$
2014-12-11 16:51:23-0200 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-12-11 16:51:23-0200 [scrapy] INFO: Enabled item pipelines:
2014-12-11 16:51:23-0200 [bomnegocio] INFO: Spider opened
2014-12-11 16:51:23-0200 [bomnegocio] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2014-12-11 16:51:23-0200 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2014-12-11 16:51:23-0200 [scrapy] DEBUG: Web service listening on 127.0.0.1:6080
2014-12-11 16:51:23-0200 [bomnegocio] DEBUG: Crawled (200) <GET http://sp.bomnegocio.com/regiao-de-presidente-prudente/industria-comercio-e-agro/fogao-industrial-54033$
2014-12-11 16:51:23-0200 [bomnegocio] INFO: Closing spider (finished)
2014-12-11 16:51:23-0200 [bomnegocio] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 297,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 8225,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2014, 12, 11, 18, 51, 23, 376075),
'log_count/DEBUG': 3,
'log_count/INFO': 7,
'response_received_count': 1,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'start_time': datetime.datetime(2014, 12, 11, 18, 51, 23, 257001)}
2014-12-11 16:51:23-0200 [bomnegocio] INFO: Spider closed (finished)
Any suggestions? =(
On Friday, December 12, 2014 at 14:12:09 UTC-2, Pedro Castro wrote:
>
> Hi, everybody.
>
> My question is the following: Scrapy exports an empty CSV.
>
> I tried to post my code here, but it came out garbled.
>
> My question on Stack Overflow:
>
> http://stackoverflow.com/questions/27447399/scrapy-export-empty-csv
>
>
> Thank you for your attention and I now look forward to hearing your views.
>
>