Hi. I'm a beginner with Scrapy and Python. I tried to crawl a set of URLs and
came across this error on Scrapinghub: 'Missing scheme in request url: %s' % self._url)
Below is the code.
import scrapy
from bs4 import BeautifulSoup, SoupStrainer
import urllib2
from scrapy.selector import Selector
from scrapy.http import HtmlResponse
import re
import pkgutil
from pkg_resources import resource_string

data = pkgutil.get_data("friday2", "resources/urllist.txt")


class FridaySpider(scrapy.Spider):
    name = 'fridayspider'
    start_urls = [url.strip() for url in data]

    def parse(self, response):
        soup = BeautifulSoup(response.text, 'lxml')
        url = response.url
        yield {
            "title": soup.title.string,
            "url": response.url,
        }
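
A possible cause, sketched under assumptions rather than confirmed: pkgutil.get_data() returns the file contents as one byte string, so "for url in data" would walk it character by character, and a one-character "URL" has no scheme such as http://. The sketch below splits the contents into lines instead. The sample data and the helper name urls_from_data are hypothetical and only stand in for the real urllist.txt.

```python
# Hypothetical stand-in for what pkgutil.get_data() returns: the whole file
# as a single byte string (these sample URLs are made up for illustration).
data = b"https://example.com/a\nhttps://example.org/b\n\n"


def urls_from_data(raw):
    """Split raw file contents into one clean URL per non-empty line."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    return [line.strip() for line in text.splitlines() if line.strip()]


# Iterating over the text directly yields single characters, not URLs.
per_char = [c for c in data.decode("utf-8")][:5]

# Splitting on line breaks yields one entry per URL in the file.
start_urls = urls_from_data(data)
```

If this is indeed the cause, the same splitting could be applied where start_urls is built in the spider, assuming urllist.txt holds one full URL (including http:// or https://) per line.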
Thanks in advance :) !!