Hello people,

As the title says, I have a problem with images. Here is my code:

pipelines.py

import scrapy
from scrapy.pipelines.images import ImagesPipeline


class MyImagePipeline(ImagesPipeline):

    # Browser-like headers sent with every image request
    headers = {
        'Host': 'cdn.autodoc.de',
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.5',
        'Connection': 'keep-alive',
        'Upgrade-Insecure-Requests': '1',
        'Pragma': 'no-cache',
        'Cache-Control': 'no-cache',
    }

    def get_media_requests(self, item, info):
        for image_url in item['image_urls']:
            # r = requests.get(image_url, stream=True)
            #
            # if r.ok:
            #     with open('/home/dimitris/stock/Dropbox/cargr/autoparts/images/%s.png' % str(uuid.uuid4()),
            #               'wb') as pic:
            #         for chunk in r:
            #             pic.write(chunk)

            yield scrapy.Request(image_url, headers=self.headers)

I intentionally left the requests code in there. I have tried the requests
library in a terminal, and the pictures download properly without even
changing the User-Agent.
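
For reference, a standalone check along those lines could look roughly like this. It is only a minimal sketch based on the commented-out code above: the function name download_image and the session parameter are made up for illustration and are not part of my project.

```python
import uuid
from pathlib import Path


def download_image(url, dest_dir, session=None):
    """Save the image at `url` into `dest_dir`; return the file path, or None on failure.

    `session` is a hypothetical hook: anything with a requests-style
    `get(url, stream=True)` method. It defaults to the requests library.
    """
    if session is None:
        import requests  # third-party; imported lazily so a stub can be passed instead
        session = requests

    r = session.get(url, stream=True)
    if not r.ok:
        return None

    # Random file name, as in the commented-out pipeline code
    path = Path(dest_dir) / ("%s.png" % uuid.uuid4())
    with open(path, "wb") as pic:
        # iter_content streams the body in chunks instead of loading it all at once
        for chunk in r.iter_content(chunk_size=8192):
            pic.write(chunk)
    return path
```

Calling download_image('http://cdn.autodoc.de/thumb?id=7079085&lng=en', '/tmp') would be the equivalent of the terminal test. Note that requests follows redirects by default, so a 301 would not show up as a failure in a check like this.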


Somewhere in my crawler class I have:

pic = response.xpath('//div[@class="image"]/span/img/@src').extract()
item['image_urls'] = pic



which returns:

 'image_urls': [u'http://cdn.autodoc.de/thumb?id=7079085&lng=en'],

In my items.py I have:

    image_urls = scrapy.Field()
    images = scrapy.Field()

settings.py

ITEM_PIPELINES = {
    'autoparts.pipelines.AutopartsPipeline': 700,
    'autoparts.pipelines.MyImagePipeline': 600,
}


In the terminal I just see this error:

2016-12-28 13:03:12 [scrapy.core.engine] DEBUG: Crawled (301) <GET https://cdn.autodoc.de/thumb?id=7079085&lng=en> (referer: None)
2016-12-28 13:03:12 [scrapy.pipelines.files] WARNING: File (code: 301): Error downloading file from <GET https://cdn.autodoc.de/thumb?id=7079085&lng=en> referred in <None




I have also tried replacing https with http; in the browser it returns the
same picture.

Any suggestions would be appreciated :)

Thanks
