Hi,
I have a quick question about item pipelines and null / missing values.
I have a field in my spider that extracts a model number. In some cases
this is null.
I had a pipeline set up like the following, but I always got an error. I
essentially wanted to either add a default value or skip the record.
The example pipeline below always fails:
    from scrapy.exceptions import DropItem

    class ModelNumberPipeline(object):
        def process_item(self, item, spider):
            # Keep the item only if it has a non-empty model number.
            if item['model_number']:
                return item
            else:
                raise DropItem("Missing Model Number in %s" % item)
The error was the following:
'price': u'$29.95',
'product_id': u'3231002',
'site_id': u'1',
'site_type': u'1'}
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 62, in _process_chain
    return process_chain(self.methods[methodname], obj, *args)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/defer.py", line 65, in process_chain
    d.callback(input)
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 382, in callback
    self._startRunCallbacks(result)
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 577, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/brendanb/scrapy/cbs-crawler/cbs/pipelines.py", line 15, in process_item
    if item['model_number']:
  File "/usr/local/lib/python2.7/dist-packages/scrapy/item.py", line 50, in __getitem__
    return self._values[key]
exceptions.KeyError: 'model_number'
If I change the pipeline to just set a default value instead, I can work
around it:
    class ModelNumberPipeline(object):
        def process_item(self, item, spider):
            item.setdefault('model_number', '')
            return item
My question is: why does the first pipeline fail when it checks for a null
value?
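One possible reading of the traceback, sketched under assumptions: the KeyError comes from item['model_number'] itself, which suggests the field was never set on the item at all rather than being set to None. If that is the case, looking the field up with .get() should return None instead of raising. The sketch below uses a plain dict in place of a real scraped item and a stand-in DropItem class so that it runs without Scrapy; the model number 'ABC-1' is a made-up value for illustration.

```python
class DropItem(Exception):
    """Stand-in for scrapy.exceptions.DropItem so the sketch runs without Scrapy."""


class ModelNumberPipeline(object):
    def process_item(self, item, spider):
        # .get() returns None for a field that was never set, whereas
        # item['model_number'] raises KeyError in that case.
        if item.get('model_number'):
            return item
        raise DropItem("Missing Model Number in %s" % item)


# Plain dicts stand in for scraped items here (assumption, not real Scrapy items).
pipeline = ModelNumberPipeline()
complete = {'product_id': u'3231002', 'model_number': u'ABC-1'}  # hypothetical value
incomplete = {'product_id': u'3231002'}  # model_number was never set

kept = pipeline.process_item(complete, spider=None)

try:
    pipeline.process_item(incomplete, spider=None)
    dropped = False
except DropItem:
    dropped = True
```

With this version the record without a model number is dropped through DropItem rather than failing with a KeyError, and the setdefault workaround is no longer needed if skipping the record is the goal.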
Thanks,
Brendan