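You can use this test:

    if item.get('model_number'):

just like for a regular dict. item['model_number'] raises KeyError when the
spider never set the field, whereas item.get('model_number') returns None.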
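
For reference, here is a minimal sketch of both options from the question (drop the item, or fill in a default), using the field name model_number from the traceback. The DefaultModelNumberPipeline class name and the fallback DropItem class are my own assumptions, added only so the snippet also runs without Scrapy installed; in a real project you would simply import DropItem from scrapy.exceptions.

```python
try:
    from scrapy.exceptions import DropItem
except ImportError:
    # Assumption: stand-in exception so this sketch runs without Scrapy.
    class DropItem(Exception):
        pass


class ModelNumberPipeline(object):
    """Drop items whose model_number is missing or empty."""

    def process_item(self, item, spider):
        # .get() returns None for a field that was never set,
        # instead of raising KeyError like item['model_number'] does.
        if item.get('model_number'):
            return item
        raise DropItem("Missing model number in %s" % item)


class DefaultModelNumberPipeline(object):
    """Alternative (hypothetical name): keep the item, fill in a default."""

    def process_item(self, item, spider):
        if not item.get('model_number'):
            # With a Scrapy Item, the field must be declared on the Item class.
            item['model_number'] = ''
        return item
```

You would enable only one of the two in ITEM_PIPELINES, depending on whether you want to skip the record or keep it with an empty value.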
On 21 March 2014 at 01:11, "BrendanB" <[email protected]> wrote:
> Hi,
>
> Just have a quick question about item pipelines and nulls / missing values.
>
> I have a field in my spider to extract a model number. Now in some cases
> this is null.
>
> I had a pipeline set up like the following, but I always got an error.
> I essentially wanted to either add a default value or skip the record.
>
> This example pipeline below would always fail!
>
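> from scrapy.exceptions import DropItem
>
> class ModelNumberPipeline(object):
>     def process_item(self, item, spider):
>         # Raises KeyError if the field was never set on the item
>         if item['model_number']:
>             return item
>         else:
>             raise DropItem("Missing Model Number in %s" % item)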
>
>
> The error was the following:
>
> 'price': u'$29.95',
> 'product_id': u'3231002',
> 'site_id': u'1',
> 'site_type': u'1'}
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 62, in _process_chain
>     return process_chain(self.methods[methodname], obj, *args)
>   File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/defer.py", line 65, in process_chain
>     d.callback(input)
>   File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 382, in callback
>     self._startRunCallbacks(result)
>   File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
>     self._runCallbacks()
> --- <exception caught here> ---
>   File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 577, in _runCallbacks
>     current.result = callback(current.result, *args, **kw)
>   File "/home/brendanb/scrapy/cbs-crawler/cbs/pipelines.py", line 15, in process_item
>     if item['model_number']:
>   File "/usr/local/lib/python2.7/dist-packages/scrapy/item.py", line 50, in __getitem__
>     return self._values[key]
> exceptions.KeyError: 'model_number'
>
>
> If I change the pipeline to just set a default value, I can work around
> the problem.
>
> class ModelNumberPipeline(object):
>     def process_item(self, item, spider):
>         item.setdefault('model_number', '')
>         return item
>
>
> My question is: why would this pipeline fail when checking for a null value?
>
> thanks
> Brendan
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
>