Hello!
Using Scrapy, I have been trying to scrape footwear data from a website. I want to scrape only the sizes that are available, but the spider scrapes every size of each product. Could someone help me with this? The details of my problem are below.

For example, these are the sizes displayed on the website:

<https://lh3.googleusercontent.com/-4yOJMcwWLgE/V3Zjj8tbFhI/AAAAAAAAADA/QXVzEo9pcq0-brDF6sS1iZDhtqUcJd8mACLcB/s1600/sizes.png>

The unavailable sizes cannot be clicked and are shown with a strikethrough. After scraping, I get the values of all the sizes:

<https://lh3.googleusercontent.com/-_Z8fwXWbT30/V3Zj5GE-5OI/AAAAAAAAADE/q3o6AokEJ5AOIChH2cuOR7pv5brQoE9_gCLcB/s1600/size1.png>

Instead, I want ONLY the sizes that are *available* (i.e. sizes 7, 9, 10). The result should look like this:

<https://lh3.googleusercontent.com/-556L7yD8Fvw/V3ZkNFU9GzI/AAAAAAAAADI/s2bKMHoYiNs4-EymfaGLnXgVVstafVq4wCLcB/s1600/size2.png>

In the Elements tab (Developer Tools), the unavailable sizes have the li class value *"disabled"* and *data-quantity="0"*. Can this be used to solve the problem?

<li class="first popover-options disabled"><a href="#" style="border-color:rgb(247,247,247)" data-trigger="hover" class="btn-popover swatch-item" data-placement="top" data-price="3295" data-special-price="2142" data-simple-sku="SOME_VALUE_1" data-discount="" data-quantity="0" data-low-inventory="0" data-original-title="" title="" data-content="<span class="popover-close hidden-xs"></span><p>Euro Size 42</p>"><span>8</span></a><div class="content"><span class="popover-close hidden-xs"></span><p>Euro Size 42</p></div></li>

Note: the "disabled" value is not present on the available sizes.
<li class="first popover-options "><a data-gaq-event="PDP~$~Size~$~BU024SH53NBYINDFAS-4705979|JWG0623821d28edd03d4d463319b7da981d3afae34117998753178d3952dc051bb06|7" href="#" style="border-color:rgb(247,247,247)" data-trigger="hover" class="btn-popover swatch-item " data-placement="top" data-price="3295" data-special-price="2142" data-simple-sku="SOME_VALUE_2" data-discount="35" data-quantity="1" data-low-inventory="1" data-original-title="" title="" data-content="<span class="popover-close hidden-xs"></span><p>Euro Size 41</p>"><span>7</span></a><div class="content"><span class="popover-close hidden-xs"></span><p>Euro Size 41</p></div></li>

Also, the XPaths of the available and unavailable sizes differ only in their index numbers.

*Available product size*
//*[@id="size-block"]/div[1]/ul/li[2]

*Unavailable product size*
//*[@id="size-block"]/div[1]/ul/li[3]

This is a sample of the code I have used:

    item['SizeA'] = sel.xpath(
        '//*[@id="size-block"]/div[1]/ul/li[1]/a/span/text()').extract()
    item['SizeB'] = sel.xpath(
        '//*[@id="size-block"]/div[1]/ul/li[2]/a/span/text()').extract()
    item['SizeC'] = sel.xpath(
        '//*[@id="size-block"]/div[1]/ul/li[3]/a/span/text()').extract()

PS. I am new to web scraping.

Thanks in advance,
Mrun
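
For reference, here is a minimal sketch of how the "disabled" class described above might be used to keep only the available sizes. It is not a tested solution for the actual site: the helper name `available_sizes`, the constant `AVAILABLE_SIZES_XPATH`, and the trimmed-down `SAMPLE_HTML` are all hypothetical, and the XPath assumes the real page matches the markup quoted in this post. The demo at the bottom uses `parsel`, the selector library that Scrapy is built on, and is skipped if it is not installed.

```python
# Sketch only: the XPath assumes the markup quoted in this post
# (li elements under id="size-block", class "disabled" on sold-out sizes).
AVAILABLE_SIZES_XPATH = (
    '//*[@id="size-block"]//li'
    # Keep only <li> elements whose class list has no "disabled" token.
    '[not(contains(concat(" ", normalize-space(@class), " "), " disabled "))]'
    '/a/span/text()'
)

# Hypothetical, simplified copy of the size list for demonstration.
SAMPLE_HTML = """
<div id="size-block"><div><ul>
  <li class="first popover-options "><a href="#" data-quantity="1"><span>7</span></a></li>
  <li class="first popover-options disabled"><a href="#" data-quantity="0"><span>8</span></a></li>
  <li class="first popover-options "><a href="#" data-quantity="1"><span>9</span></a></li>
</ul></div></div>
"""


def available_sizes(sel):
    """Return the size labels of list items that are not marked "disabled".

    `sel` can be any object with a Scrapy-style .xpath() method, such as
    the `sel` selector already used in the spider.
    """
    return [text.strip() for text in sel.xpath(AVAILABLE_SIZES_XPATH).extract()]


if __name__ == "__main__":
    try:
        from parsel import Selector  # installed alongside Scrapy
    except ImportError:
        print("parsel is not installed; skipping the demo")
    else:
        print(available_sizes(Selector(text=SAMPLE_HTML)))
```

In the spider this could replace the per-index lookups with a single field, for example `item['Sizes'] = available_sizes(sel)`, where `Sizes` is an assumed field name. Filtering on the quantity instead would mean dropping the class predicate and selecting `a[@data-quantity != "0"]`, again assuming every size link carries that attribute.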