Maybe try something like
xpath("td[@class='x']/text()").extract_first().encode('utf8').strip()?
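
If the cell's text ends up split into several text nodes (an assumption on my
part, I have not checked it against your page), extract_first() would return
only the first of them, which would match what you are seeing. A minimal
sketch of a workaround is to extract every text node and join them. Note that
join_text_nodes and row are hypothetical names, and the sample strings are
made up:

```python
# -*- coding: utf-8 -*-
def join_text_nodes(nodes):
    """Join the strings returned by .extract() into one stripped string."""
    return u"".join(nodes).strip()


# Hypothetical usage inside a spider callback (not executed here; `row`
# stands for whatever selector you already have for the table row):
#   nodes = row.xpath("td[@class='x']/text()").extract()
#   text = join_text_nodes(nodes)
# XPath's string() is an alternative that concatenates all the text
# inside the first matching node:
#   text = row.xpath("string(td[@class='x'])").extract_first()

# Made-up example of a cell whose text arrived as two separate nodes.
sample_nodes = [u"first part ", u"> second part\n"]
print(join_text_nodes(sample_nodes))
```

If extract() gives you a single string that is already cut short, the text is
being lost before the XPath runs and joining will not help. In that case it
would be worth checking response.body for how the cell is actually encoded.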

On Monday, February 20, 2017 at 11:51:30 AM UTC+1, Avishay Balderman wrote:
>
> I have a spider that runs on a site and extracts the text from specific
> table cells.
> A markup example is below.
>
> <td class="x">I want this text</td>
>
> The real text I am looking for is in Hebrew and contains the XML entity &gt;.
> Example: לנגר שרונה <- לוי שולמית ט5 417
>
> My xpath expression works fine and I am able to find the relevant table 
> cells. The problem is that when I extract the text I get only *לנגר שרונה*
> which is only part of the text.
> Could the '&gt;' inside the text be causing the problem?
> If so, is there a workaround?
>
> Thanks
>
> Avishay

