Keywords get stripped of punctuation using Python's str.isalnum() method:

https://github.com/stephenmcd/mezzanine/blob/3.1.10/mezzanine/generic/views.py#L31

Here's a debug session against that code, when the Hindi keyword is posted:

[(c, c.isalnum()) for c in request.POST.get("text_keywords",
"").split(",")[0]]
[(u' ', False), (u'\u0935', True), (u'\u093f', False), (u'\u091c', True),
(u'\u094d', False), (u'\u091e', True), (u'\u093e', False), (u'\u0928',
True)]

And the results when pasting the keyword straight into the session, where
it is a byte string rather than unicode:

[(c, c.isalnum()) for c in "विज्ञान"]
[('\xe0', False), ('\xa4', False), ('\xb5', False), ('\xe0', False),
('\xa4', False), ('\xbf', False), ('\xe0', False), ('\xa4', False),
('\x9c', False), ('\xe0', False), ('\xa5', False), ('\x8d', False),
('\xe0', False), ('\xa4', False), ('\x9e', False), ('\xe0', False),
('\xa4', False), ('\xbe', False), ('\xe0', False), ('\xa4', False),
('\xa8', False)]
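
To see which characters isalnum() rejects, you can check the Unicode
category of each one. A minimal sketch, assuming Python 3 (where str is
already unicode) and only the standard unicodedata module; the variable
names are mine, not from the Mezzanine code:

```python
import unicodedata

# The keyword from this thread, written with escapes to avoid encoding issues.
keyword = "\u0935\u093f\u091c\u094d\u091e\u093e\u0928"

# Pair each character with its Unicode general category and isalnum() result.
report = [(ch, unicodedata.category(ch), ch.isalnum()) for ch in keyword]
for ch, category, alnum in report:
    print("U+%04X %s %s" % (ord(ch), category, alnum))

# Filtering on isalnum() drops the combining marks (categories starting
# with "M"), which is what mangles the keyword.
stripped = "".join(ch for ch in keyword if ch.isalnum())
```

The characters that fail isalnum() are the vowel signs and the virama, and
the stripped result is the same four letters reported in the original
message.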

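Encoding issues aside, isalnum() is clearly the wrong tool for this task:
it returns False for the vowel signs and the virama (u'\u093f', u'\u094d',
u'\u093e'), so they get stripped along with the punctuation. I've changed
the logic to strip out only known punctuation and leave everything else
intact, which should solve the issue: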

https://github.com/stephenmcd/mezzanine/commit/6893844f060fcfc35e59fa3264b1a8154ef5c022
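
Roughly, the idea looks like the sketch below. This is not the code from
the commit: clean_keyword is a hypothetical helper, and using
string.punctuation (ASCII punctuation only) as the set of "known
punctuation" is an assumption for illustration.

```python
import string

# Assumption: "known punctuation" means ASCII punctuation only.
PUNCTUATION = set(string.punctuation)


def clean_keyword(title):
    """Hypothetical helper: drop known punctuation, keep everything else."""
    kept = "".join(ch for ch in title if ch not in PUNCTUATION)
    # Trim the whitespace left over from splitting on commas.
    return kept.strip()


# The Hindi keyword keeps its vowel signs and virama.
hindi = clean_keyword(" \u0935\u093f\u091c\u094d\u091e\u093e\u0928")
# ASCII punctuation is still removed.
english = clean_keyword("hello, world!")
```

Using a blocklist of punctuation rather than an allowlist of alphanumeric
characters means combining marks in any script pass through untouched.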

I'm not sure when a new release will be out, but if you need the fix right
now, you can point the Mezzanine entry in your requirements at that exact
commit on GitHub.


On Mon, Sep 1, 2014 at 2:58 PM, Shaunak Sinha <[email protected]> wrote:

>
>  Hello Everyone,
>
>           We have a model called 'Book' which extends the 'Displayable'
> class. We use the built-in KeywordsField to store tags for Books which
> helps to search them on the site.
> When we try to save tags in Hindi(one of the official languages in India),
> the spelling of the tag changes automatically to something else and then it
> gets saved.
>
> For eg:
>
> विज्ञान changes to वजञन.
>
> Can anyone help identifying this problem?
>
> Thanks!
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "Mezzanine Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Stephen McDonald
http://jupo.org

