Thanks James,
I will try out some stunts and post the results to the group.

Thanks,
Gautam

On Fri, Jul 31, 2009 at 10:14 PM, James Healy <[email protected]> wrote:

>
> Gautam Rege wrote:
> > For example,
> > I have some data stored in a database in ISO-10646-1 and ISO-8859-1
> > encodings, e.g. Devanagari script (for the sake of discussion). I need to
> > search on the contents.
> >
> > Can this be done in Sphinx? Any ideas?
> >
> > Would it be better if the database is in UTF-8 encoding and the localized
> > data is then converted to UTF-8 and stored?
> > If so, how can I index and search on a localized string?
>
> I believe you *can* index single-byte encodings (like ISO-8859-1 and
> friends) by setting the charset_type option in sphinx.yml to "sbcs". This
> will index your content at the byte level; however, you'll need to set up a
> charset_table if your language is anything other than English or
> Russian.
>
> If possible, storing your content in a UTF-8 database will make things
> significantly easier. TS uses UTF-8 by default, and all you'll need to do
> is set up a charset_table with the Unicode codepoints you want indexed.
>
> For more info, have a read of the Sphinx docs
> (http://www.sphinxsearch.com/docs/current.html#charsets) and a blog post
> I wrote on using TS with Unicode
> (http://yob.id.au/blog/2008/05/08/thinking_sphinx_and_unicode/).
>
> -- James Healy <jimmy-at-deefa-dot-com>  Sat, 01 Aug 2009 02:39:31 +1000
>
> >
>
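
For reference, here is a minimal sketch of how a charset_table value of the kind James describes could be generated. It is not taken from the thread or from his blog post. The helper name charset_table() and the chosen ranges (ASCII digits and letters plus the Devanagari block, U+0900..U+097F) are assumptions for illustration only; the output uses the "U+XXXX..U+YYYY" and "->" range notation from the Sphinx charsets documentation.

```python
# Sketch: build a Sphinx charset_table value from codepoint ranges.
# The helper name and the ranges below are illustrative assumptions.

def charset_table(ranges):
    """Return a comma-separated charset_table string.

    Each item is (start, end), or (start, end, map_start) when the range
    should be folded onto another range of the same length.
    """
    parts = []
    for item in ranges:
        start, end = item[0], item[1]
        entry = f"U+{start:04X}..U+{end:04X}"
        if len(item) == 3:
            # Map the range onto a target range, e.g. A..Z -> a..z
            map_start = item[2]
            map_end = map_start + (end - start)
            entry += f"->U+{map_start:04X}..U+{map_end:04X}"
        parts.append(entry)
    return ", ".join(parts)


RANGES = [
    (0x30, 0x39),        # digits 0..9
    (0x61, 0x7A),        # a..z
    (0x41, 0x5A, 0x61),  # A..Z folded to a..z
    (0x0900, 0x097F),    # Devanagari block (assumed target script)
]

table = charset_table(RANGES)
print(table)
```

The printed string is meant to be pasted in as the charset_table value; which codepoints actually belong in it depends on the content being indexed.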


-- 
~~~~~~~~~~~~~~~
All wiyht. Rho sritched mg kegtops awound?
