Re: Multi-language indexing and searching

2007-06-21 Thread Daniel Alheiros
Hi Hoss. I tried that yesterday using the approach you just described (I created the base fields for any language with basic analyzers) and it worked all right. Thanks again for your time. Regards, Daniel On 20/6/07 21:00, Chris Hostetter [EMAIL PROTECTED] wrote: : So far it sounds

Re: Multi-language indexing and searching

2007-06-20 Thread Chris Hostetter
: So far it sounds good for my needs, now I'm going to try if my other : features still work (I'm worried about highlighting as I'm going to return a : different field)... i'm not really a highlighting guy so i'm not sure ... but if you're okay with *simple* highlighting you can probably just

Re: Multi-language indexing and searching

2007-06-19 Thread Chris Hostetter
: range wouldn't be a problem in this case. The real issue I can see in this : approach, is related to Analyzers... How to make them deal with different : languages properly using one Solr instance with the same set of fields being : used by documents in different languages i would still use

Re: Multi-language indexing and searching

2007-06-15 Thread Chris Hostetter
: One bad thing about having fields specific to your language (from my point of : view) is that you will have to re-index your content when you add a new : language (some will need to start with one language and in future will have : others added). But OK, let's say the indexing is done. i don't see

Re: Multi-language indexing and searching

2007-06-13 Thread Daniel Alheiros
Hi Hoss, One bad thing about having fields specific to your language (from my point of view) is that you will have to re-index your content when you add a new language (some will need to start with one language and in future will have others added). But OK, let's say the indexing is done. So using

RE: Multi-language indexing and searching

2007-06-12 Thread Teruhiko Kurosaka
Daniel, I was reading your email and responses to it with great interest. I was aware that Solr has an implicit assumption that a field is mono-lingual per system. But your mail and its correspondence made me wonder if this limitation is practical for multi-lingual search applications. For

Re: Multi-language indexing and searching

2007-06-12 Thread Yonik Seeley
On 6/12/07, Teruhiko Kurosaka [EMAIL PROTECTED] wrote: For bi-lingual or tri-lingual search, we can have parallel fields (title_en, title_fr, title_de, for example) but this wouldn't scale well. Due to search across multiple fields, or due to increased index size? Lucene and Solr require

RE: Multi-language indexing and searching

2007-06-12 Thread Teruhiko Kurosaka
Hi Yonik, On 6/12/07, Teruhiko Kurosaka [EMAIL PROTECTED] wrote: For bi-lingual or tri-lingual search, we can have parallel fields (title_en, title_fr, title_de, for example) but this wouldn't scale well. Due to search across multiple fields, or due to increased index size? Due to the

RE: Multi-language indexing and searching

2007-06-12 Thread Chris Hostetter
: Due to the proliferation of the number of fields. Say, we want : to have the field title to have the title of the book in : its original language. But because Solr has this implicit : assumption of one language per field, we would have to have : the artificial fields title_fr, title_de, title_en,

Re: Multi-language indexing and searching

2007-06-11 Thread Daniel Alheiros
This sounds OK. I can create a field name mapping structure to change the requests / responses so that my client doesn't need to be aware of the different fields. Thanks for these directions, Daniel On 8/6/07 21:32, Chris Hostetter [EMAIL PROTECTED] wrote: : Can't I have the same index, using
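
A minimal sketch of the kind of field name mapping Daniel describes, assuming language-specific fields follow the `<field>_<lang>` pattern used elsewhere in this thread (title_en, title_fr, title_de). The set `LOCALIZED_FIELDS` and all function names are hypothetical, chosen only for illustration; they are not part of Solr.

```python
# Hypothetical mapping layer: the client uses logical names ("title"),
# the index uses per-language names ("title_fr"). All names are assumptions.
LOCALIZED_FIELDS = {"title", "body"}  # assumed set of language-dependent fields


def to_index_field(field: str, lang: str) -> str:
    """Map a logical field name to its language-specific index field."""
    return f"{field}_{lang}" if field in LOCALIZED_FIELDS else field


def from_index_field(field: str, lang: str) -> str:
    """Map a language-specific index field back to its logical name."""
    suffix = f"_{lang}"
    if field.endswith(suffix) and field[: -len(suffix)] in LOCALIZED_FIELDS:
        return field[: -len(suffix)]
    return field


def localize_doc(doc: dict, lang: str) -> dict:
    """Rename fields of an outgoing document (or query) for the index."""
    return {to_index_field(name, lang): value for name, value in doc.items()}


def delocalize_doc(doc: dict, lang: str) -> dict:
    """Rename fields of an index response back to logical names."""
    return {from_index_field(name, lang): value for name, value in doc.items()}
```

With a layer like this, the client would send and receive `title` only, and the language code passed alongside the request would decide which underlying field is read or written.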

Re: Multi-language indexing and searching

2007-06-11 Thread Daniel Alheiros
Hi Henri, Thanks again, your considerations will surely help with my decision. Now I'll do my homework to check document volume / growth - expected index sizes and query load. Regards, Daniel Alheiros On 9/6/07 10:53, Henrib [EMAIL PROTECTED] wrote: Hi Daniel, Trying to recap: you are

Re: Multi-language indexing and searching

2007-06-09 Thread Henrib
Hi Daniel, Trying to recap: you are indexing documents that can be in different languages. On the query side, users will only search in one language at a time and get results in that language. Setting aside the webapp deployment problem, the alternative is thus: option1: 1 schema with all fields of

Re: Multi-language indexing and searching

2007-06-08 Thread Henrib
Hi Daniel, If it is functionally 'ok' to search in only one lang at a time, you could try having one index per lang. Each per-lang index would have one schema where you would describe field types (the lang part coming through stemming/snowball analyzers, per-lang stopwords et al.) and the same field

Re: Multi-language indexing and searching

2007-06-08 Thread Daniel Alheiros
Hi Henri. Thanks for your reply. I've just looked at the patch you referred to, but by doing this I will lose the out-of-the-box Solr installation... I'll have to create my own Solr application responsible for creating the multiple cores and I'll have to change my indexing process to something able to

Re: Multi-language indexing and searching

2007-06-08 Thread Chris Hostetter
: Can't I have the same index, using one single core, same field names being : processed by language specific components based on a field/parameter? yes, but you don't really need the complexity you describe below ... you don't need separate request handlers per language, just separate fields