> Ishii-san,
>
> >>> Ok, probably we need to copy the English stemming rule to the one for
> >>> Japanese.
> >> Pardon my ignorance here, but is the concept of stemming even relevant
> >> to Japanese/Chinese/Korean? What little I know about ideographic
> >> languages suggests it wouldn't work wel
Ishii-san,
Ok, probably we need to copy the English stemming rule to the one for
Japanese.
Pardon my ignorance here, but is the concept of stemming even relevant
to Japanese/Chinese/Korean? What little I know about ideographic
languages suggests it wouldn't work well. And surely the specific
On 6/25/07, Tom Lane <[EMAIL PROTECTED]> wrote:
"Mike Rylander" <[EMAIL PROTECTED]> writes:
> I can certainly understand the benefit of making the default
> configuration a simple locale to language map, but there are
> definitely uses for searching using different stemmers/stop-lists even
> with
"Mike Rylander" <[EMAIL PROTECTED]> writes:
> I can certainly understand the benefit of making the default
> configuration a simple locale to language map, but there are
> definitely uses for searching using different stemmers/stop-lists even
> within the same corpus/index. So, as a datapoint for
On 6/25/07, Tom Lane <[EMAIL PROTECTED]> wrote:
Well, it's not hard at all to find chunks of English text that have
embedded bits of French, Spanish, or what-have-you, but that's not an
argument for trying to intermix the stemmers. I doubt that such simple
bits of program could tell the language
> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> > Ok, probably we need to copy the English stemming rule to the one for
> > Japanese.
>
> Pardon my ignorance here, but is the concept of stemming even relevant
> to Japanese/Chinese/Korean? What little I know about ideographic
> languages suggests it
Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> Ok, probably we need to copy the English stemming rule to the one for
> Japanese.
Pardon my ignorance here, but is the concept of stemming even relevant
to Japanese/Chinese/Korean? What little I know about ideographic
languages suggests it wouldn't work
> Tatsuo Ishii wrote:
>
> > japanese '{ja_JP, C}'
> >
> > How would we know C -> japanese?
> >
> You can't do that. You can't have different languages (not locales)
> mapping to the same 'tsearch language' because the stemmer doesn't know
> whether a specific word is in English or Japanese. So you
Tatsuo Ishii wrote:
> japanese '{ja_JP, C}'
>
> How would we know C -> japanese?
>
You can't do that. You can't have different languages (not locales)
mapping to the same 'tsearch language' because the stemmer doesn't know
whether a specific word is in English or Japanese. So you have two options:
>> How would this work for initdb with locale C?
>
> I'm worrying about that too.
english '{en_GB, en_US, C}'
I suppose that a locale name always has a dot separator, except for the
C locale, which is a well-known exception
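
As a rough illustration of the lookup being discussed (not the actual tsearch code), a locale-to-configuration map could be consulted after stripping the part of the locale name that follows the dot, with C passing through unchanged because it has no dot. The names LOCALE_MAP and config_for_locale are hypothetical, and the table holds only the example entry quoted above.

```python
# Hypothetical locale -> text search configuration map, mirroring the
# "english '{en_GB, en_US, C}'" example from the thread.
LOCALE_MAP = {
    "english": ["en_GB", "en_US", "C"],
}


def config_for_locale(locale_name, locale_map=LOCALE_MAP):
    """Return the configuration name for a locale, or None if unmapped."""
    # "en_US.UTF-8" -> "en_US"; "C" has no dot and is left unchanged.
    base = locale_name.split(".", 1)[0]
    for config, locales in locale_map.items():
        if base in locales:
            return config
    return None


print(config_for_locale("en_US.UTF-8"))
print(config_for_locale("C"))
```

Because the lookup returns the first configuration that lists the locale, listing C under a second configuration as well (for example japanese) would make the result depend on table order, which is the ambiguity raised elsewhere in this thread.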
> >> How would this work for initdb with locale C?
> >
> > I'm worrying about that too.
>
> english '{en_GB, en_US, C}'
>
> I suppose that a locale name always has a dot separator, except for the
> C locale, which is a well-known exception
So we would have to?:
japanese '{ja_JP, C}'
How would we know C