OK, not a lot of ConTerMedText people out here?

Here's what I've done to fix it.  It seems to work:

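begin
  -- Create a custom lexer preference based on BASIC_LEXER.
  ctx_ddl.create_preference('MYLEXER', 'BASIC_LEXER');
  -- Swap the default NUMGROUP (",") for a character not expected in the data.
  ctx_ddl.set_attribute('MYLEXER', 'NUMGROUP', CHR(255));
end;
/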

So, I just changed NUMGROUP from the default "," to CHR(255).  Strictly
speaking that's outside 7-bit ASCII, and what it maps to depends on the
database character set, but I think it's safe to assume that a user's not
going to be allowed to enter that into a description.
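
For anyone wanting to check the change, here is a rough sketch of how the new
preference might be applied and verified.  The table, column, and index names
(PARTS, DESCRIPTION, PARTS_DESC_IDX) are made up for illustration and are not
from my system, and I'm assuming the index is rebuilt with the REPLACE LEXER
parameter, since changing the preference alone does not re-tokenize rows that
are already indexed.  Treat it as a starting point, not a tested recipe:

```sql
-- Hypothetical names: table PARTS, column DESCRIPTION, index PARTS_DESC_IDX.

-- 1. Confirm the attribute was stored on the preference.
SELECT prv_attribute, prv_value
  FROM ctx_user_preference_values
 WHERE prv_preference = 'MYLEXER';

-- 2. Rebuild the index so existing rows are re-tokenized with MYLEXER.
ALTER INDEX parts_desc_idx REBUILD PARAMETERS ('REPLACE LEXER MYLEXER');

-- 3. Look in the index token table (DR$<index name>$I) for the tokens.
SELECT DISTINCT token_text
  FROM dr$parts_desc_idx$i
 WHERE token_text LIKE '120%';

-- 4. Re-run the search that was missing the row.
SELECT description
  FROM parts
 WHERE CONTAINS(description, '120 AND 1/4W') > 0;
```

If the change took effect, step 3 should list "120" as its own token rather
than "120,1".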

Thanks, Rich!  :)

Rich

Rich Jesse                           System/Database Administrator
[EMAIL PROTECTED]                  Quad/Tech Inc, Sussex, WI USA


> -----Original Message-----
> From: Jesse, Rich 
> Sent: Wednesday, October 29, 2003 4:44 PM
> To: Multiple recipients of list ORACLE-L
> Subject: Removing NUMGROUP from lexer in ConTerMedText index
> 
> 
> Hey all,
> 
> I set up a Context/Intermedia/Text/whateverTheHell index on 8.1.7.4 on
> HP-UX to index about 250,000 description fields so that our users can
> search on them.  This was two years ago, and now someone has discovered
> at least one issue.
> 
> One description contains something like:
> 
>       BLEAH,120,1/4W
> 
> Using the default lexer, this stupidly parses into tokens of "BLEAH",
> "120,1" and "4W" instead of "BLEAH", "120", and "1/4W" (or even "1" and
> "4W").  I think this is because of the default NUMGROUP for US language
> settings, which is a comma (",").  So when a user looks for
> "120 AND 1/4W", this description is missed because "120" never makes it
> into the index as a token with the default lexer.
> 
> There can be numerous other issues with NUMGROUP when lexing a
> free-formatted description, so I really don't want a NUMGROUP.  I tried
> setting it to null using:
> 
>       ctx_ddl.set_attribute('MYLEXER','NUMGROUP','');
> 
> ...but this bombs with:
> 
>       ORA-20000: interMedia Text error:
>       DRG-10705: invalid value NULL for attribute NUMGROUP
> 
> Other than trying to find some char that will work with 250K rows, is
> there a way to turn this off?  The thing that gets me is that "120,1"
> isn't even a proper number, but ConTerMedText thinks it is and tokenizes
> it.
> 
> TIA,
> Rich
