Hi,

I would like to use full text search to index SQL statements, but I
need it to tokenize using more values than the current list.

The use case is that my app allows users to enter sql statements. I
want to find out which statements include specific table names and
column names. I can use a general lookup (pseudocode)

for ( String columnName: columnNames) {
     execute("SELECT id FROM sqltable WHERE sql_statement LIKE '%" +
columnName+ "%')")
}

which is pretty quick, but I have several tables and columns where I'd
need to do this lookup, so really i'd like to index the tables and
columns then do the lookup using the FT_SEARCH_DATA function to get
all the tables that have indexed columns containing "columnName".

However, a bit of testing has shown that while native FT indexing work
perfectly for sql snippits like this one that may well appear in a
where clause:

columnName <= 10

FT_SEARCH_DATA('columnName', 0, 0)
(1 row, 2 mS)

if the whitespace is removed the whole phrase is treated as a single
word...

columnName<=10

FT_SEARCH_DATA('columnName', 0, 0)
(0 rows, 2 mS)

FT_SEARCH_DATA('columnName<=10', 0, 0)
(1 row, 2mS)

Couldd a few more tokens be added to the default list (" \t\n\r\f+\"*
%&/()=?'!,.;:-_#@|^~`{}[]"), or can it be made possible for the native
search to have it's token list expanded by a config value or a
parameter, so I can split by the sql boolean operators too. A usage
example might be:

FT_ADD_TOKENS('><');
FT_CREATE_INDEX('PUBLIC', 'SQLTABLE', 'SQLSTATEMENT')

Or maybe the ability to do that already exists and I've missed it in
documentation?

Cheers,
Ian.


-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/h2-database?hl=en.

Reply via email to