Another method beyond what was suggested above/below would be that since
the / could be considered a delimiter, you could consider each field a word
and insert each word into a separate table and index each word.  Have
another table reference the indexed word to match whatever table you've
mentioned.  This is my personal pref, as I've not a lot of experience with
FTS.

So using your sample data above, I designed this database;

CREATE TABLE [main].[RawData] (
  [PathID] INTEGER PRIMARY KEY AUTOINCREMENT,
  [Path] CHAR);

CREATE TABLE [main].[Keywords] (
  [KeywordID] INTEGER PRIMARY KEY AUTOINCREMENT,
  [Keyword] CHAR);
CREATE UNIQUE INDEX [main].[idxKeywords] ON [Keywords] ([Keyword] COLLATE
NOCASE);

CREATE TABLE [main].[DataKeywords] (
  [PathID] INTEGER NOT NULL,
  [KeywordID] integer NOT NULL,
  CONSTRAINT [sqlite_autoindex_DataKeywords_1] PRIMARY KEY ([PathID],
[KeywordID]));


insert into [main].[RawData] values(1, 'ab/cd/gf');
insert into [main].[RawData] values(2, 'ab/qw/ert');
insert into [main].[RawData] values(3, 'ab/fgrd/ert');
insert into [main].[RawData] values(4, 'ab/foo/bar/fgr');
insert into [main].[RawData] values(5, 'ab/bar/foo/foobar/etc');
insert into [main].[RawData] values(6, 'ab/etc/d');

insert into [main].[Keywords] values(1, 'ab');
insert into [main].[Keywords] values(2, 'cd');
insert into [main].[Keywords] values(3, 'gf');
insert into [main].[Keywords] values(4, 'qw');
insert into [main].[Keywords] values(5, 'ert');
insert into [main].[Keywords] values(6, 'fgrd');
insert into [main].[Keywords] values(8, 'foo');
insert into [main].[Keywords] values(9, 'bar');
insert into [main].[Keywords] values(10, 'fgr');
insert into [main].[Keywords] values(11, 'foobar');
insert into [main].[Keywords] values(12, 'etc');
insert into [main].[Keywords] values(13, 'd');

insert into [main].[DataKeywords] values(1, 1);
insert into [main].[DataKeywords] values(1, 2);
insert into [main].[DataKeywords] values(1, 3);
insert into [main].[DataKeywords] values(2, 1);
insert into [main].[DataKeywords] values(2, 4);
insert into [main].[DataKeywords] values(2, 5);
insert into [main].[DataKeywords] values(3, 1);
insert into [main].[DataKeywords] values(3, 6);
insert into [main].[DataKeywords] values(3, 5);
insert into [main].[DataKeywords] values(4, 1);
insert into [main].[DataKeywords] values(4, 8);
insert into [main].[DataKeywords] values(4, 9);
insert into [main].[DataKeywords] values(4, 10);
insert into [main].[DataKeywords] values(5, 1);
insert into [main].[DataKeywords] values(5, 9);
insert into [main].[DataKeywords] values(5, 8);
insert into [main].[DataKeywords] values(5, 11);
insert into [main].[DataKeywords] values(5, 12);
insert into [main].[DataKeywords] values(6, 1);
insert into [main].[DataKeywords] values(6, 12);
insert into [main].[DataKeywords] values(6, 13);

select RawData.PathID, Path from RawData
join DataKeywords on RawData.PathID=DataKeywords.PathID
join Keywords on DataKeywords.KeywordID=Keywords.KeywordID
where keyword='ert'

PathID Path
------ -----------
     2 ab/qw/ert
     3 ab/fgrd/ert

select RawData.PathID, Path from RawData
join DataKeywords on RawData.PathID=DataKeywords.PathID
join Keywords on DataKeywords.KeywordID=Keywords.KeywordID
where keyword='ab'
PathID Path
------ ---------------------
     1 ab/cd/gf
     2 ab/qw/ert
     3 ab/fgrd/ert
     4 ab/foo/bar/fgr
     5 ab/bar/foo/foobar/etc
     6 ab/etc/d


On Sun, Oct 26, 2014 at 8:27 AM, Baruch Burstein <bmburst...@gmail.com>
wrote:

> Hi!
>
> I have a column which represents a file path:
>
> ab/cd/gf
> ab/qw/ert
> ab/fgrd/ert
> ab/foo/bar/fgr
> ab/bar/foo/foobar/etc
> ab/etc/d
> etc...
>
> I happen to know in my case that the first part of the path is a certain
> fixed string ('ab' in the above example). I need to get the path with the
> first 2 parts stripped off. Currently I am doing:
>
>         substr(path, 4+instr(substr(path,4),'/'))
>
> But that seems long and probably inefficient.
> What is the best/simplest way to find the second occurrence of the '/' in a
> string?
>
> Also, a suggestion for an SQLite improvement: The builtin function instr()
> should have another form that takes 3 arguments, with the 3rd being either
> an offset from where to start the search, or which occurrence to search for
> (1st, 2nd, etc.)
>
> --
> ˙uʍop-ǝpısdn sı ɹoʇıuoɯ ɹnoʎ 'sıɥʇ pɐǝɹ uɐɔ noʎ ɟı
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to