Re: [sqlite] small sqlite fts snippet function or Fts Bug!
Yeah, ID 4 was just a desperate experiment: a hack with longer data in the search, to see whether it would lead the snippet function to start grabbing the data from the start (or at least one "word"/char more). The offsets being wrong, and therefore the ..., was kind of what I expected, but had it worked I would have subtracted the offset manually and put the markers in myself. So it wasn't part of the bug report, just a test of how it behaves in that case.

Good to know for the future that it's already fixed. Thanks for taking care of it so fast!

boscowitch

On Wednesday, 28.01.2015, 02:04 +0700, Dan Kennedy wrote:
> On 01/27/2015 06:48 PM, boscowitch wrote:
> > and then in an sqlite shell (SQLite version 3.8.8.1 2015-01-20 16:51:25)
> > I get the following for a select with snippet:
> >
> > EXAMPLE OUTPUT:
> > sqlite> select docid,*,snippet(test) from test where german match "a";
> > 1|[1] a b c|1] a b c
> > 2|[{[_.,:;[1] a b c|1] a b c
> > 3|1[1] a b c|1[1] a b c
> > 4|[1] a b c|1] a b c
> > 5|[1] a b c|[1] a b c
> >
> > -As you can see, for IDs 1 and 2 the match is at the right position,
> > but all leading non-alphanumeric characters ([, {, etc.) are simply
> > left out of the snippet.
> >
> > -ID 4 does not help and breaks the offsets, so it is even worse
>
> Thanks for reporting this. The issue with (1) and (2) is now fixed here:
>
> http://www.sqlite.org/src/info/adc9283dd9b
>
> I think it is a bug in the input data causing the problem in (4). The
> values inserted into "test" and "testdata" are just slightly different.
>
> Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] small sqlite fts snippet function or Fts Bug!
On 01/27/2015 06:48 PM, boscowitch wrote:
> and then in an sqlite shell (SQLite version 3.8.8.1 2015-01-20 16:51:25)
> I get the following for a select with snippet:
>
> EXAMPLE OUTPUT:
> sqlite> select docid,*,snippet(test) from test where german match "a";
> 1|[1] a b c|1] a b c
> 2|[{[_.,:;[1] a b c|1] a b c
> 3|1[1] a b c|1[1] a b c
> 4|[1] a b c|1] a b c
> 5|[1] a b c|[1] a b c
>
> -As you can see, for IDs 1 and 2 the match is at the right position,
> but all leading non-alphanumeric characters ([, {, etc.) are simply
> left out of the snippet.
>
> -ID 4 does not help and breaks the offsets, so it is even worse

Thanks for reporting this. The issue with (1) and (2) is now fixed here:

http://www.sqlite.org/src/info/adc9283dd9b

I think it is a bug in the input data causing the problem in (4). The values inserted into "test" and "testdata" are just slightly different.

Dan.
[sqlite] small sqlite fts snippet function or Fts Bug!
Hello,

since this is a bug report (plus a dirty fix), it might be useful for both users and devs; that's why I am sending a copy to both mailing lists. I hope I don't bother the diligent devs who read both lists in full. Sorry to them, and thanks for sqlite, by the way ;)

Recently I wanted to use the snippet function in sqlite for my small sqlite dictionary (running on Android, but the bug also occurs on my Linux desktop). It behaved strangely when an entry started with "non-word" characters, that is, anything that is neither alphanumeric nor Unicode (chars > 128); in short, the simple tokenizer's delimiters. The snippet never prints them if they come before the first word.

Here is an example to demonstrate:

EXAMPLE SETUP SQL:
create table testdata (german);
create virtual table test using fts4(content="testdata",german);
insert into testdata(german) VALUES ("[1] a b c");
insert into test(docid,german) VALUES(1,"[1] a b c ");
insert into testdata(german) VALUES ("[{[_.,:;[1] a b c");
insert into test(docid,german) VALUES(2,"[{[_.,:;[1] a b c ");
insert into testdata(german) VALUES ("1[1] a b c");
insert into test(docid,german) VALUES(3,"1[1] a b c ");
insert into testdata(german) VALUES ("[1] a b c");
insert into test(docid,german) VALUES(4,"1[1] a b c ");
insert into testdata(german) VALUES(char(8203,91,49,93,32,97,32,98,32,99));
insert into test(docid,german) VALUES(5,char(8203,91,49,93,32,97,32,98,32,99));

and then in an sqlite shell (SQLite version 3.8.8.1 2015-01-20 16:51:25) I get the following for a select with snippet:

EXAMPLE OUTPUT:
sqlite> select docid,*,snippet(test) from test where german match "a";
1|[1] a b c|1] a b c
2|[{[_.,:;[1] a b c|1] a b c
3|1[1] a b c|1[1] a b c
4|[1] a b c|1] a b c
5|[1] a b c|[1] a b c

-As you can see, for IDs 1 and 2 the match is at the right position, but all leading non-alphanumeric characters ([, {, etc.) are simply left out of the snippet.
-ID 3 works, but has an additional 1 that should not be there, so it is no solution.
-ID 4 does not help and breaks the offsets, so it is even worse.
-ID 5 works, BUT this is a dirty fix I found. It puts a Unicode character ('ZERO WIDTH SPACE', U+200B) in front, which obviously can't be seen and doesn't "break" the offsets (it just shifts them all by +1). I haven't tested it on Android yet, but I hope it works there too, since Android supports Unicode. Obviously this is not a nice solution, nor one for simpler/embedded systems.

(By the way, the same bug also occurs with fts3, and also without the special content option.)

Here is a small example for normal fts4 with a more custom snippet call:

create virtual table test using fts4(german);
insert into test VALUES("[1] a b c");
sqlite> select *,snippet(test,"#","#","...",0,64) from test where german match 'a';
[1] a b c|1] #a# b c

regards
boscowitch

PS: please excuse the "german" ;) and all English spelling errors
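As an alternative to the zero-width-space workaround, the markers can be placed by hand from the byte offsets that the FTS offsets() function reports, which is roughly what boscowitch describes in his follow-up. The following is only a minimal sketch under stated assumptions: it uses Python's sqlite3 module, it assumes the linked SQLite library was built with FTS3/FTS4 enabled, and the helper name highlight_from_offsets is made up for this illustration. It marks hits in the full column text and does not shorten the text the way snippet() does.

```python
import sqlite3


def highlight_from_offsets(text, offsets, start="#", end="#"):
    """Wrap each hit reported by FTS offsets() in start/end markers.

    offsets() returns four integers per hit: column number, term number,
    byte offset and byte length (counted in UTF-8 bytes).
    Hypothetical helper: it assumes a single-column table and therefore
    ignores the column number.
    """
    data = text.encode("utf-8")
    nums = [int(n) for n in offsets.split()]
    hits = [(nums[i + 2], nums[i + 3]) for i in range(0, len(nums), 4)]
    # Work backwards from the last hit so earlier byte offsets stay valid.
    for pos, size in sorted(hits, reverse=True):
        data = (
            data[:pos]
            + start.encode("utf-8")
            + data[pos:pos + size]
            + end.encode("utf-8")
            + data[pos + size:]
        )
    return data.decode("utf-8")


con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE test USING fts4(german)")
con.execute("INSERT INTO test VALUES (?)", ("[1] a b c",))

text, offs = con.execute(
    "SELECT german, offsets(test) FROM test WHERE german MATCH 'a'"
).fetchone()
highlighted = highlight_from_offsets(text, offs)
print(highlighted)
```

With a content= table this only gives sensible results when the text in the content table is exactly what was indexed; the ID 4 rows above show what happens when the two differ.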
Re: [sqlite] FTS snippet()
Drake,

if I do this, I get: SQL logic error or missing database.

Thanks

Gert
Re: [sqlite] FTS snippet()
Quoth Gert Van Assche, on 2011-04-13 22:35:49 +0200:
> SELECT snippet(example, '[', ']') FROM example WHERE CONTEXT MATCH
> (SELECT TOKEN FROM example);

You're asking to match a single independently arbitrarily chosen token from anywhere in the table (which is not even the same as "matching at least one token from the table"), not whether it matches the one from the same row.

Can you do WHERE CONTEXT MATCH TOKEN instead? I think you still need a full table scan for that, but it should return the right results unless FTS4 has some relevant restriction on the RHS of a MATCH.

---> Drake Wilson
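Since Gert reports that WHERE CONTEXT MATCH TOKEN fails with "SQL logic error or missing database", one way to get the per-row behaviour he was hoping for is to run one MATCH query per row from the application. What follows is a minimal sketch, not a recommendation from the thread: it assumes Python's sqlite3 module with FTS3/FTS4 compiled in, and it passes the column index 1 (CONTEXT) to snippet() explicitly so that the fragment is always taken from that column.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE example USING fts4(TOKEN, CONTEXT)")
con.executemany(
    "INSERT INTO example(TOKEN, CONTEXT) VALUES (?, ?)",
    [
        ("one", "This is just one sentence."),
        ("two", "This is just one sentence. Sorry, it are two sentences."),
        ("three", "More then three words in one sentence."),
    ],
)

# Step 1: read every row's own token.
pairs = con.execute("SELECT docid, TOKEN FROM example ORDER BY docid").fetchall()

# Step 2: one full-text query per row, restricted to that row's docid.
# The token is used as an FTS query string, so tokens containing FTS
# query syntax (quotes, AND/OR/NOT, '*') would need extra care.
snippets = []
for docid, token in pairs:
    row = con.execute(
        "SELECT snippet(example, '[', ']', '...', 1) FROM example "
        "WHERE CONTEXT MATCH ? AND docid = ?",
        (token, docid),
    ).fetchone()
    snippets.append(row[0] if row else None)

for s in snippets:
    print(s)
```

This costs one query per row, which is fine for small tables like the example; a row whose CONTEXT does not contain its TOKEN yields None.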
[sqlite] FTS snippet()
Hi all,

I'm sure I'm doing something stupid here...

CREATE VIRTUAL TABLE example USING fts4(TOKEN, CONTEXT);
INSERT INTO example(TOKEN, CONTEXT) VALUES('one', 'This is just one sentence.');
INSERT INTO example(TOKEN, CONTEXT) VALUES('two', 'This is just one sentence. Sorry, it are two sentences.');
INSERT INTO example(TOKEN, CONTEXT) VALUES('three', 'More then three words in one sentence.');
SELECT snippet(example, '[', ']') FROM example WHERE CONTEXT MATCH (SELECT TOKEN FROM example);

this returns

This is just [one] sentence.
This is just [one] sentence. Sorry, it are two sentences.
More then three words in [one] sentence.

while I was hoping for

This is just [one] sentence.
This is just one sentence. Sorry, it are [two] sentences.
More then [three] words in one sentence.

Can anyone tell me what I'm doing wrong?

thanks

gert
Re: [sqlite] FTS, snippet & Unicode?
On Aug 27, 2008, at 4:52 AM, Alexandre Courbot wrote:
> I know there is a patch at
> http://www.sqlite.org/cvstrac/tktview?tn=3140,38 that is supposed to
> improve Unicode support in FTS3. I suspect it to turn any Unicode
> character into a token - however maybe you can use it as a basis to
> implement what you need.

Thanks for the pointer. Will give it a try.

Cheers,

--
PA.
http://alt.textdrive.com/nanoki/
Re: [sqlite] FTS, snippet & Unicode?
Alexey Pechnikov wrote:
>
> Is it included in version 3.6.1 or 3.6.2?
>

No, it is not included in either version. The patch was submitted by the Mozilla group, but it has not been checked in to SQLite. You can of course apply the patch to your own customized version of SQLite.

HTH
Dennis Cote
Re: [sqlite] FTS, snippet & Unicode?
Hello!

In a message of Wednesday 27 August 2008 06:52:09, Alexandre Courbot wrote:
> I know there is a patch at
> http://www.sqlite.org/cvstrac/tktview?tn=3140,38 that is supposed to
> improve Unicode support in FTS3. I suspect it to turn any Unicode
> character into a token - however maybe you can use it as a basis to
> implement what you need.

Is it included in version 3.6.1 or 3.6.2?

Best regards, Alexey.
Re: [sqlite] FTS, snippet & Unicode?
I know there is a patch at http://www.sqlite.org/cvstrac/tktview?tn=3140,38 that is supposed to improve Unicode support in FTS3. I suspect it to turn any Unicode character into a token - however maybe you can use it as a basis to implement what you need.

Alex.
[sqlite] FTS, snippet & Unicode?
Hello,

% sqlite3 -version
3.5.9

FTS's snippet seems to truncate Unicode sequences at times.

For example, given the following text:

Motto: ძალა ერთობაშია (Georgian) "Strength is in Unity"

FTS's snippet would return the extract below for 'Unity, Freedom, Work':

“… ��ია (Georgian) "Strength is in Unity" Anthem: Tavisupleba ("Freedom") Capital (and largest city) … America. Relations with NATO Georgia is working in becoming a full member of NATO. In …”

Note how ერთობაშია has been truncated to ��ია.

Thoughts?

Thanks in advance.

Cheers,

--
PA.
http://alt.textdrive.com/nanoki/
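The ��ია in the report looks like what a UTF-8 decoder prints when a byte string is cut in the middle of a multi-byte character. The sketch below only illustrates that general mechanism in Python; it is an assumption about the cause, not a confirmed diagnosis of FTS in 3.5.9. The byte position 19 and the helper names decode_from and next_char_boundary are chosen for this illustration.

```python
word = "ერთობაშია"
data = word.encode("utf-8")  # each of these Georgian letters takes 3 bytes


def decode_from(raw, start):
    """Decode raw[start:], replacing undecodable bytes with U+FFFD."""
    return raw[start:].decode("utf-8", errors="replace")


def next_char_boundary(raw, start):
    """Advance start past UTF-8 continuation bytes (bit pattern 10xxxxxx)."""
    while start < len(raw) and (raw[start] & 0xC0) == 0x80:
        start += 1
    return start


# Byte 19 falls inside the seventh letter (bytes 18..20), so a fragment that
# starts there begins with two orphaned continuation bytes.
cut = 19
broken = decode_from(data, cut)
safe = decode_from(data, next_char_boundary(data, cut))

print(broken)
print(safe)
```

A fragment start chosen by byte count would need a boundary adjustment like next_char_boundary before the text is handed to anything that expects valid UTF-8.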