from:"Ralf Junker"

Re: [sqlite] fts3b-4.9 fails - FTS3 docid updates now possible with latest version?

2009-12-03 Thread Ralf Junker

On 02.12.2009 20:00, Dan Kennedy wrote:

> On Dec 2, 2009, at 3:07 PM, Ralf Junker wrote:
>
>> * Is it now intentionally possible to update FTS3 DOCIDs?
>
> It is. The failing sqlite test case has been replaced.

Thanks you, Dan! I had already noticed!

Ralf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Error message from sqlite3_tokenizer_module.xCreate?

2009-12-03 Thread Ralf Junker

On 01.12.2009 09:01, Dan Kennedy wrote:

>> Monday 30 of November 2009 12:29:10 Ralf Junker napisał(a):

>> I am passing various arguments to sqlite3_tokenizer_module.xCreate.
>> In case they are invalid, I would like to return an explaining
>> error message in addition to SQLITE_ERROR. I did not find a way to
>> do this. Is it at all possible?

> I don't think it is possible at the moment. Unfortunately.

Dan, thanks for the confirmation.

Related to this, I noticed check-in [620a8a2b38] which avoids using an
uninitialized variable for the error message issued when
sqlite3_tokenizer_module.xCreate returns an error. The error message is now:

   pzErr = sqlite3_mprintf("unknown tokenizer");

I believe that the message is misleading because the tokenizer is not
unknown. The error is that a registered tokenizer can not be created.

Personally, I would like to see an error message giving the proper
reason and the name of the tokenizer:

   pzErr = sqlite3_mprintf(
 "cannot create tokenizer %s");

Also giving the tokenizer arguments is naturally helpful for
debugging and end-user support:

   pzErr = sqlite3_mprintf(
 "cannot create tokenizer %s with arguments %s");

Ralf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] fts3b-4.9 fails - FTS3 docid updates now possible with latest version?

2009-12-02 Thread Ralf Junker

With the latest FTS3 changes, fts3b-4.9 no longer passes. This short SQL
emulates the test:

   DROP TABLE IF EXISTS t4;
   CREATE VIRTUAL TABLE t4 USING fts3(c);
   INSERT INTO t4 (docid, c) VALUES (12, 'still testing');
   UPDATE t4 SET docid = 14 WHERE docid = 12;

Note that SQLite 3.6.20 chokes with "SQL logic error or missing 
database" on the last line.

Everything runs fine with the latest FTS3. I can even query

   SELECT docid, * FROM t4 WHERE t4 MATCH 'still';

and it happily returns that the updated docid equals 14.

Questions:

* Is it now intentionally possible to update FTS3 DOCIDs?
* Should I modify my tests accordingly?

Thanks,

Ralf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] sqlite3ExprCodeIN() problems with SQLITE_OMIT_SUBQUERY

2009-12-01 Thread Ralf Junker

On 01.12.2009 18:35, Jay A. Kreibich wrote:

> Important Note: The SQLITE_OMIT_* compile-time options are
> unsupported.

Oops! Thanks for focusing my eyes - they tend to skip introductions and
move right to the details.

Now having that read, let me point out that in spite of the statement
the SQLITE_OMIT_... options have worked nearly flawlessly for me over
the last few years. If there were little glitches, they were usually
fixed very quickly. Thank you, SQLite team!

Ralf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] sqlite3ExprCodeIN() problems with SQLITE_OMIT_SUBQUERY

2009-12-01 Thread Ralf Junker

On 01.12.2009 18:05, D. Richard Hipp wrote:

> The key point to bare in mind here is that SQLITE_OMIT_SUBQUERY is
> not a supported compile-time option.  None of the major users of
> SQLite make use of SQLITE_OMIT_SUBQUERY.  We do not test it.  And it
> appears that it is broken in the current implementation.  It is
> unlikely to be fixed for 3.6.21.

Thank you for pointing this out! I was not aware that there are 
supported and unsupported compile-time options. Reading

   http://www.sqlite.org/compile.html

always made me believe that all options to omit features are supported.
Now I am not sure which ones I can really trust.

Could you clear up my confusion?

Ralf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] sqlite3ExprCodeIN() problems with SQLITE_OMIT_SUBQUERY

2009-12-01 Thread Ralf Junker

Compiling with SQLITE_OMIT_SUBQUERY, sqlite3ExprCodeIN() is compiled out
but at the same time still required at other places in expr.c.

As such, expr.c does not link well with SQLITE_OMIT_SUBQUERY defined.

After I wrapped the remaining references to sqlite3ExprCodeIN() by
#ifndef SQLITE_OMIT_SUBQUERY it compiles fine. Am I safe to do so or are
there any caveats?

Ralf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Problem with #include in fts3Int.h

2009-12-01 Thread Ralf Junker

I have a compilation problem with line 22 in fts3Int.h:

   #include 

According to http://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html, 
 includes are used for system header files, which sqlite3.h is not 
IMHO.

I tried to replace with

   #include "sqlite3.h"

but this resulted in a number of unknown identifiers further down the code.

Finally I changed it to

   #include "sqliteInt.h"

and the code compiled happily. Testing showed that it also worked OK.

Am I on the right track?

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Error message from sqlite3_tokenizer_module.xCreate?

2009-12-01 Thread Ralf Junker

On 01.12.2009 09:01, Dan Kennedy wrote:

> I don't think it is possible at the moment. Unfortunately.

Thanks for the clarification, Dan!

I observe that you are currently writing the "official" FTS3
documentation in preparation for the next release of SQLite.

Maybe you want to make tokenizer error messages possible before the docs
are finished and the "unfortunate" xCreate API is carved in stone?

;-)

Ralf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Error message from sqlite3_tokenizer_module.xCreate?

2009-11-30 Thread Ralf Junker

On 30.11.2009 20:33, Grzegorz Wierzchowski wrote:
> Monday 30 of November 2009 12:29:10 Ralf Junker napisał(a):
>> I am passing various arguments to sqlite3_tokenizer_module.xCreate. In case
>> they are invalid, I would like to return an explaining error message in
>> addition to SQLITE_ERROR. I did not find a way to do this. Is it at all
>> possible?
>>
>> Thanks, Ralf
>
> The last argument of xCreate() is  char **pzErr.
> It is exactly designed for the purpose you describe.
> See also description in doc-zip: sqlite-3_6_18-docs/vtab.html, or
> http://www.sqlite.org/vtab.html

Thank you for your answer! I believe you are mixing up the virtual table 
sqlite3_module.xCreate() in sqlite3.h and 
sqlite3_tokenizer_module.xCreate() in fts3_tokenizer.h.

The latter does not have the pzErr argument:

struct sqlite3_tokenizer_module {



int (*xCreate)(
 int argc,   /* Size of argv array */
 const char *const*argv, /* Tokenizer argument strings */
 sqlite3_tokenizer **ppTokenizer /* OUT: Created tokenizer */
   );

So I believe my question remains unanswered. Any suggestion, anyone?

Ralf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Error message from sqlite3_tokenizer_module.xCreate?

2009-11-30 Thread Ralf Junker

I am passing various arguments to sqlite3_tokenizer_module.xCreate. In case 
they are invalid, I would like to return an explaining error message in 
addition to SQLITE_ERROR. I did not find a way to do this. Is it at all 
possible?

Thanks, Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Error message from sqlite3_tokenizer_module.xCreate?

2009-11-30 Thread Ralf Junker

I am passing various arguments to sqlite3_tokenizer_module.xCreate. In case 
they are invalid, I would like to return an explaining error message in 
addition to SQLITE_ERROR. I did not find a way to do this. Is it at all 
possible?

Thanks, Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Bug report: Memory reused after freed

2009-09-29 Thread Ralf Junker

At 14:04 29.09.2009, Dan Kennedy wrote:

>On Sep 29, 2009, at 4:30 PM, Ralf Junker wrote:
>
>> My memory manager reports that the SQL below results in memory being  
>> reused after it has already been freed when it is RUN FOR A SECOND  
>> TIME on the same database connection.
>
>Hi Ralf,
>
>Thanks for this report. I'm unable to reproduce the problem so far
>though.

No longer needed. The problem no longer shows with check-in [582bd76828] 
applied. In fact I believed that it was initiated by my report on this list.

Information below is just FYI.

>Are you compiling SQLite with any special symbols or anything
>like that?

Not that I am aware of. Certainly nothing except the usual which I did before 
the problem surfaced.

>Are you able to build other programs using this memory manager? If so,
>does the bug show up when you pass the problematic SQL to the command
>line tool twice? Or if I send you a C program are you able to build and
>test it?

I am not able to use the exact same memory manager when I run C code directly.

Alternatively, I did just now run SQLite on Embarcadero's C++ Builder with 
CodeGuard enabled. This results in a magnitude of error reports while the 
SQLite code still produces correct results AFAIKS. It is impossible for me to 
tell if these reports also contain the one I reported earlier. Knowing little 
about how well CodeGuard does its job, I am not sure how seriously I shall take 
its warnings as long as all works well.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Bug report: Memory reused after freed

2009-09-29 Thread Ralf Junker

At 14:04 29.09.2009, Dan Kennedy wrote:

>On Sep 29, 2009, at 4:30 PM, Ralf Junker wrote:
>
>> My memory manager reports that the SQL below results in memory being  
>> reused after it has already been freed when it is RUN FOR A SECOND  
>> TIME on the same database connection.
>
>Hi Ralf,
>
>Thanks for this report. I'm unable to reproduce the problem so far
>though.

No longer needed. The problem no longer shows with check-in [582bd76828] 
applied. In fact I believed that it was initiated by my report on this list.

Information below is just FYI.

>Are you compiling SQLite with any special symbols or anything
>like that?

Not that I am aware of. Certainly nothing except the usual which I did before 
the problem surfaced.

>Are you able to build other programs using this memory manager? If so,
>does the bug show up when you pass the problematic SQL to the command
>line tool twice? Or if I send you a C program are you able to build and
>test it?

I am not able to use the exact same memory manager when I run C code directly.

Alternatively, I did just now run SQLite on Embarcadero's C++ Builder with 
CodeGuard enabled. This results in a magnitude of error reports while the 
SQLite code still produces correct results AFAIKS. It is impossible for me to 
tell if these reports also contain the one I reported earlier. Knowing little 
about how well CodeGuard does its job, I am not sure how seriously I shall take 
its warnings as long as all works well.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Bug report: Memory reused after freed

2009-09-29 Thread Ralf Junker

My memory manager reports that the SQL below results in memory being reused 
after it has already been freed when it is RUN FOR A SECOND TIME on the same 
database connection.

Psydocode (I do not run C):

  sqlite3_initialize;
  sqlite3_open(DB_FILE_NAME, );
  sqlite3_exec(DB, SQL, NULL, NULL, NULL);
  sqlite3_exec(DB, SQL, NULL, NULL, NULL);
  sqlite3_close(DB);
  sqlite3_shutdown;

The database file does not exist before running sqlite3_open().

I can reproduce the problem reliably with [e4eb227b14]. There were simmilar 
memory reports during the development of fkeys.c. They have been fixed by 
[5b4d46374a] except for the one mentioned here.

Finally, here is the SQL:

  PRAGMA foreign_keys = 0;
  DROP TABLE IF EXISTS t1;
  DROP TABLE IF EXISTS t2;
  
  PRAGMA foreign_keys = 1;
  CREATE TABLE t1(a PRIMARY KEY, b UNIQUE);
  CREATE TABLE t2(x REFERENCES t1 ON UPDATE CASCADE ON DELETE CASCADE);
  INSERT INTO t1 VALUES('aaa', 1);
  INSERT INTO t2 VALUES('aaa');
  UPDATE t1 SET a = 'bbb';
  DELETE FROM t1;
  
  PRAGMA foreign_keys = 0;
  DROP TABLE t1;
  DROP TABLE t2;
  
  PRAGMA foreign_keys = 1;
  CREATE TABLE t1(x, y, PRIMARY KEY(x, y));
  CREATE TABLE t2(a, b, FOREIGN KEY(a, b) REFERENCES t1 ON UPDATE CASCADE);
  INSERT INTO t1 VALUES(1, 2);
  INSERT INTO t2 VALUES(1, 2);
  UPDATE t1 SET x = 5;

Ralf



Call stack leading to the problem:

Modified byte offsets (and lengths): 512(6)

The previous block size was: 2048

This block was previously allocated by thread 0x8BC, and the stack trace 
(return addresses) at the time was:
4088CE [DebugReallocMem]
413A48 [realloc][4198]
453A7E [...@sqlite3realloc]
453AF9 [sqlite3_realloc]
453C22 [...@sqlite3dbrealloc]
440AE6 [...@sqlite3vdbeswap]
440B37 [...@sqlite3vdbeaddop3]
440B8F [...@sqlite3vdbeaddop0]
4256CE [...@sqlite3triggersexist]
4539A3 [...@sqlite3dbfree]
425894 [...@sqlite3triggersexist]

The allocation number was: 632

The block was previously freed by thread 0x8BC, and the stack trace (return 
addresses) at the time was:
413781 [Free][3285]
45397C [sqlite3_free]
4539C7 [...@sqlite3dbfree]
440FA2 [...@sqlite3vdbejumphere]
440FD2 [...@sqlite3vdbeprogramdelete]
440F5F [...@sqlite3vdbejumphere]
440F88 [...@sqlite3vdbejumphere]
44265B [...@sqlite3vdbedelete]
4425A5 [...@sqlite3vdbefinalize]
42DA71 [sqlite3_exec]
45B9F7 [D:\DI_Stable\Lib\SQLite3\Test\Test.dpr][Test][initialization][45]

The current thread ID is 0x8BC, and the stack trace (return addresses) leading 
to this error is:
408698 [DebugGetMem]
4088CE [DebugReallocMem]
413A48 [realloc][4198]
453A7E [...@sqlite3realloc]
453AF9 [sqlite3_realloc]
453C22 [...@sqlite3dbrealloc]
440AE6 [...@sqlite3vdbeswap]
440B37 [...@sqlite3vdbeaddop3]
440B8F [...@sqlite3vdbeaddop0]
4256CE [...@sqlite3triggersexist]
4539A3 [...@sqlite3dbfree]

Current memory dump of 256 bytes starting at pointer address 7FEE3CD0:
A0 DC 46 00 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
80 80 80 80 80 80
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
80 80 80 80 80 80
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
80 80 80 80 80 80
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
80 80 80 80 80 80
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
80 80 80 80 80 80
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
80 80 80 80 80 80
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
80 80 80 80 80 80
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
80 80 80 80 80 80
   Ü  F  .                                               
         
                                                     
         
                                                     
         
                                                     
         
                                                     
         
                                                     
         
                                                     
         
                                                     
         

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] GROUPY BY alias backward incompatibility

2009-05-30 Thread Ralf Junker

At 15:37 28.05.2009, D. Richard Hipp wrote:

>Have you tried these two queries on other SQL database engines besides  
>SQLite?  What do PostgreSQL and MySQL make of them?

I could now run the queries on Oracle Database 10g Express Edition Release 
10.2.0.1.0.

Prepare the table:

  create table t (c integer);

  insert into t values (1);
  insert into t values (2);

Query with t_outer (the one that fails SQLite > 3.5.3) runs without error:

  select
(select count(*) from t t_inner
 group by t_outer.c) -- t_outer !!!
  from t t_outer;

Query with t_inner (which works on all SQLite versions) fails with error 
"ORA-01427: single-row subquery returns more than one row":

  select
(select count(*) from t t_inner
 group by t_inner.c) -- t_inner !!!
  from t t_outer;

Preliminary conclusion:

  * Behaviour is inconsistent between MySQL and Oracle.

  * The query that fails on SQLite > 3.5.3 works on both MySQL and Oracle.

  * The query that works on SQLite all versions fails on Oracle.

Additional findings from other DB engines would be helpful. Anyone?

Other than that, I believe it would be desirable if SQLite would support the 
t_outer query as it did up to 3.5.3.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] GROUPY BY alias backward incompatibility

2009-05-28 Thread Ralf Junker

At 15:37 28.05.2009, D. Richard Hipp wrote:

>Have you tried these two queries on other SQL database engines besides  
>SQLite?  What do PostgreSQL and MySQL make of them?

MySQL (5.0.21) reports no erros on either of both queries:

  select
(select count(*) from t t_inner
 group by t_outer.c) -- t_outer !!!
  from t t_outer;

  select
(select count(*) from t t_inner
 group by t_inner.c) -- t_inner !!!
  from t t_outer;

I do not have access to PostgreSQL right now.

The SQLite help [1] says: "The expressions in the GROUP BY clause do not have 
to be expressions that appear in the result." Reading this, I'd expect that 
both queries should run - even if the 1st one does not make much sense. 
Opinions?

Ralf

[1] http://www.sqlite.org/lang_select.html  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] GROUPY BY alias backward incompatibility

2009-05-28 Thread Ralf Junker

At 13:27 28.05.2009, D. Richard Hipp wrote:

>On May 28, 2009, at 7:01 AM, Ralf Junker wrote:
>
>> Hello!
>>
>> Up to version 3.5.3, SQLite would happily execute this SQL without  
>> complaints:
>>
>>  drop table if exists t;
>>
>>  create table t (c);
>>
>>  select
>>(select count() from t t_inner
>> group by t_outer.c)
>>  from t t_outer;
>>
>
>This query does not make any sense.  What are you trying to do?

The above query is just a shortened version to demonstrate the backward 
incompatibilty. The original is far longer and calculates counts and average 
counts:

DROP TABLE IF EXISTS RemSymptoms;
DROP TABLE IF EXISTS SymCategory;

CREATE TABLE RemSymptoms (SymId);
CREATE TABLE SymCategory (CatID, SymId);

SELECT SymCategory.CatID, round(1000*COUNT(*) /
  (SELECT COUNT(*) 
 FROM SymCategory TempSymCategory 
 WHERE TempSymCategory.CatId=SymCategory.CatID 
 GROUP BY SymCategory.CatID))
 AS CatAvgRemCount, 
  (SELECT COUNT(*) 
 FROM SymCategory TempSymCategory 
 WHERE TempSymCategory.CatId=SymCategory.CatID 
 GROUP BY SymCategory.CatID) 
 AS CatSymCount 
  FROM RemSymptoms 
INNER JOIN SymCategory ON RemSymptoms.SymId=SymCategory.SymId
  GROUP BY SymCategory.CatID;

A user found that this query no longer works after upgrading to SQLite 3.5.4 or 
later.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] GROUPY BY alias backward incompatibility

2009-05-28 Thread Ralf Junker

At 13:27 28.05.2009, D. Richard Hipp wrote:

>On May 28, 2009, at 7:01 AM, Ralf Junker wrote:
>
>> Hello!
>>
>> Up to version 3.5.3, SQLite would happily execute this SQL without  
>> complaints:
>>
>>  drop table if exists t;
>>
>>  create table t (c);
>>
>>  select
>>(select count() from t t_inner
>> group by t_outer.c)
>>  from t t_outer;
>>
>
>This query does not make any sense.  What are you trying to do?

The above query is just a shortened version to demonstrate the backward 
incompatibilty. The original is far longer and calculates counts and average 
counts:

DROP TABLE IF EXISTS RemSymptoms;
DROP TABLE IF EXISTS SymCategory;

CREATE TABLE RemSymptoms (SymId);
CREATE TABLE SymCategory (CatID, SymId);

SELECT SymCategory.CatID, round(1000*COUNT(*) /
  (SELECT COUNT(*) 
 FROM SymCategory TempSymCategory 
 WHERE TempSymCategory.CatId=SymCategory.CatID 
 GROUP BY SymCategory.CatID))
 AS CatAvgRemCount, 
  (SELECT COUNT(*) 
 FROM SymCategory TempSymCategory 
 WHERE TempSymCategory.CatId=SymCategory.CatID 
 GROUP BY SymCategory.CatID) 
 AS CatSymCount 
  FROM RemSymptoms 
INNER JOIN SymCategory ON RemSymptoms.SymId=SymCategory.SymId
  GROUP BY SymCategory.CatID;

A user found that this query no longer works after upgrading to SQLite 3.5.4 or 
later.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] GROUPY BY alias backward incompatibility

2009-05-28 Thread Ralf Junker

Hello!

Up to version 3.5.3, SQLite would happily execute this SQL without complaints:

  drop table if exists t;

  create table t (c);

  select 
(select count() from t t_inner
 group by t_outer.c)
  from t t_outer;

This behaviour changed in version 3.5.4. From then on, SQLite issues an "SQL 
error near line 4: no such column: t_outer.c". This also shows in the most 
recent version (3.6.14).

I searched both the change log and the timeline for 3.5.4 but could not find 
any explicit mention of GROUP BY.

I did note, however, that http://www.sqlite.org/releaselog/3_5_4.html talks 
about bringing "the processing of ORDER BY into compliance with the SQL 
standard".

Questions:

* Does the ORDER BY change also apply to GROUP BY?

* Are there any test cases for the new behaviour?
  I failed to find any in 3.6.14 test suite.

* Post 3.5.3, the query works fine if I GROUP BY the inner table:

  select 
(select count() from t t_inner
 group by t_inner.c)
  from t t_outer;

  Is this the intended behaviour and should users update 
  their SQL accordingly?

Thanks for any answers,

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Table-Alias Confusion or Bug?

2009-05-27 Thread Ralf Junker

At 17:33 27.05.2009, D. Richard Hipp wrote:

>On May 27, 2009, at 11:26 AM, Ralf Junker wrote:
>
>> Hello!
>>
>> Given this SQL:
>>
>>  drop table if exists t;
>>
>>  create table t (c);
>>
>>  select *
>>from t as ta
>>where t.c = ta.c;
>>
>> SQLite3 chokes with the following error:
>>
>>  SQL error near line 5: no such column: t.c
>>
>> Question:
>>
>> Is this the expected behaviour? I know that the where clause does  
>> not make sense, but shouldn't SQLite see both "t.c" and "ta.c" as  
>> the same columns instead of reporting the error?
>
>This is the behavior that I expected because I deliberately coded it  
>that way.  Once you us an AS clause, the table is thereafter known  
>only by its AS name, not its original name.

Thanks for the clarification!

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Table-Alias Confusion or Bug?

2009-05-27 Thread Ralf Junker

Hello!

Given this SQL:

  drop table if exists t;

  create table t (c);

  select *
from t as ta 
where t.c = ta.c;

SQLite3 chokes with the following error:

  SQL error near line 5: no such column: t.c

Question:

Is this the expected behaviour? I know that the where clause does not make 
sense, but shouldn't SQLite see both "t.c" and "ta.c" as the same columns 
instead of reporting the error?

Thanks for any response,

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] group_concat bug

2009-05-07 Thread Ralf Junker

Confirmed and created ticket http://www.sqlite.org/cvstrac/tktview?tn=3841.

Ralf

At 03:10 07.05.2009, Steve Bauer wrote:
>The following example demonstrates what seems to be a bug in group_concat.  
>With the latest version of CVS:
>
>CREATE TABLE example (id INTEGER, x TEXT);
>CREATE TABLE table2 (key TEXT, x TEXT);
>CREATE TABLE list (key TEXT, value TEXT);
>
>INSERT INTO example VALUES (1, "a");
>INSERT INTO table2 VALUES ("a", "alist");
>INSERT INTO table2 VALUES ("b", "blist");
>INSERT INTO list VALUES ("a", 1);
>INSERT INTO list VALUES ("a", 2);
>INSERT INTO list VALUES ("a", 3);
>INSERT INTO list VALUES ("b", 4);
>INSERT INTO list VALUES ("b", 5);
>INSERT INTO list VALUES ("b", 6);
>
>SELECT example.id,
>   table2.x,
>   (SELECT group_concat(list.value) FROM list WHERE list.key = table2.key)
>FROM example, table2;
>
>Output:
>
>1|alist|1,2,3
>1|blist|,4,5,6
>
>I expected:
>
>1|alist|1,2,3
>1|blist|4,5,6

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Strange SAVEPOINT behaviour with locking_mode=exclusive

2009-04-20 Thread Ralf Junker

At 15:37 20.04.2009, D. Richard Hipp wrote:

>Ticket #3811 has been addressed by enhancing the documentation to  
>explain that journal_mode changes are only guaranteed to work if they  
>occur prior to the first transaction.  Whether or not a journal_mode  
>change works after the start of the first transaction is undefined.

Thanks!

>Even as I type this message, it occurs to me that we should probably  
>lock down the journal_mode at the start of the first transaction.   
>Otherwise, there are a bazillion cases of journal_mode changes at  
>strange times (such as in the middle of a nested transaction) that  
>could cause problems (segfaults) even if their behavior is undefined.

In this case, it would make sense IMO that setting the journal_mode should 
generally (and especially after the 1st transaction) report the active mode so 
applications can check if the journal_mode change was in fact effective.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Strange SAVEPOINT behaviour with locking_mode=exclusive

2009-04-17 Thread Ralf Junker

I experience some strange behaviour with SAVEPOINT in combination with 
locking_mode=exclusive. Below is a script which I assembled from savepoint.text 
13.1 to 13.4. Those tests run OK in normal locking mode, but fail in exclusive 
locking more.

To reproduce, run the script below on a NEW and EMPTY database. The result 
difference is marked as follows:

  a  b
  1  2 
  3  4 
  5  6 
  7  8 
  9  10 
  11 12 
  13 14 <- missing rows if run in exclusive mode 
  15 16 <- on a new and empty database



PRAGMA locking_mode=exclusive;

DROP TABLE IF EXISTS t1;

BEGIN;
  CREATE TABLE t1(a PRIMARY KEY, b);
  INSERT INTO t1 VALUES(1, 2);
COMMIT;

PRAGMA journal_mode = off;

BEGIN;
  INSERT INTO t1 VALUES(3, 4);
  INSERT INTO t1 SELECT a+4,b+4  FROM t1;
COMMIT;

BEGIN;
  INSERT INTO t1 VALUES(9, 10);
  SAVEPOINT s1;
INSERT INTO t1 VALUES(11, 12);
COMMIT;

BEGIN;
  INSERT INTO t1 VALUES(13, 14);
  SAVEPOINT s1;
INSERT INTO t1 VALUES(15, 16);
ROLLBACK TO s1;
ROLLBACK;

SELECT * FROM t1;



I wonder if this behaviour is intended of if there is an error in the library?

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Feature request: Report constraint name(s) in error message

2009-03-09 Thread Ralf Junker

Roger Binns wrote:

>> when a named constraint is violated, the name of the constraint which 
>> actually failed is not included in the error message.
>
>There has been a ticket about this for over 3 years, and also includes a
>patch to fix it:
>
>  http://www.sqlite.org/cvstrac/tktview?tn=1648

Thanks, the ticked escaped my notice. Given that there is a patch suggestion 
already, let's hope that it will eventually make it into the SQLite core.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] tool to browse a sqlite database

2009-03-08 Thread Ralf Junker

RB Smissaert wrote:

>What SQLite version produced the file World.db3?

I am not 100% sure about the exact SQLite version which I used to create the 
original World.db3, but I am VACUUMing it regularly to bring it up to date with 
recent versions. So I expect it should be some version after 3.6.8.

>I ask as my wrapper doesn't pick correctly the fields of a table.
>This is Olaf Schmidt's VB wrapper dhRichClient with SQLite 3.6.11.

I just did a PRAGMA integrity_check; on World.db3 with the SQLite3.exe v. 
3.6.11 and it reports 'ok'. Looks like a wrapper problem to me.

>BTW, SQLiteSpy looks a very nice GUI tool.

Thanks!

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Feature request: Report constraint name(s) in error message

2009-03-08 Thread Ralf Junker

Hi,

when a named constraint is violated, the name of the constraint which actually 
failed is not included in the error message.

Example 1:

  create table con (
a text constraint must_not_be_null not null);
  insert into con values (null);

  Returns error "con.a may not be null".

Example 2:

  create table con (
a text 
constraint must_not_be_null check (a not null));
  insert into con values (null);

  Returns error "constraint failed".

Especially with many constraints in place, constraint names can be really 
helpful to fulfill the constraint. In this regard, error message 2 in 
particular is not helpful at all in fixing the violation.

It would therefore be nice if error messages could include the name of the 
constraint that failed.

Also nice to have: If multiple constraints fail at the same time, mention all 
failing constraints and their names in a single error message.

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] tool to browse a sqlite database

2009-03-08 Thread Ralf Junker

BareFeet wrote:

>See a comparison of several GUI SQLite tools here:
>http://www.tandb.com.au/sqlite/compare/?ml

SQLiteSpy is missing from the list. It is available from

  http://www.yunqa.de

SQLiteSpy is a Unicode SQLite3 database browser GUI for Win32. Features include:

* Database at a Glance  The schema treeview displays all items contained in a 
database, including tables, columns, indexes and triggers. Press F5 to update 
the schema tree, double-click a table or view to display its data, use the 
context menu for frequently used commands.

* Grid Cell Editing  Table cells are editable in the grid: Display a table via 
the schema treeview, select a cell and press F2 to invoke the editor. Then 
modify and confirm to write your changes back to the table.

* Data Type Display  The native SQL data types are displayed with different 
background colors to help detect type errors. Type errors can cause performance 
degradation or wrong SELECT result sets if NULL values are confused with empty 
strings.

* Full Unicode  SQLiteSpy fully supports SQLite's Unicode capabilities. Data 
display and entry is completely realized as Unicode, including SQL commands.

* Multiple SQL Edits  Modern tabs are used to edit and display multiple SQL 
queries for easy comparison of query statements and results. SQL queries are 
executed by typing or loading them into the SQL edit. Then press F9 to run the 
query, or CTRL+F9 to run the current line or selection only.

* Time Measurement  SQL execution time is automatically measured and displayed 
to help optimize queries.

* Regular Expressions  The SQL keyword REGEXP is supported and adds the 
complete regular expression syntax of Perl 5.10 to SQLiteSpy.

* Mathematical SQL Functions  The following mathematical SQL functions are 
available in addition to the SQLite default: ACOS(), ASIN(), ATAN(), ATAN(), 
ATAN2(), CEIL(), CEILING(), COS(), COT(), DEGREES(), EXP(), FLOOR(), LN(), 
LOG(), LOG(), LOG2(), LOG10(), MOD(), PI(), POW(), RADIANS(), SIGN(), SIN(), 
SQRT(), TAN(), TRUNCATE().

* Data Compression  The SQL functions COMPRESS() applies zlib's deflate to any 
text or BLOB value. The raw deflate data stream is returned. UNCOMPRESS() 
inflates this stream back to the original. Integers, Doubles, and Nulls are 
returned unchanged.

* Compact Result Storage  The internal data storage mechanism uses SQLite's 
native data types for optimal compatibility. As a result, SQLiteSpy uses far 
less memory than other SQLite managers and handles large tables much more 
efficiently.

* Built in SQLite Engine  SQLiteSpy comes as a single file executable with the 
SQLite database engine already build into the application. There is no need to 
distribute any DLLs, which makes SQLiteSpy easy to deploy with customers.

* Easy Install & Uninstall  To run SQLiteSpy, just extract the SQLiteSpy.exe 
file to any directory and execute the file. No installation is needed  when 
first started, the program creates a single file SQLiteSpy.db3 (a SQLite3 
database) to store options and settings. It does not write any other files or 
to the registry. Uninstalling is as simple as deleting two files only: The 
application's executable and its options database file.

* Freeware  SQLiteSpy is Freeware for personal and educational use. If you are 
using SQLiteSpy commercially, your donation is welcome to promote the ongoing 
development of this software.

Ralf  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Impossible to declare field type BIGINT PRIMARY KEY

2008-12-15 Thread Ralf Junker

Markus Hoenicka wrote:

>If there's a way to find out at runtime that a column has been defined as 
>INTEGER PRIMARY KEY instead of as INTEGER, all is well and I'll be able to fix 
>the sqlite driver accordingly.

  pragma table_info(table);

In SQLite, this SQL returns all columns defined for table along with their 
types. Check the "pk" result column to see if a named column is a PRIMARY KEY 
column.

Official documentation is here

  http://www.sqlite.org/pragma.html#pragma_table_info

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS 3 Crash in 3.6.5

2008-11-13 Thread Ralf Junker

Shawn Wilsher wrote:

>I seem to have found a crash that is 100% reproducible in SQLite 3.6.5.  I'm 
>managed to make a reduced test case in a C file that can be found here: 
>http://files.shawnwilsher.com/2008/11/12/test.c

I could reproduce the problem and opened this ticket to make sure it does not 
get overlooked:

  http://www.sqlite.org/cvstrac/tktview?tn=3497

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] memsys3 vs memsys5

2008-09-09 Thread Ralf Junker


>> Is there any
>> reliable method to determine the minimum page-cache allocation size
>> needed for a given page_size?
>>
>
>sqlite3_status(SQLITE_STATUS_PAGECACHE_SIZE, ...)

Unfortunately, there is extra overhead involved which is not handled by the 
page cache but is allocated by malloc() if I understand correctly.

 From reading the sources (some weeks ago, the exact numbers might have 
changed) I determined an extra of at least 152 bytes are allocated for each 
page. 

Reference is nByteHdr in pager.c, plus an extra 4 byte (on 32 bit) for the slot 
the page hash (probably a little more to accommodate for empty slots). 
Interestingly, the page cache overhead size can differ between SQLite versions 
and also depends on SQLITE_OMIT_... compiler settings.

152 bytes might not seem much but this is nearly 15% for a page cache size of 
1024 bytes, and nearly 30% for a 512 byte page cache. So you safe overhead by 
choosing a larger page size.

Concluding, memory requirements are much higher than just 

  pragma page_size * pragma cache_size

This can sum up to lots of RAM when using page_size = 10 or more to speed 
up indexing of very large databases.

Users should be aware of this when running SQLite on extreme (low RAM or high 
size database) scenarios.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] rtree woes with SQLITE_OMMIT_...

2008-07-19 Thread Ralf Junker


>> I am running the rtree module against an SQLite build which has  
>> lots of functionality SQLITE_OMIT_...ed.
>
>Can you be more specific? Exactly which SQLITE_OMIT symbols
>are defined?

Sure. Here is the list:

  SQLITE_OMIT_ALTERTABLE
  SQLITE_OMIT_ANALYZE
  SQLITE_OMIT_ATTACH
  SQLITE_OMIT_AUTHORIZATION
  SQLITE_OMIT_AUTOINCREMENT
  SQLITE_OMIT_AUTOVACUUM
  SQLITE_OMIT_BETWEEN_OPTIMIZATION
  SQLITE_OMIT_BLOB_LITERAL
  SQLITE_OMIT_CAST
  SQLITE_OMIT_CHECK
  SQLITE_OMIT_COMPLETE
  SQLITE_OMIT_COMPOUND_SELECT
  SQLITE_OMIT_EXPLAIN
  SQLITE_OMIT_FLAG_PRAGMAS
  SQLITE_OMIT_FOREIGN_KEY
  SQLITE_OMIT_GET_TABLE
  SQLITE_OMIT_GLOBALRECOVER
  SQLITE_OMIT_INCRBLOB
  SQLITE_OMIT_INTEGRITY_CHECK
  SQLITE_OMIT_LIKE_OPTIMIZATION
  SQLITE_OMIT_LOAD_EXTENSION
  SQLITE_OMIT_MEMORYDB
  SQLITE_OMIT_OR_OPTIMIZATION
  SQLITE_OMIT_PAGER_PRAGMAS
  SQLITE_OMIT_PROGRESS_CALLBACK
  SQLITE_OMIT_QUICKBALANCE
  SQLITE_OMIT_REINDEX
  SQLITE_OMIT_SCHEMA_VERSION_PRAGMAS
  SQLITE_OMIT_SHARED_CACHE
  SQLITE_OMIT_SUBQUERY
  SQLITE_OMIT_TCL_VARIABLE
  SQLITE_OMIT_TEMPDB
  SQLITE_OMIT_TRACE
  SQLITE_OMIT_TRIGGER
  SQLITE_OMIT_UTF16
  SQLITE_OMIT_VACUUM
  SQLITE_OMIT_VIEW
  SQLITE_OMIT_XFER_OPT

I have reconstructed a few tests from rtree1.test and the errors only show with 
the SQLITE_OIMIT_... symbols defined. Otherwise the tests run just fine.

>Rtree uses ALTER TABLE as part of the xRename() callback. The
>xRename() callback is invoked when the sqlite user does an
>ALTER TABLE. So, the rtree code that uses ALTER TABLE will
>never be invoked when the module is being used by an sqlite
>build that does not support ALTER TABLE. I suspect the same
>is true of fts3.

Right, I did not think of the fact that with SQLITE_OMIT_ALTERTABLE defined the 
parser already triggers a syntax error so it never reaches the virtual tree.

>It looks like there are a few extra lines of code that could
>be omitted from the build when SQLITE_OMIT_ALTERTABLE is
>defined though.

Agreed, but this would not save much footprint so it might not be worth the 
effort. I was more thinking in terms of detecting potential errors at 
compilation before they occur at run-time.

Many thanks for looking into this,

Ralf  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] problem building reduced-size sqlite

2008-07-18 Thread Ralf Junker

Steve Friedman wrote:

>When trying to build sqlite (latest cvs update) with the following:
>
>../sqlite-latest/configure --disable-threadsafe --enable-tempstore=always \
>  --disable-tcl --enable-extensions \
>  CFLAGS="-g -O3 -DSQLITE_ENABLE_RTREE=1 \
>  -DTEMP_STORE=2 -DSQLITE_DEFAULT_CACHE_SIZE=100 \
>  -DSQLITE_DEFAULT_PAGE_SIZE=32768 \
>-DSQLITE_OMIT_ALTERTABLE=1 \

 From http://www.sqlite.org/compile.html:

1.5 Options To Omit Features

The following options can used to reduce the size of the compiled library by 
omitting optional features. This is probably only useful in embedded systems 
where space is especially tight, as even with all features included the SQLite 
library is relatively small. Don't forget to tell your compiler to optimize for 
binary size! (the -Os option if using GCC). Telling your compiler to optimize 
for size usually has a much large impact on library footprint than employing 
any of these compile-time options.

The macros in this section do not require values. The following compilation 
switches all have the same effect:
-DSQLITE_OMIT_ALTERTABLE
-DSQLITE_OMIT_ALTERTABLE=1
-DSQLITE_OMIT_ALTERTABLE=0 

If any of these options are defined, then the same set of SQLITE_OMIT_XXX 
options must also be defined when using the 'lemon' tool to generate a parse.c 
file. Because of this, these options may only be used when the library is built 
from source, not from the amalgamation or from the collection of pre-packaged C 
files provided for non-UNIX like platforms on the website. 

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] rtree woes with SQLITE_OMMIT_...

2008-07-18 Thread Ralf Junker

I am running the rtree module against an SQLite build which has lots of 
functionality SQLITE_OMIT_...ed.

Surprisingly, I receive strange errors like SQLITE_NOMEM for simple statements 
like

  CREATE VIRTUAL TABLE t6 USING rtree(ii, x1, x2);

or 

  SELECT ii FROM t6 WHERE x1>2;

Question: Does the rtree module rely on some SQLite functionality which I might 
have omitted?

I notice that rtree uses ALTER TABLE without checking for 
SQLITE_OMIT_ALTERTABLE. Even though this is not causing me problems right now, 
it might help to exclude alter table functionality from rtree.c conditinally or 
issue an appropriate error when called.

Btw, ALTER TABLE also applies to FTS3. Is this worth an extra thread or even a 
ticket?

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Follow-up: changed default ordering of SELECT queries in 3.5.9?

2008-05-30 Thread Ralf Junker

Hello Jens Miltner,

>However, I'd still like to know whether the new behavior of returning  
>the DISTINCT results in 'arbitrary' order is expected.

Unless ORDER BY is specified, the result of any SELECT is by definition 
undefined. This undefined sort order has changed between previous versions and 
might do so again in the future.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] SQLite allows "RowID" to be the name of a column

2008-05-22 Thread Ralf Junker

Darren Duncan wrote:

>Ralf Junker wrote:
>
>>Can you suggest an alternative to a single reserved name to represent the 
>>column which uniquely identifies a database record under any and all 
>>circumstances?
>
>Yes, change the interface to RowID into a routine call rather than a column 
>name; eg use "RowID()" rather than "RowID".  

I can not see how this would actually work with SQLite. Any use-created RowID 
column would override and hide the implicit rowid column even for the RowID() 
function, would it not?

>Then when using it in a SELECT, you can say "RowID() as foo" in the select 
>list where "foo" is different than a normal table field.  Such is how 
>'standard' SQL does it. 

What is 'standard" SQL? Can you give an example how this is used with other DB 
engines? I am not familiar with MySQL, but searching the documentation I could 
not find that it supports this concept. Maybe others do?

>Any manager app can read the database schema first and generate a name "foo" 
>that is distinct.

As things are at the moment, the implicit, unambigous RowID can not be 
retrieved from the database schema if all three "RowID", "_rowid_", and "OId" 
column names are overridden. This applies to SQL as well as to user-defined 
functions.

Ralf  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] SQLite allows "RowID" to be the name of a column

2008-05-21 Thread Ralf Junker

Darren Duncan wrote:

>I think the real problem here is that SQL allows you to have non-distinct 
>rows in a table, when all rows should be distinct.

SQLite's implicit "RowID" does not allow non-distinct values (except for NULL, 
but this is documented behavior and only maintained for backwards 
compatability. It might change in a future version. The sooner, the better, 
IMHO).

>Working within SQL's flaws, the solution here is for every table to have a 
>unique constraint on one or more table columns.  Then applications just use 
>that to uniquely identify the row.

This is exactly the concept of "RowID". Nothing wrong with that. My only 
criticism is that this concept can be rendered non-functional by redefining the 
"RowID" so that it violates the uniqueness constraint. Example:

  CREATE TABLE x (
RowID TEXT);

Now the implicit unique RowID is no longer accessible via the "RowID" column. 
Workarounds are "_rowid_" or "OID", but they can be overwritten as well:

  CREATE TABLE x (
RowID TEXT,
_rowid_ text,
oid text);

For this table, it is no longer possible to access the implicit, unique RowID. 
General database applications (GUI managers, for example) can no longer (re-) 
identify a particular record!

>Rows should be identifiable by user-visible data, not hidden data, since a 
>database is supposed to model reality and people identify things based on 
>their someway-visible attributes.

This is what INTEGER PRIMARY KEY is for: It works as a duplicate for the 
implicity "RowID":

  CREATE TABLE x (
ID INTEGER PRIMARY KEY);

For this table, the visible "ID" and the implicit "RowID" access the same 
unique data. This is the recommended usage and poses no problems. Problems only 
arise if "RowID" is re-defined differently as demonstrated above!

>On a separate note, it is best for one to be able to name a table or column 
>et al anything one wants, with all the choice of names as you can store in 
>a text column for user data.  Reserved words aren't an issue as long as 
>entity names are referred to with an unambiguously different syntax, such 
>as quoted identifiers as SQL does support.  Then database users don't have 
>to worry about implementation details and can name tables and columns 
>whatever they want; saying they can't name their column "RowID" is a leaky 
>abstraction. 

Sure we all dislike restrictions. Can you suggest an alternative to a single 
reserved name to represent the column which uniquely identifies a database 
record under any and all circumstances?

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] SQLite allows "RowID" to be the name of a column

2008-05-21 Thread Ralf Junker

Federico Granata wrote:

>have you seen here http://www.sqlite.org/autoinc.html ? 

Yes, I did. This documentation actually made me realize that the problem is not 
an implementation flaw but a design error, IMO. See my other answer in this 
thread for more rationale.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] SQLite allows "RowID" to be the name of a column

2008-05-21 Thread Ralf Junker

Bradley A. Town wrote:

>Ralf Junker wrote:
>
>> This alerts me to a potential danger for SQLite managers which must rely on 
>> some means to retrieve THE RowID which uniquely identifies a record for 
>> in-grid table editing. If the "RowID" name can be hijacked by other columns 
>> and given another purpose, it poses the danger that wrong wrong columns are 
>> updated and data is corrupted.
>>
>> How can I access the "RowID" given the above table declaration? I know about 
>> the "OID" and "_ROWID_" synonyms, but searching the documentation I find 
>> that they, too, can be used by other columns.
>>
>> I can therefore not see any non-ambiguous, reserved column name or API call 
>> to retrieve the implicit RowID value in such cases, especially if no primary 
>> key has been set like in the above schema.
>>
>> Any thoughts, especially from the SQLite developers?
>>
>> Thanks, Ralf
>>
>Creating another thread for this to avoid thread hijacking.

Thanks, I did not mean to hijack the thread.

But thinking more about hijacking "RowID" I am glad this is now a separate 
thread. Lack of a reseverd "RowID" column name to guarantee unambiguous record 
operations by general SQLite tools is a potential thread to data security IMO.

I would very much appreciate if this could be addressed in a future version of 
SQLite!

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Ticket 3128: DISTINCT -> GROUP BY can lead to performance degradation

2008-05-20 Thread Ralf Junker


>> This should NOT happen as SQLite usually rejects duplicate rowids with a
>constraint error.
>
>For this table, the badly-named column "RowID" is actually a text field
>that's allowed to be null, not a primary key.
>
>-- Describe NDXPARAMVALUES_LOCALIZEDSTRING
>CREATE TABLE 'ndxParamValues_LocalizedString'
>(
>'FamilyId' INTEGER NOT NULL ,
>'ParamId' INTEGER NOT NULL ,
>'RowID' TEXT COLLATE NOCASE,
>'Index' INTEGER ,
>'Value' INTEGER NOT NULL
>)

My oversight, sorry for that. I never questioned that SQLite would reject 
reserved word column names, but I now see that this is not so.

This alerts me to a potential danger for SQLite managers which must rely on 
some means to retrieve THE RowID which uniquely identifies a record for in-grid 
table editing. If the "RowID" name can be hijacked by other columns and given 
another purpose, it poses the danger that wrong wrong columns are updated and 
data is corrupted.

How can I access the "RowID" given the above table declaration? I know about 
the "OID" and "_ROWID_" synonyms, but searching the documentation I find that 
they, too, can be used by other columns.

I can therefore not see any non-ambiguous, reserved column name or API call to 
retrieve the implicit RowID value in such cases, especially if no primary key 
has been set like in the above schema.

Any thoughts, especially from the SQLite developers?

Thanks, Ralf  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Ticket 3128: DISTINCT -> GROUP BY can lead to performance degradation

2008-05-20 Thread Ralf Junker

Correcting myself:

This should NOT happen as SQLite usually rejects duplicate rowids with a 
constraint error.

>This should happen as SQLite usually rejects duplicate rowids with a 
>constraint error.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Ticket 3128: DISTINCT -> GROUP BY can lead to performance degradation

2008-05-20 Thread Ralf Junker

I looked at the database attached to the ticked and noticed that the table 
contains NULL rowids as well duplicate rowids. This should happen as SQLite 
usually rejects duplicate rowids with a constraint error.

When I run this query:

  select rowid, count() from ndxparamvalues_localizedstring 
group by rowid
order by 2 desc

I get the following results:

  rowid count ()
  NULL 1759 
  10 1601 
  20 1341 
  30 1281 
  40 1254 
  50 1200 
  ... more results follow, 3161 in total.

So I wonder how you managed to fill your database with duplicate rowids?

I also wonder if this is somehow related to your "problem"?

Ralf

>I've only found one reference to slower queries with the DISTINCT/GROUP 
>BY optimization that went in back in November for 3.5.3 and later.  I 
>would have expected more given the number of our queries causing problems.
>
>The problem as I wrote in ticket 3128 
> appears to be with 
>queries that use DISTINCT and LIMIT.  If the query is somewhat slow (in 
>my example, it's joining several large tables), the new DISTINCT 
>(implemented in the code as GROUP BY) can be much, much slower because 
>it (apparently) collects all or most of the rows before applying GROUP 
>BY, even when there's a LIMIT.  Before the change (3.5.2 and earlier), 
>DISTINCT was aided by the LIMIT quite a bit.  In my example, the query 
>takes around 0.02 seconds in 3.5.2 and earlier, 3.1 seconds in 3.5.3 and 
>later.
>
>Has anyone else seen similar behavior?  If so, have you found a 
>workaround?  My workaround is actually a patch to the source to disable 
>the optimization when a LIMIT is given, though there might be cases 
>where this is undesirable (like maybe cases where the table has indices 
>that can be used by GROUP BY and the LIMIT is sufficiently high).
>
>Brad Town

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] sqlite3_transfer_bindings obsolete?

2008-05-19 Thread Ralf Junker


>> Drat.  It doesn't look like there's a way to see what's already been
>> bound to a statement either, correct?

See this thread for a previous disuccsion of the problem:

http://www.mail-archive.com/sqlite-users@sqlite.org/msg28610.html

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] PRAGMA journal_mode not implemented?

2008-05-07 Thread Ralf Junker


>In PERSIST mode, you have two files associated with your database  
>instead of one.  Whenever you move, copy, or rename the database file  
>you *must* also move, copy or rename the journal file to prevent  
>database corruption.
>
>The persistent journal file uses disk space that might otherwise have  
>been returned to the operating system and made available to other  
>programs.

Say I have a database open in PERSIST mode for fast operation. Before the app 
shuts down, I would like to detach the database of its journal file (to free 
some storage space and guarantee a self contained, single file database).

Will a simple "PRAGMA journal_mode=DELETE" do this for me and automatically 
delete the journal file when (or even before) I close the database?

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS search negative term syntax

2008-05-01 Thread Ralf Junker


>Fair warning, though: It's not entirely clear that the fts search
>syntax should aim to hew too closely to consumer-oriented search
>syntax.

Interesting point, too. Up to now, I always perceived the FTS search syntax to 
be very much consumer-oriented. It it just too similar to major search engines  
to be primarily machine-oriented.

As it stands now, FTS syntax can of course be machine generated, if that is 
what you are aiming at. I believe that this should remain easy to do. And my 
suggested minus sign modification would not change this, would it?

>It's sort of in a strange place, most people would think it a
>poor idea (indeed, dangerous!) to put user-entered expressions in
>their WHERE clauses.

I am not sure I understand the danger. Say I sqlite3_bind() the FTS match 
query, do you see this as a serious security risk (FTS injection) or a 
potential performance jeopardy, or something else?

>Caveat for the above: I've spent all of five minutes thinking about
>your posting, and I was interrupted in the middle.  But I'll try to
>factor it in to future thinking.

Thanks for your time and your thoughts!

Ralf

PS: I see little traffic on <[EMAIL PROTECTED]>. Is this intentional, or should 
this and simliar topics better be discussed there? 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS search negative term syntax

2008-04-30 Thread Ralf Junker

Hello,

I have a small concern about the FTS negative term search syntax. Currently, 
all terms following any minus sign ("-") are excluded from the search. This is 
a very welcome feature, but consider searching for these hyphenated words:

  Coca-Cola   -> FTS finds Coca, but never Cola
  low-budget  -> FTS finds low, but never budget
  twelve-year-old -> FTS finds twelve, but never year and never old
  part-time   -> FTS finds part, but never time
  full-time   -> FTS finds full, but never time

These results do not match what most users will expect. Well, one can ask them 
to leave out the minus sign, but users will habitually leave it in because they 
learned from major search engines that it is the intended behavior. Consider 
Google, which explicitly states:

"Note: when you include a negative term in your search, be sure to include a 
space before the minus sign."

Source: 
http://www.google.com/support/bin/static.py?page=searchguides.html=basics

Therefore I would like to consider adding these search syntax rules:

1. A minus sign excludes a search term only when located at the beginning of 
the search query or after a white space (space, tab, etc.):

  "low-budget"  -> Find both low and budet.
  "low -budget" -> Find low, but not budget.
  "-low budget" -> Do not find low, but find budget.
  "-low-budget" -> Do not find the "low budget" phrase.

2. In case the minus sign is a term separator and two or more search terms are 
separated by sisngle minus signs only, they constitue a phrase search:

  "twelve-year-old" -> "twelve year old" (phrase search)
  "part-time"   -> "part time" (phrase search)

I believe that these changes would make the FTS search syntax more intuitive to 
use and more conformant to major search engines.

Would there be a chance that they could be implemented in current FTS3 and/or 
the upcomming FTS4? 

Any thoughts?

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] View bindings for a statement

2008-04-26 Thread Ralf Junker

Cole Tuininga wrote:

>The question is, is there an easy way to extract the actual query
>(with the bound variable set) from the statement handle? 

This topic has already been discussed in length some time ago:

  http://www.mail-archive.com/sqlite-users@sqlite.org/msg28610.html

It would be beautiful if some of the interfaces suggested there would one day 
make it into SQLite!

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] How to set memory usage as high as possible -- but not too high?

2008-04-24 Thread Ralf Junker

Jay A. Kreibich wrote:

>> >Are you using a 32 bit or 64 bit process.
>> 
>> 32, but it does not matter to the problem.
>
>  When you give examples like "PRAGMA cache_size = 1000;", or 10M
>  pages which would take something on the order of 15 GB of memory to
>  service with the default 1K page size, a few of us start to wonder.

Good point. But I believe you misunderstood my intention. I was not interested 
in absolute but in relative numbers, regardless of a 32 or 64 bit system. All I 
need to know is related to the amount of RAM available when the application 
starts. The aim is to use as much RAM as possible, but never more than 
reasonably available. The last bit is the difficult one.

>> >I am curious why you think memory is the bottleneck anyway!
>> 
>> It has often been pointed out on this list that inserts into
>> indexed tables (regular or FTS) run faster with a high page cache.
>> My own tests 2nd this. A few 100 MB more or less can make an
>> difference of more than 100%.
>
>  Given that the default page cache is 2000 pages, or on the order of
>  3MB, it seems that you're hitting some serious limits.  If hundreds
>  of megabytes (!) is giving you a return on the order of 2x, then there
>  is no magic bullet-- you aren't going to find a setting that suddenly
>  gives you a 10x speedup.  You're hitting diminishing returns in a
>  serious kind of way.

Some numbers: 32 bit app, inserting 6 million records into 2 tables each (text, 
fts, and blob). The database finally grows to over 6 GB in size. As the last 
step, a simple index is created on one text field.

With the default 2000 pages cache size (1 KB page size), this takes about a 
full day or more. Raising the page cache to some 18 pages uses about 270 MB 
of memory but brings the timing down to less than one hour.

My testing shows that inserts with lots of random disk searches (indexes, fts) 
hugely benefit from a large cache size for the simple reason that it reduces 
disk IO.

>  Personally, I'd pick a number, like half your RAM size or ~1.5GB*
>  (whichever is smaller), set the page cache, and be done with it.

That's what I ended up doing. In addition, I regularly check 
sqlite3_memory_used() and reduce the page cache if it exceeds the limit. Then I 
call sqlite3_release_memory() repeatedly until the memory usage has dropped 
sufficiently.

>  It sounds like you've already found most of the other PRAGMAs that
>  are going to get you something.  You might be able to tune the size
>  of your INSERT transactions, but if you're around 1000 or so, going
>  higher isn't likely to buy you too much.

Currently I use just a single transaction for all inserts into a newly created 
database. This reduces the number of cache flushes to a single time when all 
data is inserted and just the used memory is being freed.

As another optimization option I am looking forward for the new journal pragma 
and will hopefully not need to use journal file after all.

Thanks for the feedback and my apologies for the late response,

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] How to retrieve number of cached pages in memory?

2008-04-23 Thread Ralf Junker


>  See the "Pager" data structure and associated variables and functions
>  in "sqlite-3.5.x/src/pager.c".

OK, it seems that the number I am interested in is stored as part of the Pager 
struct:

int nPage;  /* Total number of in-memory pages */

Now I just have to figure out how to access this number reliably.

Many thanks,

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] How to retrieve number of cached pages in memory?

2008-04-22 Thread Ralf Junker

I need to retrieve the number of pages a SQLite database connection has 
currently allocated in memory. The documentation unfortunately turned up no 
results. I know about "PRAGMA cache_size", but this returns the maximum number 
of pages possibly allowed in the cache, not the actual number of pages 
currently cached.

My aim is to calculate the accurate number of bytes actually consumed by a 
single cached page. This figure will then allow to set PRAGMA cache_size to a 
more precise value in order to limit memory usage.

I do mind using undocumented APIs and will not cry tears if they change without 
notice, so any pointers are welcome!

Many thanks,

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] blob api

2008-04-22 Thread Ralf Junker


>> I'm using the sqlite3_blob_* api to write a larger text stream  
>> incrementally. Works a charm, but is there a way to
>> change the datatype of the blob to text afterwards ? I'd like to see  
>> the text easily in f.i. SQLiteSpy.
>
>Perhaps:  SELECT CAST(b AS TEXT) FROM table

Yes, this works well in SQLiteSpy.

>Really I suppose it depends on what SQLiteSpy is using to
>determine that the column type is BLOB.

SQLiteSpy determines the type of each record cell individually, just as it is 
returned by sqlite3_column_type(). So casting a BLOB to text will display it as 
such in SQLiteSpy.

In fact, SQLiteSpy colors the cell backgrounds by data type:

  White:  Text
  Green:  Integer
  Violet: Float
  Blue:   Blob
  Red :   Null

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] blob api

2008-04-22 Thread Ralf Junker


>Ok, I was kinda hoping for a more "permanent" solution such as:

Did you consider creating a view?

>int sqlite3_cast_column_type(sqlite3* db, const char* zDBname, const char* 
>zTable, const char* zColumn, int newColumnType);
>
>that would cast the column into the desired type, returning SQLITE3_ERROR if 
>the cast is invalid (like from double to integer, or text to numeric).

You can use a CASE statement for this:

  select case typeof (a) 
  when 'real' then a
  else 'Invalid type: ' || typeof (a) end  
  from my_table;

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] How to set memory usage as high as possible -- but not too high?

2008-04-14 Thread Ralf Junker

Dan,

many thanks for the quick response and detailed answers. However, a question or 
two still puzzle me.

>> * OPTION 1: PRAGMA cache_size = 1000;
>>
>> Advantage: SQLite will use ample memory, but no more than that.
>>
>> Disadvantage: Difficulty to establish exact memory requirements in  
>> advance. The help states that "Each page uses about 1.5K of  
>> memory.", but I found this to be wrong. Memory usage obviously  
>> depends on the page size, and my measurement shows that there is an  
>> additional small overhead of undocumented size. Is there a formula  
>> to calculate the required memory for a cache_size of x?

I'd be curious if you know an answer to this, too?

>> * OPTION 2: sqlite3_soft_heap_limit(max_mem_bytes);
>>
>> Disadvantage: My tests indicate that SQLite slows down drastically  
>> when it hits the memory limit. Inserts drop from a few hundred per  
>> second to just one or two per sec.
>
>That is an odd result. How did you test it? 

I set up a high cache_size and a lower sqlite3_soft_heap_limit() and started 
inserting blobs.

>What was the memory limit? Any chance the machine started using swap space?

I will test again and let you know.

>> * Does "PRAGMA cache_size=some_smaller_value;" cause SQLite to free  
>> used pages and release their memory straight away?
>
>No. If the cache_size parameter is set to a value that
>is less than the number of pages currently allocated for the
>cache, no more pages will be allocated. But no existing
>pages will be freed.

Good to know. So I would reduce the cache_size and then use 
sqlite3_release_memory() to free memory, right?

Maybe this is worth documenting?

>Does SQLite really run faster with 1GB available than it would with 100MB?

Yes. If there are indexes (regular or FTS) on the table, SQLite needs quick 
access to lots of pages for searching and rearranging b-tree entries. My 
timings show that 100MB or 500MB can sometimes make a difference of more than 
100%.

Richard recently talked about upcoming indexing performance improvements. I 
wonder if they are part of the performance refactoring due with the next 
release? :-)

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] How to set memory usage as high as possible -- but not too high?

2008-04-14 Thread Ralf Junker

Roger Binns wrote:

>Are you using a 32 bit or 64 bit process.

32, but it does not matter to the problem.

>Also is there a requirement to create the database in the filesystem?

Yes.

>If not you could ensure your swap is sufficiently large (I use a mininmum of 
>16GB on my machines :-) and create in a tmpfs filesystem, and then copy the 
>database to
>persistent storage when you are done.

The aim is to avoid slow swap memory but use fast RAM only.

>You also didn't list turning off synchronous etc while creating the database 
>and turning it back on when done.

Performance settings are:

* PRAGMA locking_mode=exclusive;
* PRAGMA synchronous=off;
* Disable journal file :-)

>I am curious why you think memory is the bottleneck anyway!

It has often been pointed out on this list that inserts into indexed tables 
(regular or FTS) run faster with a high page cache. My own tests 2nd this. A 
few 100 MB more or less can make an difference of more than 100%.

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] How to set memory usage as high as possible -- but not too high?

2008-04-13 Thread Ralf Junker

I need to create a huge database (about 6 GB, more than 6 mio records, blobs, 
and FTS text) in as little time as possible. Since memory is the key to speed, 
I try to use as much memory as is available. However, there is the danger of 
running out of memory. This is where memory usage control comes into play. I 
can see there are two options:


* OPTION 1: PRAGMA cache_size = 1000;

Advantage: SQLite will use ample memory, but no more than that.

Disadvantage: Difficulty to establish exact memory requirements in advance. The 
help states that "Each page uses about 1.5K of memory.", but I found this to be 
wrong. Memory usage obviously depends on the page size, and my measurement 
shows that there is an additional small overhead of undocumented size. Is there 
a formula to calculate the required memory for a cache_size of x?


* OPTION 2: sqlite3_soft_heap_limit(max_mem_bytes);

Advantage: Memory limit can be set to a known value (amount of free memory as 
returned from the OS).

Disadvantage: My tests indicate that SQLite slows down drastically when it hits 
the memory limit. Inserts drop from a few hundred per second to just one or two 
per sec.


* OPTION 3: Catch out-of-memory errors and reduce cache_size accordingly 
(untested scenario).

Advantage: Use memory up to the least bits available.

Disadvantage: How to avoid data loss after the out-of-memory error. Can I just 
call sqlite3_release_memory(some_bytes) and sqlite3_step again and again until 
it passes without the out-of-memory error?


This raises a few questions:

* Do sqlite3_soft_heap_limit(), or "PRAGMA cache_size=x;", or both establish 
SQLite's upper memory limit? Do they work independently of each other, i.e. 
does the lower limit always kick in first?

* Does "PRAGMA cache_size=some_smaller_value;" cause SQLite to free used pages 
and release their memory straight away?

* Is there another runtime -- important! -- setting to establish a maximum 
memory limit, possibly undocumented?


In the end this boils down to a simple problem:

* Wow to keep SQLite's memory usage as close to, but not exceeding the memory 
available to applications?

I will be very grateful for any suggestion!

Many thanks,

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Unrecognized "Z" UTC time zone signifier

2008-02-22 Thread Ralf Junker

Aristotle Pagaltzis wrote:

>* [EMAIL PROTECTED] <[EMAIL PROTECTED]> [2008-02-21 13:45]:
>> Ralf Junker <[EMAIL PROTECTED]> wrote:
>> > SQLite does not recognize "Z" as the zero offset time zone
>> > specifier. 
>> 
>> If we start accepting any symbolic timezone names, seems like
>> we would then need to start accepting them all.
>Not hardly. FWIW, the IETF recommendation for timestamps in
>any new internet standards is to use the format specified in
>RFCÂ 3339, which is based on codified experience. For time zones,
>it prescribes that they be given as either a numeric offset or
>`Z` a shortcut for `+00`; no provision is made for other symbolic
>names as those only cause trouble. So you should have no trouble
>refusing requests to support those.

Richard did it, and it works like a charm:

  http://www.sqlite.org/cvstrac/chngview?cn=4805

Many thanks!

Ralf  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Unrecognized "Z" UTC time zone signifier

2008-02-21 Thread Ralf Junker

[EMAIL PROTECTED] wrote:

>Ralf Junker <[EMAIL PROTECTED]> wrote:
>> SQLite does not recognize "Z" as the zero offset time zone specifier. 
>
>SQLite does not currently accept any timezone specifiers, other
>than a hard-coded timezone offset:
>
>   1981-04-06T14:45:15+01:00
>
>If we start accepting any symbolic timezone names, seems like we
>would then need to start accepting them all.  If am reluctant to
>open the floodgates

Yes, I know about your strict policy of adding new features to SQLite, and 
please know that I do appreciate it. But this does not mean we have go give up 
easily on new features, but provide better arguments instead. Let my try:

1. "Z" is part of the ISO standard and therefore used with external date and 
time data. Supporting it makes it much easier to import such data into SQLite.

2. "Z" is not a soft-coded timezone specifier like "CET" or similar. It is just 
a special case of the hard-coded "1981-04-06T14:45:15+00:00".

3. Using "Z" explictly distinguises UTC from local time zones and avoids 
disambiguities.

4. Asking for "Z" I do not want to open any floodgates. If this was my 
intention, I would have asked for 

  * ±[hh][mm] and ±[hh] -- currently missing but nice to have, IMHO
  * named timezones ('MET' or 'Europe/Moscow')  -- just a joke
  * daylight saving time support -- kidding only

Ralf 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Unrecognized "Z" UTC time zone signifier

2008-02-21 Thread Ralf Junker

SQLite does not recognize "Z" as the zero offset time zone specifier. "Z" (for 
'Zulu time', an alternative name for UTC) is part of the ISO 8601 standard for 
date and time representations. See http://en.wikipedia.org/wiki/ISO_8601 for 
details.

In this regard, SQLite does not currently follow the standard and rejects the 
following valid ISO 8601 dates as NULL:

  select datetime ('1981-04-06T14:45:15Z');
  select datetime ('14:45:15Z');

As far as I can tell, "Z" support only requires a very minor change in date.c, 
parseTimezone() to recognize the if the "Z" character is present. No further 
timezone modification is necessray.

A test case scenario would be

  datetest 5.8 {datetime('1994-04-16 14:00:00Z')} {1994-04-16 14:00:00}
  # According to Wikipedia, timezone should directly follow time.
  # SQLite, however, allows whitespace inbetween.
  # Question: Does this conform to ISO 8601?
  datetest 5.9 {datetime('1994-04-16 14:00:00 Z')} {1994-04-16 14:00:00}
  # Whitespace after Z.
  datetest 5.10 {datetime('1994-04-16 14:00:00Z ')} {1994-04-16 14:00:00}
  # Whitespace before and after Z.
  datetest 5.11 {datetime('1994-04-16 14:00:00 Z ')} {1994-04-16 14:00:00}

If "Z" timezone support was implemented, I would volunteer to update the date 
time function documentation in the SQLite Wiki.

Ralf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] How to specify regular expression in a query?

2008-01-28 Thread Ralf Junker

Hello Nicolas Williams,

>I suppose that to make this generic so that users can replace the
>regexp, like, and glob functions would require some new interfaces.

I believe so, too. The like and glob optimization is part of where.c and 
outside the reach of sqlite3_create_function().

>SQLite would have to be able to extract a constant prefix from the
>pattern in order to be able to use an idex in this case.

The regular expression engine I use is able to tell if a pattern is anchored at 
the beginning and which letter starts the pattern. I believe that this 
information is sufficient for an index to narrow down the search. SQLite just 
needs to provide the API to pass the prefix plus, possibly, which index to use.

I believe that this API would also ease implementations of Unicode LIKE and 
GLOB.

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] How to specify regular expression in a query?

2008-01-28 Thread Ralf Junker

Hello Bharath Booshan L,

>>select * from t where filepath regexp '/MyMovie(\.[^\.]+)*$';
>
>Will this query use index, if we had one, on filepath?

No. It will do a full table scan.

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] How to specify regular expression in a query?

2008-01-25 Thread Ralf Junker

Hello Bharath Booshan L,

yes, with SQLiteSpy you can do this:

drop table if exists t;
create table t (id integer primary key, filepath text);
insert into t values (1, '/Volumes/Backup/MyMovies/MyMovie.mp4');
insert into t values (2, '/Volumes/Backup/MyMovies/Hello.mp4');
insert into t values (3, '/Volumes/Tiger/MyMovie.mov');
select * from t where filepath regexp '/MyMovie(\.[^\.]+)*$';

Ralf

> I have to perform a search something similar to this
>
>  ID FilePath
>   1  /Volumes/Backup/MyMovies/MyMovie.mp4
>   2  /Volumes/Backup/MyMovies/Hello.mp4
>   3  /Volumes/Tiger/MyMovie.mov
>
>
>Search for file name MyMovie should retrieve
>
>   ID FilePath
>   1  /Volumes/Backup/MyMovies/MyMovie.mp4
>   3  /Volumes/Tiger/MyMovie.mov
>
>
>To simplify, I am searching for a file name from a collection of absolute
>file paths.
>
>How can I achieve this in SQLite?
>
>Is there anyways I can use regular expression in a query to perform string
>matching.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Proposed change to sqlite3_trace()

2008-01-11 Thread Ralf Junker

DRH wrote:

>Legacy applications should continue to work.  You might get a 
>compiler warning about a type mismatch on the 2nd parameter to
>sqlite3_trace().  But calling a 2-parameter function with 3
>parameters is suppose to be harmless in C. 

Harmless in C, but not so in other languages. I therefore suppose that the 
change will break compatability for quite a few non-C applications.

Given that it does break backward compatability for many (I personally would 
not mind given the feature enhancement), I also would not mind to see further 
enhancements along the line suggested by Roger.

Ralf 

-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Proposed change to sqlite3_trace()

2008-01-11 Thread Ralf Junker

Roger Binns wrote:

>The biggest problem with the trace api is that there is no way to find
>out what the bound parameters were.  If an application follows best
>practise using bound parameters all over then the trace api is rather
>useless.

Thanks for bringing this up again. There was a thread about how to retrieve 
bound parameters from a prepared (and bound) statement some time ago. Responses 
quite some interest in such functionality. Dan finally came up with a very 
smart, but unfortunately slow, workaround-solution. Search the archive for 
"sqlite3_bound_int" to find it.

I agree with your proposal and believe that it would make a nice addition to 
the profile callback.

More specifically, I would like to see in the callback a pointer to the 
currently running statement, plus its origin as proposed by Richard (SQL, 
Trigger, etc). I believe that the statement contains almost all information 
ever required, like SQL, bound parameters, and future additions. It only needs 
a few access functions to make use of them, but they will not require 
additional changes to the profile API.

Ralf  

-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Lemon: Conflicts with repeated TERMINALS

2007-11-29 Thread Ralf Junker

Joe Wilson <[EMAIL PROTECTED]> wrote:

>The following grammar may be clearer to you:

Yes, it is many thanks! I believe I am making progress! At least I can see the 
picture much clearer now and was able to come up with the following grammar 
with just one conflict unsolved:

  %left NEWLINE.   /* Do these matter here at all? */
  %nonassoc TEXT LINK.
  %left HEADING_START.
  %left HEADING_END.

  article ::= blocks.

  blocks ::= block. /* EOF */
  blocks ::= blocks NEWLINE./* EOF */
  blocks ::= blocks NEWLINE NEWLINE block.

  block ::= .   /* EOF */
  block ::= paragraph.
  block ::= heading.

  heading ::= HEADING_START text HEADING_END.

  paragraph ::= line.
  paragraph ::= paragraph NEWLINE line.

  line ::= text.

  text ::= textpiece.
  text ::= text textpiece.

  textpiece ::= TEXT.
  textpiece ::= LINK.

I of course appreciate any comments ;-) My idea is that

* A block can be either a paragraph or a heading. Multiple blocks are separated 
by two NEWLINEs.

* A paragraph is made up of n >= 1 lines. Each line within a paragraph ends 
with a single NEWLINE. Two NEWLINEs start a new block (see above).

* A line consists of text, which can be TEXT or LINK.

Not all works well with the grammer, and unfortunately I do not understand why. 
Given this input, for example:

  TEXT, NEWLINE

the parser gets stuck at

  paragraph ::= paragraph NEWLINE line.

instead of falling back to the line above

  paragraph ::= line.

to find the conditions of a paragraph fulfilled. Why does it not try the other 
alternatives? Or are there none in the grammar?

>Try reading some papers on parsing or search for the book
>"Compilers: Principles, Techniques, and Tools" (a.k.a. 
>the dragon book).

I certainly will.

>Also try writing on paper random sequences of tokens and 
>manually parse your grammar to see the conflicts firsthand.

As I throw different token sequences to my experimental parser I am slowly 
starting to make sense of the debugger output.

Ralf 

-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Undefined collation: Peculiar observations ...

2007-11-29 Thread Ralf Junker

Hello Trevor Talbot,

>> Trevor, I am not sure what you mean by "subselect on its own". Is this what 
>> you are looking for?
>
>> sqlite> SELECT * FROM (SELECT * FROM a);
>> SQL error: no such collation sequence: unknown
>
>Yes, exactly.  I was curious to see if it made any kind of difference.
>Unfortunately I don't have an explanation/fix for you though.

I created two tickets about these collation peculiarities yesterday.

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Lemon: Conflicts with repeated TERMINALS

2007-11-29 Thread Ralf Junker

Many thanks, Joe,

>Your grammar is ambiguous. The text tokens run together for 
>various rules because the grammar lacks clear separators between 
>them. 

OK, I begin to understand. The "clear separators" need to be TERMINALs, right? 
I believed that these were imlicit because there are TEXT and LINK after all 
text tokens are fully expanded. Therefore I thought that the grammar would not 
be ambiguous.

>You can fix it a million ways by altering your grammar.

Thanks for the suggestions - I can see that they do not generate conflicts, but 
they certainly alter the grammar.

>Here is one way:
>
>  article ::= blocks.
>
>  blocks ::= block.
>  blocks ::= blocks block.
>
>  block ::= heading.
>  block ::= paragraph.
>
>  heading ::= HEADING_START text HEADING_END.
>  heading ::= HEADING_START text.
>  heading ::= HEADING_START.
>
>  paragraph ::= PARA text.
>
>  text ::= textpiece.
>  text ::= text textpiece.
>
>  textpiece ::= TEXT.
>  textpiece ::= LINK.

I observed the new PARA terminal token (the clear separator!?). Unfortunately 
the lexer does not generate such a token. Paragraph repeats are also removed.

>Here's another:
>
>  article ::= blocks.
>
>  blocks ::= block.
>  blocks ::= blocks block.
>
>  block ::= heading NEWLINE.
>  block ::= paragraph NEWLINE.
>
>  heading ::= HEADING_START text HEADING_END.
>  heading ::= HEADING_START text.
>  heading ::= HEADING_START.
>
>  paragraph ::= text.
>
>  text ::= textpiece.
>  text ::= text textpiece.
>
>  textpiece ::= TEXT.
>  textpiece ::= LINK.

This one also removes paragraph repeats, doesn't it? Unfortunately paragraphs 
need to repeat for my grammar. Is there a way to achieve this without conflicts?

>Lemon generates an .out file for the .y file processed.
>You can examine it for errors.

I have tried to make sense of the .out file before. It tells me where to look 
for the problem, but not how to fix it ...

I am sorry to appear stupid, but I still can not make sense of it all. Can 
someone still help, please?

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

[sqlite] Lemon: Conflicts with repeated TERMINALS

2007-11-28 Thread Ralf Junker

I am trying to write a Wiki parser with Lemon. The Lemon features suite my 
needs perfectly, but I am unfortunately stuck with the problem of parsing 
conflicts.

All conflicts seem caused by repeat constructs like this:

  text ::= textpiece.
  text ::= text textpiece.

The complete grammar follows below and results in 10 conflicts.

I have read the manual, looked at tutorials, and searched the mailing list, but 
nothing helped me to reduce the number of conflicts. Changing token order even 
tends cause more of them.

Reading similar grammars for Bison makes me wonder why Bison apparently has no 
problems with them but Lemon does. Am I doing something wrong or is this simply 
not possible with Lemon?

Ralf

---

article ::= blocks.

blocks ::= block.
blocks ::= blocks block.

block ::= heading.
block ::= paragraph.

heading ::= HEADING_START text HEADING_END.
heading ::= HEADING_START text.
heading ::= HEADING_START.

paragraph ::= text NEWLINE.
paragraph ::= paragraph text NEWLINE.
paragraph ::= text.
paragraph ::= paragraph text.

text ::= textpiece.
text ::= text textpiece.

textpiece ::= TEXT.
textpiece ::= LINK.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Undefined collation: Peculiar observations ...

2007-11-28 Thread Ralf Junker


>> Imagine that a SQLite3 database opened in a custom application with a 
>> registered a collation sequence named "unknown" has created the following 
>> table:
>>
>>   CREATE TABLE a (b COLLATE unknown);
>>
>> Now open this table in the default SQLite3 CLI. Up to here, everything works 
>> as expected.
>>
>> Now some peculiar observations:
>
>> 2. Running simple queries like "SELECT * FROM a;" work fine. But subselects, 
>> in their most basic form and with no sorting or comparisons, do not:
>>
>> sqlite> SELECT * FROM a, (SELECT * FROM a);
>
>That's not just a subselect, it's also a join.  Does a subselect on
>its own have the same behavior?

Thanks all for the feedback.

Trevor, I am not sure what you mean by "subselect on its own". Is this what you 
are looking for?

sqlite> INSERT INTO a VALUES ('one');

sqlite> SELECT * FROM (SELECT * FROM a);
SQL error: no such collation sequence: unknown

sqlite> SELECT *, * FROM a;
one|one

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

[sqlite] Undefined collation: Peculiar observations ...

2007-11-25 Thread Ralf Junker

Imagine that a SQLite3 database opened in a custom application with a 
registered a collation sequence named "unknown" has created the following table:

  CREATE TABLE a (b COLLATE unknown);

Now open this table in the default SQLite3 CLI. Up to here, everything works as 
expected.

Now some peculiar observations:


1. Issue "PRAGMA collation_list;" and notice that "unknow" lists next to the 
other registered collations, even though "unknown" is not registered with the 
default SQLite3 CLI:

sqlite> PRAGMA collation_list;
0|unknown
1|NOCASE
2|BINARY

Question 1: Is this the expected behaviour, or should not "PRAGMA 
collation_list;" rather list registered collations only?


2. Running simple queries like "SELECT * FROM a;" work fine. But subselects, in 
their most basic form and with no sorting or comparisons, do not:

sqlite> SELECT * FROM a, (SELECT * FROM a);
SQL error: no such collation sequence: unknown

This is surprising to me because I do not see where the collation sequence  
should matter to this query.

To demonstrate, here is the explain output of a table with a registered 
collation sequence. No mention of the collation name here:

sqlite> CREATE TABLE b (b collate nocase);
sqlite> EXPLAIN SELECT * FROM b, (SELECT * FROM b);
0|Goto|0|17|
1|Integer|0|0|
2|OpenRead|0|3|
3|SetNumColumns|0|1|
4|Integer|0|0|
5|OpenRead|2|3|
6|SetNumColumns|2|1|
7|Rewind|0|14|
8|Rewind|2|13|
9|Column|0|0|
10|Column|2|0|
11|Callback|2|0|
12|Next|2|9|
13|Next|0|8|
14|Close|0|0|
15|Close|2|0|
16|Halt|0|0|
17|Transaction|0|0|
18|VerifyCookie|0|4|
19|TableLock|0|3|b
20|Goto|0|1|
21|Noop|0|0|

Question 2: Why does this happen, and is there a way to work around the problem 
by issuing explicit collation sequences?

Thanks,

Ralf


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

[sqlite] Can sqlite3_reset() ever fail?

2007-11-08 Thread Ralf Junker

Hello,

I wonder if sqlite3_reset() can ever fail. In particular, does sqlite3_reset() 
always reset the statement even if it returns an error code?

>From the documentation and mailing list, my understanding is that 
>sqlite3_reset() errors always relate to the latest (or possibly ongoing) VM 
>execution triggered by sqlite3_step(). Whatever VM error returned, the 
>statement itself will nevertheless be reset after the call.

I mostly conclude this reasoning from the sqlite3_finalize() documentation and 
source code. However, since this is not explicitly spelled out for 
sqlite3_reset(), I would like to ask if sqlite3_reset()

* can also be called at any point during the execution of the virtual machine?

* will also result in an error or interrupt if the virtual machine has not 
completed execution, roll back or cancel transactions, and return SQLITE_ABORT?

For the new sqlit3_prepare_v2 API, can sqlite3_reset() and sqlite3_finalize() 
return any error codes except for SQLITE_ABORT that indicate incomplete DB 
operations not already indicated by sqlite3_step()?

Ralf


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] How many virtual table implemenations are there out there?

2007-11-02 Thread Ralf Junker


>But to do so seems likely to require an incompatible change to the 
>virtual-table interface.

Could I kindly request an addition to the incompatible change to the 
virtual-table interface?

I would very much appreciate a corresponding function to 

  function xRowID(
  pCursor: psqlite3_vtab_cursor;
  pRowID: PInt64): Integer;

which would notify virtual table implementations that the the SQLite engine no 
longer uses this particular RowID like, for example:

  function xRowID_Done(
  pCursor: psqlite3_vtab_cursor;
  pRowID: PInt64): Integer;

The reason behind this is that some DB engines store RowIDs / BookMarks in 
malloced memory structures. Obviously, they have to be freed when no longer in 
use. Unfortunately, the current VT interface does not give notice when this is 
the case. With xRowID_Done, implementations will be able to free malloced 
memory when no longer needed by SQLite and thus avoid accumulating malloced 
RowIDs until the table is closed.

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Retrieve bound host parameters from statement?

2007-10-30 Thread Ralf Junker

Hello Joe Wilson,

>> True, but we would need to access unsupported API to do so. And as we know 
>> only too well,
>> unsupported API is subject to change without notice any time ;-). Therefore 
>> I would rather not
>> write these myself but ask for the possibility to add them to the library 
>> officially, even if
>> "experimental" only.
>
>Then you'll be waiting forever.
>
>If you post a patch implementing the feature, at least some
>other like-minded programmers might get some benefit from it,
>or at least generate some feedback.

The simple patch is not enough. To have any value, it must be supported and 
tested for upcomming versions of SQLite. Using undocumented APIs always carries 
the risk that the patch will break in the future, possibly staying unnoticed 
until it caused serious problems to someone.

Dan's solution is safe in this regard, and should be preferred - even if using 
unofficial API calls will certainly be faster.

Maybe we'll be lucky and will not be waiting forever!?

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Retrieve bound host parameters from statement?

2007-10-30 Thread Ralf Junker

Hello Dan Kennedy,

>> True, but we would need to access unsupported API to do so. 
>> And as we know only too well, unsupported API is subject to 
>> change without notice any time ;-). Therefore I would rather 
>> not write these myself but ask for the possibility to add them 
>> to the library officially, even if "experimental" only.
>
>Depends how desperate you are. Say you want to query statement 
>object X that has 4 variables, you could do this:
>
>  pTmp = sqlite3_prepare("SELECT ?, ?, ?, ?");
>  sqlite3_transfer_bindings(X, pTmp);
>  /* Use sqlite3_step() etc. to fish values out of pTmp */
>  sqlite3_transfer_bindings(pTmp, X);
>  sqlite3_finalize(pTmp);

Smart, many thanks!

Still, I believe that faster sqlite3_bound... functions would be a useful 
addition to the official API. It seems only logical to have corresponding read 
functions to complement the write functions.

They would, for example, help to fill the gap of the unresolved host parameters 
if the trace callback was changed to carry along the prepared statement being 
executed.

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Retrieve bound host parameters from statement?

2007-10-29 Thread Ralf Junker


>> I wonder if it is possible to retrieve bound host parameters from a prepared 
>> SQL statement? I am
>> thinking of the opposite of the sqlite3_bind... family of functions like:
>> 
>>   int sqlite3_bound_int (sqlite3_stmt*, int*);
>>   int sqlite3_bound_double (sqlite3_stmt*, double*);
>
>You'd also need to specify the index of the ? parameter you're seeking. 

Certainly. Sorry for the ommission, glad you pointed this out.

>> They would be usefull to work around the sqlite3_trace() limitation which 
>> does not replace host
>> parameters in the SQL. With the sqlite3_bound... functions, the trace 
>> callback would be able
>> retrieve the parameter values from the statement and replace them or log 
>> them separately.
>
>You could create all this functionality in your wrapper level above
>the sqlite3 API.
>
>It would be easy enough for you to modify the sqlite3 sources to add
>such functions to fish the values out of the internal Vdbe.aVar Mem 
>array of the sqlite3_stmt. If the type does not match what is stored 
>internally, or something was not previously bound or out of range, I 
>imagine an SQLITE_ERROR could be returned. Or maybe you want your 
>bound* functions to coerce the bound value to the type you specify.

True, but we would need to access unsupported API to do so. And as we know only 
too well, unsupported API is subject to change without notice any time ;-). 
Therefore I would rather not write these myself but ask for the possibility to 
add them to the library officially, even if "experimental" only.

>Another complementary function, say sqlite3_bound_type, could 
>return the type(s) of the bound field. (I say types plural because
>sometimes a value can be a combination of types at the same time - 
>i.e., MEM_Real|MEM_Int). These internal types would have to be 
>exposed if you required such functionality.
>
>#define MEM_Null  0x0001   /* Value is NULL */
>#define MEM_Str   0x0002   /* Value is a string */
>#define MEM_Int   0x0004   /* Value is an integer */
>#define MEM_Real  0x0008   /* Value is a real number */
>#define MEM_Blob  0x0010   /* Value is a BLOB */

Indeed very much agreed to!

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

[sqlite] Retrieve bound host parameters from statement?

2007-10-29 Thread Ralf Junker

Hello,

I wonder if it is possible to retrieve bound host parameters from a prepared 
SQL statement? I am thinking of the opposite of the sqlite3_bind... family of 
functions like:

  int sqlite3_bound_int (sqlite3_stmt*, int*);
  int sqlite3_bound_double (sqlite3_stmt*, double*);

They would be usefull to work around the sqlite3_trace() limitation which does 
not replace host parameters in the SQL. With the sqlite3_bound... functions, 
the trace callback would be able retrieve the parameter values from the 
statement and replace them or log them separately.

Ralf


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Lemon: Help on conflic resolution?

2007-10-17 Thread Ralf Junker

Richard,

this helped me greatly! I also derived from your example that I can use 
multiple characters without conflicts like this:

---

doc ::= inline_list.

// List of allowed characters. Add more as you like.

c ::= CHAR.
c ::= SPACE.

// The c character repeat.

chars ::= c.
chars ::= chars CHAR.

// Any sequence of just c and 'c' (c surrounded by apostrophes).

inline ::= c.
inline ::= APOS chars APOS.
 
// The inline repeat.
 
inline_list ::= inline.
inline_list ::= inline_list inline. 

-

Many thanks!

Ralf


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Lemon: Help on conflic resolution?

2007-10-17 Thread Ralf Junker

Richard,

this helped me greatly! I also derived from your example that I can use 
multiple characters without conflicts like this:

---

doc ::= inline_list.

// List of allowed characters. Add more as you like.

c ::= CHAR.
c ::= SPACE.

// The c character repeat.

chars ::= c.
chars ::= chars CHAR.

// Any sequence of just c and 'c' (c surrounded by apostrophes).

inline ::= c.
inline ::= APOS chars APOS.
 
// The inline repeat.
 
inline_list ::= inline.
inline_list ::= inline_list inline. 

-

Many thanks!

Ralf


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

[sqlite] Lemon: Help on conflic resolution?

2007-10-17 Thread Ralf Junker

I am writing to ask for help about how to solve The Lemon parser conflicts.

As part of a larger grammar, I am need to implement this regular expression in 
Lemon:

  (.+|'.+')+

I tried lots of grammars, but all of them generated Lemon warnings. 

Maybe someone could have a look at the grammar below and let me know how the 
conflicts can be solved, and why they are generated in the first place?

Many thanks,

Ralf

--

%left CHAR. // Any character, except for apostrophe.
%left APOS. // Apostrophe only.

doc ::= inline.

// One ore more CHARs.

chars ::= CHAR.
chars ::= chars CHAR.

// Any sequence of just CHARs and 'CHARs' (surrounded by apostrophes).

inline ::= chars.
inline ::= APOS chars APOS.

// The repeat. This causes conflicts. Isn't it allowed? Workarounds?

inline ::= inline inline.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] FTS3 where ?

2007-09-21 Thread Ralf Junker

Hello Andre du Plessis,

If you are using Delphi, FTS3 is already included in the latest DISQLite3 (Pro 
and Personal). Download is available from http://www.yunqa.de/delphi/.

The source code is available from CVS. You will find FTS3 in the /ext/ 
directory.

Ralf

>Fts3 which everyone is talking about, I cannot see any mention of it on
>the download page, does it mean that its just the development sourcecode
>which people are compiling at this point or are there some prebuilt
>dll's available, or has it not been officially released yet?


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] 3.5.0 alpha TryEnterCriticalSection linker error

2007-09-05 Thread Ralf Junker


>Isn't it time to drop the Win9X support from the default build?

I do not believe that just because Win9x is missing a single required call 
justifies dropping support for it altogether!

>I'm thinking that any optimization should be enabled for the majority of 
>users. Or if it's not really an optimization, why keeping it in the code then?

If possible, please keep the optimization.

>An alternative is to call this function when available using "GetProcAddress" 
>(this is the case for a lot of other modern calls that cannot be done right 
>now).

I second this alternative. 

According to http://msdn2.microsoft.com/en-us/library/ms686857.aspx, 
TryEnterCriticalSection() is available on all Windows NT sytems. Therefore an 
option to "GetProcAddress()" is checking for such OSes. The isNT() routine is 
already part of os_win.c and is used there frequently:

static int isNT(void){
if( sqlite3_os_type==0 ){
  OSVERSIONINFO sInfo;
  sInfo.dwOSVersionInfoSize = sizeof(sInfo);
  GetVersionEx();
  sqlite3_os_type = sInfo.dwPlatformId==VER_PLATFORM_WIN32_NT ? 2 : 1;
}
return sqlite3_os_type==2;
  }


sqlite3_mutex_try() would then extend to something like this (untested!):

int sqlite3_mutex_try(sqlite3_mutex *p){
  int rc;
  assert( p );
  assert( p->id==SQLITE_MUTEX_RECURSIVE || sqlite3_mutex_notheld(p) );
  if( isNT() && TryEnterCriticalSection(>mutex) ){
p->owner = GetCurrentThreadId();
p->nRef++;
rc = SQLITE_OK;
  }else{
rc = SQLITE_BUSY;
  }
  return rc;
}

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] rowid versus docid for fts3.

2007-09-02 Thread Ralf Junker

Scott Hess wrote:

>Unfortunately, the reason fts2 couldn't be "fixed" was because you
>can't perform the necessary ALTER TABLE if the column you're adding is
>a primary key.

Sure, I was aware of this problem.

>Since the only alternative would be to build a new
>table and copy everything over, it seemed more reasonable to just let
>the app developer do that, rather than forcing it on them under the
>covers.

True also. I know that my "compatible" proposal would not update existing 
FTS2.0 tables (with the vacuum bug) to FTS2.1 (fixed vacuum bug with rowid 
INTEGER PRIMARY key). But it should at least be possible to continue using old 
FTS2.0 tables with this new FTS2.1.

It should also be possible (untested and highly speculative) for FTS2.0 to read 
tables generated by the new FTS2.1. Old FTS2.0 will just not be able to write 
to or update tables created by FTS2.1. However, since reading should work well, 
it update existing tables can be updated with the FTS2.1 module only, 
alleviating the need for a 2nd FTS modules just for updating.

To sum up, I expect these benefits from my "rowid INTEGER PRIMARY KEY" 
suggestion:

Reading: Fully upward and backward compatible. Not at all with FTS3.

Writing: Upward compatible. Not with FTS3.

Updating: Possible within the same FTS2 module. Requires extra FTS3 module 
otherwise.

I have not written any code to test if all this does indeed make sense. Is 
anyone aware of any fallbacks, before I try?

Regards,

Ralf

>On 8/31/07, Ralf Junker <[EMAIL PROTECTED]> wrote:
>> This one just came to my mind:
>>
>>   CREATE TABLE (rowid INTEGER PRIMARY KEY, t TEXT);
>>
>> This promotes "rowid" to a visible column "rowid" which does not change 
>> during a VACUUM. "rowid" is already a reserved word in SQLite. Maybe this 
>> option is even compatible to FTS2?
>>
>> Ralf
>>
>> >ext/fts3.c in the current code fixes the fts2-vs-vacuum problem by
>> >adding "docid INTEGER PRIMARY KEY" to the %_content table.  This
>> >becomes an alias for rowid, and thus causes vacuum to not renumber
>> >rowids.  It is safe to add that column because the other columns in
>> >%_content are constructed such that even the following:
>> >
>> >CREATE VIRTUAL TABLE t USING fts3(docid);
>> >
>> >will work fine.
>> >
>> >I'm considering whether I should take it one step further, and make
>> >docid a reserved column name for fts3 tables.  My rational is that
>> >fts3 rowids are not quite the same as the rowids of regular tables -
>> >in fact, some use-cases would encourage users of fts3 to use rowids in
>> >exactly the way that fts2 was inappropriately using them!
>> >
>> >docid would be a hidden column, like rowid.  That means that you'll
>> >only see the column in SELECT and INSERT statements if you explicitly
>> >reference it.  It would operate WRT rowid exactly as an INTEGER
>> >PRIMARY KEY column would.
>> >
>> >Opinions?
>> >
>> >-scott

-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] rowid versus docid for fts3.

2007-08-31 Thread Ralf Junker

This one just came to my mind:

  CREATE TABLE (rowid INTEGER PRIMARY KEY, t TEXT);

This promotes "rowid" to a visible column "rowid" which does not change during 
a VACUUM. "rowid" is already a reserved word in SQLite. Maybe this option is 
even compatible to FTS2?

Ralf

>ext/fts3.c in the current code fixes the fts2-vs-vacuum problem by
>adding "docid INTEGER PRIMARY KEY" to the %_content table.  This
>becomes an alias for rowid, and thus causes vacuum to not renumber
>rowids.  It is safe to add that column because the other columns in
>%_content are constructed such that even the following:
>
>CREATE VIRTUAL TABLE t USING fts3(docid);
>
>will work fine.
>
>I'm considering whether I should take it one step further, and make
>docid a reserved column name for fts3 tables.  My rational is that
>fts3 rowids are not quite the same as the rowids of regular tables -
>in fact, some use-cases would encourage users of fts3 to use rowids in
>exactly the way that fts2 was inappropriately using them!
>
>docid would be a hidden column, like rowid.  That means that you'll
>only see the column in SELECT and INSERT statements if you explicitly
>reference it.  It would operate WRT rowid exactly as an INTEGER
>PRIMARY KEY column would.
>
>Opinions?
>
>-scott


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

RE: [sqlite] Problem loading extension in Delphi (FTS2)

2007-08-16 Thread Ralf Junker

DISQLite3 does not use SQLite.NET.

As I read Sam, he did not say so. He just compared the two to support his 
argument that "loadable extensions are not required to create custom functions 
and having access to source is not required for custom functions either".

Ralf

>I was not aware that DISQLite3 uses SQLite.NET.
>
>--- "Samuel R. Neff" <[EMAIL PROTECTED]> wrote:
>> 
>> loadable extensions are not required to create custom functions, and having
>> access to source is not required for custom functions either.  SQLite.NET
>> provides very clean support for custom functions written in any .NET
>> language and they are loaded automatically by the wrapper from any DLL
>> present in the application--they don't have to be added to the SQLite.NET
>> codebase.
>> 
>> Sam

-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Problem loading extension in Delphi (FTS2)

2007-08-16 Thread Ralf Junker

Hello Zlatko Matic,

>How about Lazarus version of DISQLite3? :)

DISQLite3 is Delphi only at the moment, maybe later! ;-)

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] DISQLite FTS

2007-08-16 Thread Ralf Junker


>Does DISQLite have its own implementation of FTS, so not using FTS2 at
>all?

DISQLite3 uses the original full text search modules, adapted to Delphi. Both 
FTS1 and FTS2 are already compiled in, and can both be used by the same 
application (like SQLiteSpy does).

>Does it use the same mechanism as FTS2 with virtual tables?

Yes.

>And have you compared speed and functionality to FTS2,

Speed is likely to be a little faster than external FTS2, resulting from 
register calling conventions.

DISQLite3's embedded FTS features are identical to external FTS. Just today I 
uploaded a new version which adds a Unicode Pascal tokenizer written in Delphi 
which you can use out of the box or as a basis for your own customized 
tokenizer (see demo).

>I guess what it comes down to is to know options available, however I
>think the FTS2 project is great and hopes that it continues to grow, as
>it can be used on all platforms.

FTS in DISQLite3 is cross-platform database file compatabile, just as the 
entire library. However, if you use custom tokenizers, user functions or 
collation sequences with your Delphi application you need to replicate them on 
other platforms.

>I guess what might be a problem is that I would not be able to use
>DISQLite's FTS implementation in Python or .net for example, or would I?

No, this is not a problem. Database files created by DISQLite3 can be read and 
modified by Python, .net, or any other SQLite3 compatible applications. If in 
doubt, run some test with SQLiteSpy: It uses DISQLite3 as its build-in SQLite3 
implementation.

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Problem loading extension in Delphi (FTS2)

2007-08-16 Thread Ralf Junker

Hello Joe Wilson,

>Your product is not useful to a few users like me who require custom 
>sqlite functions for their databases.

I am not sure I understand currectly. Only loadable extensions are currently 
omited from DISQLite3.

sqlite3_create_function() is very well available in DISQLite3 Pro to create 
custom SQL functions. DISQLite3 also includes units with ready-made function 
extensions:

* REGEXP regular expression support provided by DIRegEx in DISQLite3RegExp.pas.

* Mathematical utility functions [(acos(), asin(), atan(), atan(), atan2(), 
ceil(), ceiling(), cos(), cot(), degrees(), exp(), floor(), ln(), log(), log(), 
log2(), log10(), mod(), pi(), pow(), radians(), sign(), sin(), sqrt(), tan(), 
truncate()] in DISQLite3Functions.pas.

Ralf  


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Problem loading extension in Delphi (FTS2)

2007-08-15 Thread Ralf Junker

Hello Joe Wilson,

>Does it support external sqlite loadable extensions?

Loadable extensions are currently omitted. FTS1 and FTS2 extensions are 
provided as built-in modules. User-aware collations sequences using the Windows 
sorting functions are provided in place of the ICU extension. Full 
functionality is therefore available.

>The register calling convention may be a problem there.

It can remapped so cdecl extensions could be used.

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Problem loading extension in Delphi (FTS2)

2007-08-14 Thread Ralf Junker

Hello Roberto,

>Might be a typo, but your declaration defines the calling convention as 
>'stdcall' (instead of 'cdecl'), was your sqlite3.dll custom built with this 
>calling convention? I don't think much of sqlite would work with stdcall. 

DISQLite3 intentionally uses the 'register' calling convention, internally and 
externally.

Delphi prefers 'register' over 'stdcall' or 'cdecl' as the most efficient, 
since it usually avoids creation of a stack frame. This results in a measurable 
performance improvement compared to sqlite3.dll.

>The problem with DISQLite3 is that it is not free and the sources for the 
>component is not available.

DISQLite3 Personal is free for non-commercial use. Source code is available 
after registering DISQLite3 Pro. Both editions benefit from 'register' calling 
conventions, include full text search (FTS1 and FTS2) as well as ample Delphi 
additions like class wrappers and convenience functions.

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Problem loading extension in Delphi (FTS2)

2007-08-10 Thread Ralf Junker

Hello Henrik Ræder,

>I'm trying to load the FTS2 extension in Delphi, using the Aducom
>components. Am really close, but still stuck, and thinking it's a problem
>with the parameter to sqlite3_enable_load_extension().

DISQLite3 (http://www.yunqa.de/delphi/sqlite3/) does not expose such problems. 

Look at the full text search demo project which incorporates both FTS1 and FTS2 
into a single *.exe application, with _no_ DLLs or external files needed.

The new customizable tokenizer interface will be demonstrated by a 
Unicode-aware Pascal tokenizer with the upcoming version of DISQLite3 as soon 
as the FTS vacuum fix is official released.

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] autoincrement and fts2?

2007-07-17 Thread Ralf Junker


>>The standard way to have non-TEXT information associated with rows in
>>an fts table would be a separate table which joins with the fts table
>>on rowid. 
>
>I have not tested this, but if the FTS2 rowid is the standard SQLite rowid, I 
>believe that it will be affected by VACUUM change of rowids recently reported 
>on this list? If so, could this be fixed?

VACUUM does modify FTS2 rowids. Here is the test:

  drop table if exists a;
  
  create virtual table a using fts2 (t);
  
  insert into a (t) values ('one');
  insert into a (t) values ('two');
  insert into a (t) values ('three');
  
  select rowid, * from a;
  
  delete from a where t = 'two';
  vacuum;
  
  select rowid, * from a;

Unfortunately there is no workaround since table a is auto-generated by the 
FTS2 module. Created ticket #2510. 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] autoincrement and fts2?

2007-07-17 Thread Ralf Junker


>The rowid is the standard SQLite rowid, so it does provide an INTEGER
>PRIMARY KEY AUTOINCREMENT column.
>
>The standard way to have non-TEXT information associated with rows in
>an fts table would be a separate table which joins with the fts table
>on rowid. 

I have not tested this, but if the FTS2 rowid is the standard SQLite rowid, I 
believe that it will be affected by VACUUM change of rowids recently reported 
on this list? If so, could this be fixed?

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] [Delphi] Escaping quote?

2007-06-27 Thread Ralf Junker


>Question, does the %q operator offer any advantages over calling QuotedStr ?

Yes: The %q operator just duplicates internal quotes, it does insert quotes at 
the beginning and the end of the string like QuotedStr does. You can can still 
use sqlite3_mprintf's %Q operator for that.

Ralf  


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] [Delphi] Escaping quote?

2007-06-26 Thread Ralf Junker


>I'm having a problem saving strings into a colum from a Delphi application 
>because they might contain the ( ' ) single quote character:
>
>Is there a function I should call either in SQLite or Delphi before running 
>the SQL query?

Why don't you use the '%q' operator of SQLite's sqlite3_mprintf? You can use 
sqlite3_mprintf with Delphi 6 or later. It is interfaced as a varargs function 
in DISQLite3, for example (http://www.yunqa.de/delphi/sqlite3/). The varargs 
directive allows to pass a variable number of arguments to sqlite3_mprintf, 
similar to Delphi's array of const declaration.

Here is a Delphi example:

//--

program SQLite3_printf;

{$APPTYPE CONSOLE}

uses
  DISQLite3Api;

var
  Input: PAnsiChar;
begin
  Input := 'Let''s meet at the pub tonight!';

  WriteLn('sqlite3_mprintf:');
  WriteLn(sqlite3_mprintf('insert into stuff (title) values (''%q'')', Input));
  WriteLn;

  WriteLn;
  WriteLn('Done - Press ENTER to Exit');
  ReadLn;
end.

//--

This is the relevant section from the sqlite3_mprintf C documentation:

The %q option works like %s in that it substitutes a null-terminated string 
from the argument list. But %q also doubles every '\'' character. %q is 
designed for use inside a string literal. By doubling each '\'' character it 
escapes that character and allows it to be inserted into the string.

For example, so some string variable contains text as follows:

  char *zText = "It's a happy day!";

One can use this text in an SQL statement as follows:

  char *zSQL = sqlite3_mprintf("INSERT INTO table VALUES('%q')", zText);
  sqlite3_exec(db, zSQL, 0, 0, 0);
  sqlite3_free(zSQL);

Because the %q format string is used, the '\'' character in zText is escaped 
and the SQL generated is as follows:

  INSERT INTO table1 VALUES('It''s a happy day!');  


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

[sqlite] Ticket #2415

2007-06-15 Thread Ralf Junker

Hello Developers,

I notice that you are about to release version 3.4.0 shortly. I have just 
created ticked #2415 which I believe breaks existing code. Are there any chance 
that you find the time to look at it prior to 3.4.0?

Ralf


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Japanese-Korean characters

2007-04-26 Thread Ralf Junker

Hello Ralf Junker,

>>Thanks for the link.  My OS is Linux.  Is there something available on these
>>lines for linux ?
>
>SQLiteSpy apparently runs on Linux with the help of wine, but I have not 
>tested this myself: http://www.winehq.com/

Here is a small report on this:

  http://appdb.winehq.org/appview.php?iAppId=2672

Ralf  


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Japanese-Korean characters

2007-04-26 Thread Ralf Junker

Hello Pavan,

>Thanks for the link.  My OS is Linux.  Is there something available on these
>lines for linux ?

SQLiteSpy apparently runs on Linux with the help of wine, but I have not tested 
this myself: http://www.winehq.com/

Ralf 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] Japanese-Korean characters

2007-04-26 Thread Ralf Junker

Hello Pavan,

>Can we store/retrieve Japanese/korean characters in sqlite db ?

Yes, you can well do so, as others have already pointed out.

If you are also looking for a Unicode GUI SQLite database manager to display 
and edit Japanese / Korean character databases, you might want to have a look 
at SQLiteSpy, freeware from http://www.yunqa.de/delphi/sqlitespy/ .

SQLiteSpy is designed to support any language when run on a Windows NT 4 and 
later operating system (Win2K, WinXp, Vista, etc.). I have received positive 
reports that it works well with German, French, Eastern European, Greek and 
Cyrillic characters. Given that your system fonts supports Japanese and Koean 
characters, they should work just as well. If you are experiencing problems, 
please contact me via e-mail and I will see what I can do.

Ralf  


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] FTS: index only, no text storage - Was: [sqlite] FTS: Custom Tokenizer / Stop Words

2007-03-14 Thread Ralf Junker

Scott Hess wrote:

>>I am optimistic that the proper implementation will use even less than 50%:
>
>Indeed :-). 

Glad to read this ;-)

>>I found that _not_ adding the original text turned out to be a great time
>>saver. This makes sense if we know that the original text is about 4 times
>>the size of the index. Storing lots of text by itself is already quite time
>>consuming even without creating a FTS index. So I do not expect really
>>bad slow downs by adding a docid->term index.
>
>Are you doing your inserts in the implied transactions sqlite provides
>for you if you didn't open an explicit transaction?  I'm found that
>when doing bulk inserts, the maintenance of the content table is a
>pretty small part of the overall time, perhaps 10%.

My timings vary: I have just measured the insertion speeds with and without 
storing the original text and was _very_ surprised by the results:

WITHtext storage: 1055 KB / sec
WITHOUT text storage: 4948 KB / sec

FTS without text storage performed almost 5 (five!) times faster than with text 
storage (running WinXP on a fairly recent system with a 5200 rotations per sec 
hard drive).

The testing scenario: There were no changes to the code except that I commented 
out the text bindings as described in my earlier message. The same documents 
were indexed (10739 files, 239959 KB size in total). Insertion took place in a 
single transaction, SYNCHRONOUS = OFF was used as the only tweak to the 
database. I ran all tests multiple times consecutively on an empty database to 
avoid OS file buffering interferences.

>>Snippets are of course nice to have out of the box as it is right now. But
>>even without storing the original text, snippets could be created by
>>
>>1. supplying the text through other means (additional parameter or
>>callback function), so that not FTS but the application would read
>>it from a disk file or decompress it from a database field.
>>
>>2. constructing token-only snippets from the document tokens and
>>offsets. This would of course exclude all non-word characters, but
>>would still return legible information.
>
>A use-case that was considered was indexing PDF data, in which case
>the per-document tokenization cost would probably be a couple seconds.
>If you ran a query which matched a couple thousand documents and
>proceeded to re-tokenize them for snippet generation, you'd be in deep
>trouble.  This is somewhat addressable by providing scoring mechanisms
>and using subselects (basically, have the subselect order by score,
>then cap the number of results, and have the main select ask for
>snippets).  A variant on that would be an index of a CD.  In that case
>it's pretty much essential that the index be able to efficiently
>answer questions without having to seek all over the disk.

Quite true.  But is this indeed a realistic scenario? It sounds a bit like the 
"select * from my-million-row-table" problem. Nothing wrong with this per se, 
but be aware of the consequences.

>Option 2 has some attraction, though, because you have the option of
>transparently segmenting the document into blocks and thus not having
>to re-tokenize the entire document to generate snippets.

Thanks!

>>>Being able to have an index without storing the original data was a
>>>weak goal when fts1 was being developed, but every time we visitted
>>>it, we found that the negatives of that approach were substantial
>>>enough to discourage us for a time.  [The "we" in that sentence means
>>>"me and the various people I run wacky ideas past."]  I'm keeping an
>>>eye out for interesting implementation strategies and the time to
>>>explore them, though.
>>
>>Maybe my arguments could influence the opinion of "we"? I would love
>>to see FTS without text storage, especially since I just lost a project to
>>another FTS product because duplicating data was unfortunately "out
>>of disk space".
>
>Feel free to drop me a description of the types of things you're doing
>out-of-band, maybe something will gel.  No promises!  Most of the
>current use-cases are pretty clear - since the data is already going
>to be in the database, letting fts2 store it is no big deal.  I can
>imagine pretty broad classes of problems which could come up when
>indexing data which is not in the database, so one of the challanges
>is to narrow down which problems are real, and which are figments.

I conclude from your remarks that the offsets() problem is not predominant and 
could be solved even without storing full text in the database. If so, snippets 
could be created as well from those offsets. I realize that this will 
commplicate the FTS2 implementation, so please excuse if I am arguing from a 
user's perspective.

For users, I can see the following benefits in separating FTS index and 
original text:

* Space savings when indexing external documents not stored in the database.

* Possibility to add FTS to text stored in compressed format in the database.

* Possibility to mix FTS text rows with

Re: [sqlite] FTS: index only, no text storage - Was: [sqlite] FTS: Custom Tokenizer / Stop Words

2007-03-13 Thread Ralf Junker

Hello Scott,

I was hoping that you would read my message, many thanks for your reply!

>UPDATE and DELETE need to have the previous document text, because the
>docids are embedded in the index, and there is no docid->term index
>(or, put another way, the previous document text _is_ the docid->term
>index).

This is very understandable given the present design.

>Keeping track of that information would probably double the
>size of the index.

With your estimate, the SQLite full text index (without document storage) would 
still take up only 50% of the documents' size. In my opinion, this is still a 
very good ratio, even if some specialized full text search engines apparently 
get away with less than 30%. I think you have done an enourmous job on FTS2!

I am optimistic that the proper implementation will use even less than 50%: My 
modifications are completely rudimentary and not at all optimized - the column 
to store the document text still exists. The only difference is that it is not 
used - it stores a null value which could be saved. In fact, the entire FTS 
table (the one without the suffixes) would not be needed and cut down storage 
space.

>A thing I've considered doing is to keep deletions
>as a special index to the side,

Would this open the door to "insert only, but no-modify and no-delete" indexes? 
I am sure users would like pay this cost for the benefit of even smaller FTS 
indexes!

>which would allow older data to be
>deleted during segment merges.  Unfortunately, I suspect that this
>would slow things down by introducing another bit of data which needs
>to be considered during merges.

I found that _not_ adding the original text turned out to be a great time 
saver. This makes sense if we know that the original text is about 4 times the 
size of the index. Storing lots of text by itself is already quite time 
consuming even without creating a FTS index. So I do not expect really bad slow 
downs by adding a docid->term index.

>Of course, there's no way the current system could generate snippets
>without the original text, because doclists don't record the set of
>adjacent terms.  That information could be recorded, but it's doubtful
>that doing so would be an improvement on simply storing the original
>text in the first place.  The current system _does_ have everything
>needed to generate the offsets to hits even without the original text,
>so the client application could generate snippets, though the code is
>not currently in place to expose this information.

Snippets are of course nice to have out of the box as it is right now. But even 
without storing the original text, snippets could be created by

1. supplying the text through other means (additional parameter or callback 
function), so that not FTS but the application would read it from a disk file 
or decompress it from a database field.

2. constructing token-only snippets from the document tokens and offsets. This 
would of course exclude all non-word characters, but would still return legible 
information.

>Being able to have an index without storing the original data was a
>weak goal when fts1 was being developed, but every time we visitted
>it, we found that the negatives of that approach were substantial
>enough to discourage us for a time.  [The "we" in that sentence means
>"me and the various people I run wacky ideas past."]  I'm keeping an
>eye out for interesting implementation strategies and the time to
>explore them, though.

Maybe my arguments could influence the opinion of "we"? I would love to see FTS 
without text storage, especially since I just lost a project to another FTS 
product because duplicating data was unfortunately "out of disk space".

All the best and keep up your good work,

Ralf  


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

Re: [sqlite] FTS: index only, no text storage - Was: [sqlite] FTS: Custom Tokenizer / Stop Words

2007-03-13 Thread Ralf Junker

Ion Silvestru wrote:

>Just a question: did you eliminated stop-words in your tests?

No, I did not eliminate any stop-words. The two test runs were equal except for 
the small changes in FTS 2.

My stop words question was not intended for source code but for human language 
texts.

Ralf  


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

< 1 2 3 >

101 - 200 of 272 matches

Mail list logo