[sqlite] Index Selection

2016-03-27 Thread Yuriy M. Kaminskiy
Igor Tandetnik  writes:

> On 3/26/2016 6:12 PM, Denis Burke wrote:
>> CREATE INDEX [IndxT1C3]
>> ON [T1](
>>  [C3] COLLATE [NOCASE]);
>> --
>> after doing this:
>> explain query plan
>> select C1 from T1
>> where C3='2016-01-02'
>
> You are requesting a case sensitive comparison; a case insensitive
> index cannot be used to satisfy it.
>
> If you expect SQLite to inspect the string literal character by
> character and prove that case sensitivity won't make a difference,
> then I'm afraid you expect too much.

It  was already discussed a week ago.

No, SQLite need not inspect string - any (even non-BINARY) index
(theoretically) can be used to speed-up `COLLATE BINARY =` comparison;
Queries
   SELECT * FROM T1 WHERE C3 COLLATE BINARY = ?1
and
   SELECT * FROM T1 WHERE (C3 COLLATE NOCASE = ?1) AND (C3 COLLATE BINARY = ?1)
returns exactly same result (with *any* ?1), but (given that COLLATE
NOCASE index exists and "good", but COLLATE BINARY index does not
exists) second query can be much faster; on other hand, if NOCASE index is
"bad" (i.e. there are very many records that [by COLLATE NOCASE rules] are
equal to ?1), or if table is small, such replacement can become
pessimisation, of course (and, reversely, if query optimized decided it
won't use index for whatever reason, it can always replace `(C3 COLLATE
NOCASE = ?1) AND (C3 COLLATE BINARY = ?1)` with `C3 COLLATE BINARY = ?1`).

As SQLite query planner knows which indexes exists and good [by
ANALYZE], it could've automatically replaced first query by second (but
currently it is not).

Obviously, this is NOT a bug, but just missed (non-obvious) optimization
opportunity (and, well, "we want more optimizations" at certain point
conflicts with "we want to keep sqlite lite" goal).

And this optimization is limited to equality (=) only, it won't work
with ORDER BY, <, > or BETWEEN. (Theoretically, it can be also used for
to slightly speed-up GROUP BY or DISTINCT, but that's more complex [and,
again, "complex optimization" and "lite" does not play together very well]).



[sqlite] Searching a table of patterns

2016-03-27 Thread Richard Hipp
On 3/27/16, David Rayna  wrote:
> One thing I have occasionally desired is be able to search a table of
> patterns given a string that might match some rows.
> This is the reverse of searching a table of strings to find ones that
> match a pattern.

CREATE TABLE t1(a INT, pattern TEXT);
INSERT INTO t1(a,pattern) VALUES(123,'x%y');

SELECT a FROM t1 WHERE 'xyzzy' LIKE pattern;

-- 
D. Richard Hipp
drh at sqlite.org


[sqlite] Using a "personal" partial index

2016-03-27 Thread Richard Hipp
On 3/27/16, David Rayna  wrote:
> I got excited when I saw the "indexed by" clause.
> A technique I used years ago with FoxPro & etc was to allow each user to
> create his own personal index of the same data.

That's not what INDEX BY does.

An you cannot really give each user their own private index, because
if user A inserts new data, then the new content would not appear in
user B's index, which user A knows nothing about.


-- 
D. Richard Hipp
drh at sqlite.org


[sqlite] Searching a table of patterns

2016-03-27 Thread David Rayna
One thing I have occasionally desired is be able to search a table of 
patterns given a string that might match some rows.
This is the reverse of searching a table of strings to find ones that 
match a pattern.
I was wondering if anyone else has come across this and if a small 
change to sqlite might make it possible.
In a few cases both at the same time would have been useful (find rows 
where pattern in column A matches value X and value in column B matches 
pattern Y).



[sqlite] Using a "personal" partial index

2016-03-27 Thread David Rayna
I got excited when I saw the "indexed by" clause.
A technique I used years ago with FoxPro & etc was to allow each user to 
create his own personal index of the same data.
FoxPro allowed storing index files separately and on each user's own 
local C: drive.
The index was on a complex computed "quality" value based on multiple 
table fields allowing each user to create his own ranking of the data 
based on his own input terms.

In sqlite, would I give each user their own name for the index which he 
could create, drop and redefine as desired?
How complex can the index value computation be? Would I need to 
implement it in C and access via sqlite source code?
Can select ... indexed by  work as a pre-filtering and sorting of the 
data for each user's own view which could be filtered more if needed?




[sqlite] Database layout in memory

2016-03-27 Thread René Czerny
Sorry for the late reply?
So from the connection object you can find the PCache? Do you have any hints as 
to how I could find the db connection object in memory? Is there some typical 
pattern that I could find?

Best,
Ren?

> On 28 Feb 2016, at 17:09, Keith Medcalf  wrote:
> 
> You would have to find the db (connection) object in memory, and trace that 
> through to the PCache to find all the database pages in memory -- 
> equivalently to what the backup api does when sequentially accessing pages.  
> The PCache must have an in-memory structure pointing to where each page is in 
> memory.  The database data pages themselves will not have that information.
> 
> On Sunday, 28 February, 2016 08:46, Ren? Czerny  > said"
> 
>> Thank you for the quick response, Keith!
>> As I understand it, the SQLite Backup API?s only work with an SQLite
>> object. E.g.: sqlite3_backup_init() needs a pointer to the database to
>> copy from. However, I do not have access to such a pointer, as the only
>> thing I get is a raw binary dump of the main memory containing the
>> database somewhere inside (think forensic dump).
>> I believe the Backup API?s won?t be applicable. :-/
> 
>>> On 28 Feb 2016, at 16:32, Keith Medcalf  wrote:
>>> 
>>> 
>>> Is there something wrong with using the backup api's?
>>> 
 -Original Message-
 From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-
>> users-
 bounces at mailinglists.sqlite.org] On Behalf Of Ren? Czerny
 Sent: Sunday, 28 February, 2016 08:22
 To: SQLite mailing list
 Subject: [sqlite] Database layout in memory
 
 Dear SQLite mailing list,
 
 after not finding anything on Google, I want to ask my question here:
 
 I am currently doing research on how to extract an SQLite inmemory-
 database from the image of a computer?s main memory and store it as a
 database file on disc. My previous attempts however failed, as the
 database is not in one place in memory, but seems to be fragmented. I
>> only
 managed to extract the database file containing the sqlite_master
>> table.
 Here is what I tried:
 
 1. Dump the main memory using LiME [0] on a Debian Wheezy system.
 2. Opened the dump in a hex-editor and searched for patterns that
>> indicate
 an SQLite database. (according to [1])
 3. Extracted the database file starting at the database header and
 retrieving (page-size * page-amount) bytes.
 
 The result did not include the tables? content, but only the schema.
>> The
 content is at a total different offset in the memory dump.
 
 So my questions are: Can you point me to a resource where SQLite in-
>> memory
 database layout is documented or described in a detailed way? Any other
 resources I should check out? Did I miss something? Is there another
>> way?
 
 Please note, that in my scenario I only have the memory dump and in
>> theory
 can?t make use of the live system.
 I am very glad for every input you could give me.
 
 Best regards,
 Ren? Czerny
 
 [0] https://github.com/504ensicsLabs/LiME
 [1] https://www.sqlite.org/fileformat2.html
 ___
 sqlite-users mailing list
 sqlite-users at mailinglists.sqlite.org
 http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>> 
>>> 
>>> 
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users at mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> 
>> ___
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org > mailinglists.sqlite.org>
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users 
>> 
> 
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org  mailinglists.sqlite.org>
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users 
> 


[sqlite] Database layout in memory

2016-03-27 Thread Keith Medcalf

You have a pointer to the db connection when you open the database.
The definition of the structure is in sqliteInt.h
it contains a pointer to the Db struct.
which contains pointers to other structures used to magange the Btree and the 
Pager and the PCache.


> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of Ren? Czerny
> Sent: Sunday, 27 March, 2016 11:45
> To: SQLite mailing list
> Subject: Re: [sqlite] Database layout in memory
> 
> Sorry for the late reply?
> So from the connection object you can find the PCache? Do you have any
> hints as to how I could find the db connection object in memory? Is there
> some typical pattern that I could find?
> 
> Best,
> Ren?
> 
> > On 28 Feb 2016, at 17:09, Keith Medcalf  wrote:
> >
> > You would have to find the db (connection) object in memory, and trace
> that through to the PCache to find all the database pages in memory --
> equivalently to what the backup api does when sequentially accessing
> pages.  The PCache must have an in-memory structure pointing to where each
> page is in memory.  The database data pages themselves will not have that
> information.
> >
> > On Sunday, 28 February, 2016 08:46, Ren? Czerny  > said"
> >
> >> Thank you for the quick response, Keith!
> >> As I understand it, the SQLite Backup API?s only work with an SQLite
> >> object. E.g.: sqlite3_backup_init() needs a pointer to the database to
> >> copy from. However, I do not have access to such a pointer, as the only
> >> thing I get is a raw binary dump of the main memory containing the
> >> database somewhere inside (think forensic dump).
> >> I believe the Backup API?s won?t be applicable. :-/
> >
> >>> On 28 Feb 2016, at 16:32, Keith Medcalf  wrote:
> >>>
> >>>
> >>> Is there something wrong with using the backup api's?
> >>>
>  -Original Message-
>  From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-
> >> users-
>  bounces at mailinglists.sqlite.org] On Behalf Of Ren? Czerny
>  Sent: Sunday, 28 February, 2016 08:22
>  To: SQLite mailing list
>  Subject: [sqlite] Database layout in memory
> 
>  Dear SQLite mailing list,
> 
>  after not finding anything on Google, I want to ask my question here:
> 
>  I am currently doing research on how to extract an SQLite inmemory-
>  database from the image of a computer?s main memory and store it as a
>  database file on disc. My previous attempts however failed, as the
>  database is not in one place in memory, but seems to be fragmented. I
> >> only
>  managed to extract the database file containing the sqlite_master
> >> table.
>  Here is what I tried:
> 
>  1. Dump the main memory using LiME [0] on a Debian Wheezy system.
>  2. Opened the dump in a hex-editor and searched for patterns that
> >> indicate
>  an SQLite database. (according to [1])
>  3. Extracted the database file starting at the database header and
>  retrieving (page-size * page-amount) bytes.
> 
>  The result did not include the tables? content, but only the schema.
> >> The
>  content is at a total different offset in the memory dump.
> 
>  So my questions are: Can you point me to a resource where SQLite in-
> >> memory
>  database layout is documented or described in a detailed way? Any
> other
>  resources I should check out? Did I miss something? Is there another
> >> way?
> 
>  Please note, that in my scenario I only have the memory dump and in
> >> theory
>  can?t make use of the live system.
>  I am very glad for every input you could give me.
> 
>  Best regards,
>  Ren? Czerny
> 
>  [0] https://github.com/504ensicsLabs/LiME
>  [1] https://www.sqlite.org/fileformat2.html
>  ___
>  sqlite-users mailing list
>  sqlite-users at mailinglists.sqlite.org
>  http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >>>
> >>>
> >>>
> >>> ___
> >>> sqlite-users mailing list
> >>> sqlite-users at mailinglists.sqlite.org
> >>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >>
> >> ___
> >> sqlite-users mailing list
> >> sqlite-users at mailinglists.sqlite.org  users at mailinglists.sqlite.org>
> >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> 
> >
> >
> >
> > ___
> > sqlite-users mailing list
> > sqlite-users at mailinglists.sqlite.org  users at mailinglists.sqlite.org>
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> 

[sqlite] sqlite fixed data loading extension

2016-03-27 Thread Don V Nielsen
Well, like I mentioned, I do have the basic functionality that I want in an
existing C# application. Unfortunately it works in the opposite direction
that I was hoping...meaning it implements Sqlite and not Sqlite
implementing it. The functionality would be more generally usable if Sqlite
could implement that logic as an extension.

And maybe I could leverage that experience into developing a window
extension...row_number over( partition by/order by ).  Couple years ago I
was seeking said functionality from Sqlite but none existed.  And now that
I think about it, I probably can't avoid said C# application because of the
lack of windowing. I needed at that time some kind of compiled language
because I could not get the i/o performance I need from a scripting
language.

[it's been a day since I drafted the above email. but now these thoughts
have crossed my mind]

Has anyone else attempted writing a window extension for Sqlite? With
enough people requesting it, what difficulties are involved such that it is
low on a development list? What negatives does a window extension impose
that make it undesirable. What am I in for and what advice can you provide?

I probably can get away with a scripting language controlling my process. A
lot of the effort can be handed off to Sqlite in batch, handling the
queries of 20mm rows.  The final result set that needs to be windowed is
typically 5/10m rows, which should be pretty manageable...performance wise.

dvn - just the ramblings of a simple mind.



On Fri, Mar 25, 2016 at 4:48 PM, James K. Lowden 
wrote:

> On Fri, 25 Mar 2016 06:49:22 -0500
> Don V Nielsen  wrote:
>
> > I have a need for something that can parse and load into sqlite tables
> > fixed length data.
>
> Insert your own separators.
>
> $ cat input
> 12345678910111213141516171819202122232425
>
> Print two 5-byte ranges separated by ", ".
>
> $ awk '{ OFS=", "; print substr($0, 1, 5), substr($0, 6, 5); }'  \
> input
> 12345, 67891
>
> --jkl
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>