Re: [sqlite] Difference in virtual tables logic triggered between versions 3.7.15.2 and 3.8.0

2014-04-05 Thread Max Vlasov
On Sat, Apr 5, 2014 at 11:48 PM, Max Vlasov  wrote:
>
> This works with an old version of sqlite (3.6.10), but today Dominique
> Devienne mentioned some doubt about this approach and I decided to
> test it with some data on a recent version of sqlite.  With 3.8.4.3
> the same join query produced two rows with NULLs for CommaList and
> Value
>


Actually I found a way to make it work as assumed. My xBestIndex call
now returns a huge estimatedCost for the non-constrained plan and a
small one for the constrained plan. Although this works, I'd be glad to
know whether there is a way to return "impossible" instead of merely
"expensive".
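
Roughly, the xBestIndex now looks like this (a simplified sketch with
illustrative names and cost constants, not the exact code):

#include <sqlite3.h>

/* Prefer the constrained plan by making the full scan enormously
** expensive; SQLite has no documented "impossible" cost, so a huge
** estimatedCost stands in for it. */
static int cmlistBestIndex(sqlite3_vtab *pVTab, sqlite3_index_info *pIdxInfo){
  int i;
  (void)pVTab;
  for(i=0; i<pIdxInfo->nConstraint; i++){
    const struct sqlite3_index_constraint *pC = &pIdxInfo->aConstraint[i];
    if( pC->usable && pC->iColumn==0          /* the CommaList column */
     && pC->op==SQLITE_INDEX_CONSTRAINT_EQ ){
      pIdxInfo->aConstraintUsage[i].argvIndex = 1;  /* hand value to xFilter */
      pIdxInfo->aConstraintUsage[i].omit = 0;       /* let SQLite re-check it */
      pIdxInfo->idxNum = 1;                         /* constrained scan */
      pIdxInfo->estimatedCost = 1.0;                /* "small" */
      return SQLITE_OK;
    }
  }
  pIdxInfo->idxNum = 0;              /* unconstrained: cursor yields no rows */
  pIdxInfo->estimatedCost = 1e99;    /* "huge" */
  return SQLITE_OK;
}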

I think I see why the query worked differently in different sqlite
versions. This trick by its nature makes the constrained result (when a
WHERE is supplied) a superset of the non-constrained one (it returns no
rows when there's no WHERE), and this definitely violates the logic
sqlite relies upon. In a (normal) world where a where-constrained result
is always smaller than a full-scan result, sqlite is free to choose a
full scan instead of a filtered scan for a virtual table (which is
probably what it does by default after 3.8.0 for this join query if the
supplied estimatedCosts are equal).

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database to SQLite Population

2014-04-05 Thread Andy Goth

On 4/5/2014 12:33 PM, da...@dandymadeproductions.com wrote:

> The assumption is that the networked database, datasource, could be on
> the local LAN or Internet.


So am I to understand that you are indeed concerned (1) that the pipe
could be slow, and (2) that the server may be heavily loaded?  Alright.


> The 'snapshot' would not necessarily be everything, but based on a
> SELECT statement over a subset of the datasource content.


Okay, that's good.


> The benefit I see from this local file/memory database is that I have
> found some processing of data for analysis occurs over and over to
> derive comparison results. By having the data local, the user can
> perform these analyses without constantly re-querying the production
> database.


That makes sense, though there are costs, but you say...


> It is assumed that the user knows that data can be stale at any point
> beyond the initial load.


Okay, also good that this limitation is known and accepted.


> The analysis tools can also remain unchanged since the data is still
> coming from an RDBMS.


Yes, good point.  This is definitely a massive plus over rolling your
own caching mechanisms.


> The only reason the queue was considered is that it is quite
> conceivable that a network datasource would be the choke point, so a
> queue being filled by several threads in process 1 would speed up the
> population.


I'm trying to understand your statement because at first glance it seems
contradictory.  You assert the data source may be slow, but the solution
you present is normally used when interfacing with a slow data *sink*.

My best guess is that you're considering running multiple source
queries simultaneously to better utilize (monopolize, really) the
network, and this design indeed typically needs a queue to serialize its
output so it can be fed to a sink that accepts only one thing at a time.

I have two answers for you.  Pick whichever you like.  I apologize in
advance for the level of detail in answer #2.  I just want to be clear
so you can make the right decision.  Also I can't help but plug the
Wibble web server (featuring Tcl coroutines) since it's my baby.

Answer 1: Keep it simple.  Have one connection only, and just write
whatever you receive as you get it.  SQLite will not be your bottleneck.
One, it's very fast.  Two, you already say the source is the choke
point.  No need to complicate things.
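
As a rough sketch of that loop using the SQLite C API (fetch_next_row()
is a hypothetical stand-in for whatever your remote client library
provides, and the table and column names are made up):

#include <stddef.h>
#include <sqlite3.h>

/* Hypothetical remote-side helper: yields one row at a time and
** returns 0 when the result set is exhausted. */
extern int fetch_next_row(const char **colA, const char **colB);

int copy_rows(sqlite3 *db){
  sqlite3_stmt *ins;
  const char *a, *b;
  int rc = sqlite3_prepare_v2(db,
      "INSERT INTO local_copy(colA, colB) VALUES(?1, ?2)", -1, &ins, NULL);
  if( rc!=SQLITE_OK ) return rc;
  sqlite3_exec(db, "BEGIN", NULL, NULL, NULL);   /* one big transaction */
  while( fetch_next_row(&a, &b) ){
    sqlite3_bind_text(ins, 1, a, -1, SQLITE_TRANSIENT);
    sqlite3_bind_text(ins, 2, b, -1, SQLITE_TRANSIENT);
    rc = sqlite3_step(ins);                      /* write as you receive */
    sqlite3_reset(ins);
    if( rc!=SQLITE_DONE ) break;
  }
  sqlite3_exec(db, "COMMIT", NULL, NULL, NULL);
  sqlite3_finalize(ins);
  return rc==SQLITE_DONE ? SQLITE_OK : rc;
}

Wrapping the whole load in a single transaction is what keeps SQLite
out of your way; without it, every INSERT pays for its own commit.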

Answer 2: You really do need to have multiple connections at a time, and
you're willing to have a more complex system to support this approach.

Well, SQLite can be used in a thread-safe manner, and it has a
write-ahead log (WAL) mode (not on by default) which lets readers
proceed concurrently with a writer; note that even in WAL mode only one
connection can write at a time.

That doesn't sound so bad, but remember that you're going to have to
write code to manage lots of nonoverlapping subqueries, as well as
handle the various failures that could crop up by going back and
redoing queries.  And you almost certainly will not be able to guarantee
atomicity across all your simultaneous connections, so they could very
well yield incoherent data due to real-time changes to the database.

If you still want to proceed, there's more to think about.

What you seem to describe does not actually require true threads,
because when connected to the Internet (via a typical ISP) you'll only
be able to receive one thing at a time anyway.  You mention MyJSQLView
in your signature, and I assume that is related to Java.  Back in 2004
or so, Java got support for something called NIO, which imitates the
traditional and surprisingly powerful Unix select() mechanism, thereby
facilitating I/O multiplexing within a single thread.  At the time it
wasn't as advanced as Tcl's I/O event mechanism, but with some work I
was able to develop a Tcl-like framework on top of it.  Anyway, if you
were to use this, your process would resemble:

1. Allocate a predetermined number of connection state structures
2. If a connection is closed, open it
3. Use select() or NIO to wait for incoming data or other events
4. On connect succeed, submit the (sub-)query
5. On connection termination, mark the connection structure as closed
6. On writability, if query request data is waiting to be sent, send it
7. On readability, receive and parse data, and load into SQLite
8. Go back to 3 until all expected data is received
9. Terminate when all expected data is received
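
A bare-bones C skeleton of that select() loop (the conn_* helpers are
hypothetical placeholders for the connect/query/parse/INSERT logic;
writability handling from step 6 is omitted for brevity):

#include <sys/select.h>

#define NCONN 4                       /* step 1: fixed pool of connections */

extern int  conn_fd[NCONN];           /* socket fds; -1 when closed */
extern void conn_open(int i);         /* step 2: (re)open connection i */
extern int  conn_on_readable(int i);  /* step 7: read, parse, load SQLite */
extern int  all_data_received(void);  /* steps 8-9: termination test */

void event_loop(void){
  while( !all_data_received() ){
    fd_set rd;
    int i, maxfd = -1;
    FD_ZERO(&rd);
    for(i=0; i<NCONN; i++){
      if( conn_fd[i]<0 ) conn_open(i);      /* step 2 */
      if( conn_fd[i]<0 ) continue;          /* open failed; retry later */
      FD_SET(conn_fd[i], &rd);
      if( conn_fd[i]>maxfd ) maxfd = conn_fd[i];
    }
    /* step 3: block until one of the connections has data */
    if( maxfd<0 || select(maxfd+1, &rd, NULL, NULL, NULL)<0 ) break;
    for(i=0; i<NCONN; i++){
      if( conn_fd[i]>=0 && FD_ISSET(conn_fd[i], &rd) ){
        if( conn_on_readable(i)<0 ) conn_fd[i] = -1;  /* step 5 */
      }
    }
  }
}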

As you can see, there is quite a bit of complexity, which is why I
prefer the Tcl way which is easier to explain:

1 . Open connections asynchronously
2 . Enter event loop
3a. Handler for connect success sends request query;
3b. and registers readability handler
4a. Handler for readability tries to read;
4b. on success, it loads data into SQLite;
4c. on failure, it closes and restarts connection as in step 1
5 . Terminate when all expected data is received

My old NIO wrapper made this paradigm work in Java, though a native
approach may have made it into the official class library in the past
decade.

[sqlite] Difference in virtual tables logic triggered between versions 3.7.15.2 and 3.8.0

2014-04-05 Thread Max Vlasov
I have mentioned several times a technique involving a particular
virtual table implementation that allows interpreting comma-list fields
as tables.

Basically this technique uses a virtual table that virtually contains
all possible data, but is useful only when constrained with a WHERE
clause, working logically similar to the IN operator. So with the
statement...

SELECT value FROM cmlist WHERE commalist='45,56,78,125'

... the internal code receives this WHERE data through xFilter and
outputs only the values coming from this list.
The schema for the virtual table is

CREATE TABLE [xxx] ([CommaList] TEXT, [Value] TEXT)

so it returns the comma list itself for sqlite to double-check (as I
recall, sqlite always re-compares the values rather than relying on the
implementation of the virtual table) and the required value to be used
by the consumer of this query.
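
Schematically, the cursor side works like this (a simplified sketch,
not the exact implementation):

#include <string.h>
#include <sqlite3.h>

typedef struct {
  sqlite3_vtab_cursor base;
  char *zList;           /* copy of the comma list from the WHERE clause */
  const char *zCur;      /* start of the current element */
  int nCur;              /* its length */
} cmlist_cursor;

/* xFilter: receive the comma list (or nothing, if xBestIndex saw no
** usable constraint, in which case the scan yields no rows). */
static int cmlistFilter(sqlite3_vtab_cursor *cur, int idxNum,
                        const char *idxStr, int argc, sqlite3_value **argv){
  cmlist_cursor *p = (cmlist_cursor*)cur;
  const char *zArg = (idxNum && argc>0)
                   ? (const char*)sqlite3_value_text(argv[0]) : 0;
  (void)idxStr;
  sqlite3_free(p->zList);              /* cursor was zeroed in xOpen */
  p->zList = sqlite3_mprintf("%s", zArg ? zArg : "");
  p->zCur = p->zList;
  p->nCur = (int)strcspn(p->zCur, ",");
  return SQLITE_OK;
}

static int cmlistNext(sqlite3_vtab_cursor *cur){
  cmlist_cursor *p = (cmlist_cursor*)cur;
  const char *z = p->zCur + p->nCur;
  if( *z==',' ) z++;                   /* skip the separator */
  p->zCur = z;
  p->nCur = (int)strcspn(z, ",");
  return SQLITE_OK;
}

static int cmlistEof(sqlite3_vtab_cursor *cur){
  cmlist_cursor *p = (cmlist_cursor*)cur;
  return p->nCur==0 && p->zCur[0]==0;  /* walked past the last element */
}

static int cmlistColumn(sqlite3_vtab_cursor *cur, sqlite3_context *ctx, int i){
  cmlist_cursor *p = (cmlist_cursor*)cur;
  if( i==0 ){   /* CommaList: the whole list, for sqlite to re-check */
    sqlite3_result_text(ctx, p->zList, -1, SQLITE_TRANSIENT);
  }else{        /* Value: the current element */
    sqlite3_result_text(ctx, p->zCur, p->nCur, SQLITE_TRANSIENT);
  }
  return SQLITE_OK;
}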

This technique worked successfully even with joins. For joins it works
like this: assume we have a table

CREATE TABLE [t] ([Ids] TEXT)

and it is populated with two rows

"1,5"
"6,7"

In this case the query

SELECT * FROM t left join cmlist on Ids=commalist

works as a multiplier producing 4 rows in total.

Ids    CommaList  Value
"1,5"  "1,5"      "1"
"1,5"  "1,5"      "5"
"6,7"  "6,7"      "6"
"6,7"  "6,7"      "7"

This works with an old version of sqlite (3.6.10), but today Dominique
Devienne mentioned some doubt about this approach and I decided to
test it with some data on a recent version of sqlite.  With 3.8.4.3
the same join query produced two rows with NULLs for CommaList and
Value:

Ids    CommaList  Value
"1,5"  ""         ""
"6,7"  ""         ""

I looked at the data coming into xFilter for several versions and I
noticed that for versions up to 3.7.15.2 the xFilter call had idxNum
equal to 1 and idxStr contained the actual constraint data, while
starting with 3.8.0 the idxNum was 0 and no constraint was passed.  So
my virtual table reported no data, since it always assumes there's no
data without the required WHERE clause info.

What can be wrong here? I see there is possibly some optimization
involved that decides not to constrain the virtual table at all, but I
don't know why. I also understand that without actual code and test
data it's hard to pin down exactly what's wrong, but since the probable
version where the change appeared is known and the queries at least are
explained here, an educated guess may be possible.

Thanks

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database to SQLite Population

2014-04-05 Thread danap
> On 4/4/2014 1:21 PM, da...@dandymadeproductions.com wrote:
>> On working with the MyJSQLView database GUI access tool it has been
>> determined that a local file/memory database would be valuable to
>> perform recurring analysis on datasets from the connected datasource.
>> Sqlite is being considered as the local database.
>
> If I understand you correctly, you're suggesting making a local snapshot
> of a networked database to optimize performance.  I'm not sure what
> remote database you're using, but it seems to me with properly designed
> prepared statements there won't be much gain in downloading everything
> in advance to the local machine, especially since the download will
> certainly include more data than is actually needed.  Additionally,
> consider the loss of coherency when the upstream database is modified
> and the local snapshot becomes stale.

The assumption is that the networked database, datasource, could be on the
local LAN or Internet. The 'snapshot' would not necessarily be everything,
but based on a SELECT statement over a subset of the datasource content. The
application already has a mechanism in place that lets the user store queries
in a bucket for reuse. I guess a similar commercial term for this would
be ETL, but perhaps without the transform. I believe one of the commercial
tools out there, Tableau, uses this exact concept.

The benefit I see from this local file/memory database is that I have found
some processing of data for analysis occurs over and over to derive
comparison results. By having the data local, the user can perform these
analyses without constantly re-querying the production database. The
analysis tools can also remain unchanged since the data is still coming
from an RDBMS. It is assumed that the user knows that data can be stale at
any point beyond the initial load.

>
>> All the underlining code has been created for conversion between
>> datasource and local database. The code is now being developed to
>> perform the DB to DB transfer population. The basic algorithm being
>> considered is:
>>
>> Process 1, Loop: 1,2,3
>> 1. Read Datasource row from query.
>> 2. Create SQL INSERT statement.
>> 3. Write SQL INSERT into queue.
>>
>> Process 2, Loop: 4,5
>> 4. Read SQL INSERT from queue.
>> 5. Write SQL INSERT to SQLite db.
>
> The queue seems to be an unnecessary intermediary.  Simply alternate
> between reading from the remote database and writing the received data
> to the SQLite database.  This simpler design is also more amenable to
> prepared statements, which offer indispensable performance and security
> benefits.

Thank you for that input. My general thoughts were along the lines of your
statement about a simpler design. The only reason the queue was considered
is that it is quite conceivable that a network datasource would be the choke
point, so a queue being filled by several threads in process 1 would speed
up the population.

>
> Do keep in mind that an SQL database consists not only of INSERTs (the
> data) but also CREATE TABLEs (the schema) plus indexes and triggers and
> views.

The data is the only interest here, besides indexes for the new table data.
Completed routines have already been created to re-create the structure
of the datasource database.

>
>> Perhaps someone from this forum could possibly comment on another open
>> source project with similar type of db to db transfer that could be
>> studied or alternative algorithm.
>
> It doesn't matter what database you use; the algorithm remains the same:
> read source, write destination, repeat.  Many database systems have
> convenient import routines for common data sources.  For example, the
> SQLite shell has the .import command for loading a file into a table.
>
> --
> Andy Goth | 

These import routines are exactly how I have had to populate a local SQLite
database when testing analysis plugins. The process of setting up the
database table(s), exporting from the datasource, and importing into the
local file/memory database would be much simpler for non-expert users if
automated, so they can focus on deriving results from analysis with a
local, higher performance file/memory database.

Thank you Andy for your comments.

Dana M. Proctor
MyJSQLView Project Manager

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] about the apparently arriving soon "threads"

2014-04-05 Thread Richard Hipp
On Sat, Apr 5, 2014 at 9:16 AM, big stone  wrote:

> Hello,
>
> I see the "threads" branch of fossil has a lot of activity and seems close
> to be finalized.
>

No, it still has a long way to go.


>
> Will it be activated by default on the downloadable executable for
> Windows?
>

Probably not.  It might be possible to activate this feature using a
PRAGMA.  Or, it might require a start-time or compile-time setting.  That
is still all very much in flux.


> Will it apply to parallelisable CTE expressions?
>

No.  Multiple cores will only be used by CREATE INDEX and by large ORDER BY
or GROUP BY statements that cannot be satisfied using indices.

Devoting 4 cores to the task allows large ORDER BY statements to go about
25% faster.


>
> Will it be possible from 1 sqlite.exe command line (or 1 python
> sqlite.execute) to launch several SQL statements in parallel (in separate threads)?
>

No.


>
> Typical workload case (that would be awesome if it could be 4 times
> quicker) =
>
> 
> - a long treatment reading a big table, which could be split into N
> treatments each reading 1/Nth of the records of the big table.
> - the N threads add the resulting records to 1 table.
>
> This typical workload could also be defined as a big CTE :
>with
>   resultset1 as (...),
>   resultset2 as (...),
>   resultset3 as (...),
>   resultset4 as (...)
>   select * from resultset1
>   union all
>   select * from resultset2
>   union all
>   select * from resultset3
>   union all
>   select * from resultset4
>
> Regards,
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] about the apparently arriving soon "threads"

2014-04-05 Thread big stone
Hello,

I see the "threads" branch of fossil has a lot of activity and seems close
to be finalized.

Will it be activated by default on the downloadable executable for Windows?
Will it apply to parallelisable CTE expressions?

Will it be possible from 1 sqlite.exe command line (or 1 python
sqlite.execute) to launch several SQL statements in parallel (in separate threads)?

Typical workload case (that would be awesome if it could be 4 times
quicker) =

- a long treatment reading a big table, which could be split into N
treatments each reading 1/Nth of the records of the big table.
- the N threads add the resulting records to 1 table.

This typical workload could also be defined as a big CTE :
   with
  resultset1 as (...),
  resultset2 as (...),
  resultset3 as (...),
  resultset4 as (...)
  select * from resultset1
  union all
  select * from resultset2
  union all
  select * from resultset3
  union all
  select * from resultset4

Regards,
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Establishing a connection and downloading tables from an SQLite server

2014-04-05 Thread Donald Griggs
Greetings, Armando.

Would a simple script invoking the sqlite3 command-line interface
program do what you want?
  http://sqlite.org/sqlite.html

Armando Gonzalez wrote:

> To whom it may concern:
>
> I am absolutely new to the world of SQLite (and SQL in general, to be
> honest, please don't assume any previous knowledge from me) and well, I
> stumbled upon an SQLite issue I can't seem to resolve on my own.
>
> I have an Asterisk VoIP phone server at my workplace. It was installed by
> an external company, and it provides a variety of services, like reports of
> the phone activity, etc.
>
> It was (and still is) working fine, but one of my colleagues wanted a
> different type of report that the external company wasn't supplying, and
> they wouldn't be able to supply them as he wanted, to be honest.
>
> The external company granted me access to the server's databases, and I
> learnt that they worked with SQLite. The server used a service called
> SQLite Manager to manage the SQLite databases. The manager has an easy
> enough to understand GUI, and I managed to download the table I needed
> manually and produce the wanted reports with software locally installed in
> my computer.
>
> That's all working fine. The thing is that it would be so much better and
> much more efficient if I could automatically download the needed table to
> produce the reports, because that way I could produce the reports much
> faster and get on with other things I have to do.
>
> I have Google'd how to do it with Powershell and well, I don't really know
> what I'm doing. I would really appreciate any assistance/guidance if
> possible.
>
> As a quick recap:
>
> I need to automatically download a table from a SQLite database on a local
> server.
>
> I have access to Visual Studio 2012 and Powershell if any programming is
> necessary.
>
> Anyways, I'm going to leave it at that.
>
> Thank you very much for taking the time to read this and hopefully for
> helping me in the future!
>
> Armando Gonzalez
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Establishing a connection and downloading tables from an SQLite server

2014-04-05 Thread Armando Gonzalez
To whom it may concern:

I am absolutely new to the world of SQLite (and SQL in general, to be
honest, please don't assume any previous knowledge from me) and well, I
stumbled upon an SQLite issue I can't seem to resolve on my own.

I have an Asterisk VoIP phone server at my workplace. It was installed by
an external company, and it provides a variety of services, like reports of
the phone activity, etc.

It was (and still is) working fine, but one of my colleagues wanted a
different type of report that the external company wasn't supplying, and
they wouldn't be able to supply them as he wanted, to be honest.

The external company granted me access to the server's databases, and I
learnt that they worked with SQLite. The server used a service called
SQLite Manager to manage the SQLite databases. The manager has an easy
enough to understand GUI, and I managed to download the table I needed
manually and produce the wanted reports with software locally installed in
my computer.

That's all working fine. The thing is that it would be so much better and
much more efficient if I could automatically download the needed table to
produce the reports, because that way I could produce the reports much
faster and get on with other things I have to do.

I have Google'd how to do it with Powershell and well, I don't really know
what I'm doing. I would really appreciate any assistance/guidance if
possible.

As a quick recap:

I need to automatically download a table from a SQLite database on a local
server.

I have access to Visual Studio 2012 and Powershell if any programming is
necessary.

Anyways, I'm going to leave it at that.

Thank you very much for taking the time to read this and hopefully for
helping me in the future!

Armando Gonzalez
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] R*Tree and foreign key constraints

2014-04-05 Thread Bernd
I found nothing on the SQLite website stating that R*Tree tables don't
support foreign key constraints, but apparently they don't. Is this a
limitation of virtual tables in general, or is there something I missed?


Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Users\Bernd>sqlite3
SQLite version 3.8.4.3 2014-04-03 16:53:12
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE t_geometrie_typ
   ...> (auto_id INTEGER PRIMARY KEY, id TEXT NOT NULL);
sqlite> CREATE VIRTUAL TABLE t_geometrie_index USING RTREE
   ...> (id REFERENCES t_geometrie_typ(auto_id) ON DELETE CASCADE,
   ...> xmin, xmax, ymin, ymax);
sqlite> PRAGMA FOREIGN_KEYS = 1;
sqlite> INSERT INTO t_geometrie_typ
   ...> VALUES (1, 'some_id');
sqlite> INSERT INTO t_geometrie_index
   ...> VALUES (1, 2, 3, 4, 5);
sqlite> SELECT t.id
   ...> FROM t_geometrie_typ t, t_geometrie_index i
   ...> WHERE t.auto_id = i.id AND i.xmin >= 2 AND xmax <= 3
   ...> AND ymin >= 4 AND ymax <= 5;
some_id
sqlite> DELETE FROM t_geometrie_typ
   ...> WHERE id = 'some_id';
sqlite> SELECT * FROM t_geometrie_typ;
sqlite> SELECT * FROM t_geometrie_index;
1|2.0|3.0|4.0|5.0
sqlite> .q

Thanks, Bernd
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] comma-separated string data

2014-04-05 Thread RSmith

CREATE TABLE tmpcsv (
ID INTEGER PRIMARY KEY,
colA TEXT,
colCSV TEXT
);

INSERT INTO tmpcsv (colA, colCSV) VALUES
('foo', '4,66,51,3009,2,678'),
('bar', 'Sputnik,Discovery'),
('baz', '101,I-95,104');


  WITH csvrec(i,l,c,r) AS (
  SELECT tmpcsv.ID, 1, colCSV||',', '' FROM tmpcsv
UNION ALL
  SELECT i,
 instr(c,',') AS vLen,
 substr(c,instr(c,',')+1) AS vRem,
 substr(c,1,instr(c,',')-1) AS vCSV
  FROM csvrec
  WHERE vLen>0
)
  SELECT t.ID, t.colA, rt.r FROM tmpcsv AS t, csvrec AS rt
  WHERE t.ID=rt.i AND rt.r<>''
  ORDER BY t.ID
  LIMIT 100

Results:
ID  colA  r
--  ----  ---------
1   foo   4
1   foo   66
1   foo   51
1   foo   3009
1   foo   2
1   foo   678
2   bar   Sputnik
2   bar   Discovery
3   baz   101
3   baz   I-95
3   baz   104


:)


(PS: The LIMIT is not strictly necessary, but it is advised during testing
to avoid accidental infinite loops.)

On 2014/04/04 23:12, peter korinis wrote:

Thanks, Simon.

The database was created by a developer as part of a larger project
involving a mobile app and portal. The portal code calls the db, so that's
where the 'parsing' is done. But unfortunately SQL alone cannot provide the
data in this case. This portal is in production, so we're not about to
rewrite the code around a db redesign.

Does SQL have any string commands, like REGEXP or something else, to strip
out the commas and transform the string into multiple discrete values, then
use those perhaps in a SQL subquery... or something like that?

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] comma-separated string data

2014-04-05 Thread Dominique Devienne
On Saturday, April 5, 2014, Max Vlasov  wrote:

> On Fri, Apr 4, 2014 at 10:20 PM, peter korinis wrote:
> > A data column in a link table contains comma-separated string data, where
> >
> > How do you 'parse' a table entry like: "4,66,51,3009,2,678, ..." to extract
> > these values and use them in an SQL statement, perhaps a WHERE id='66'?
>
> In similar cases I use my virtual table explained here:
> http://www.mail-archive.com/sqlite-users@sqlite.org/msg63453.html
> Actually the table works more like "function", so only one instance is
> required in the db to apply this trick.
> [...]
> This trick successfully works with joins and everything else.
>

I don't think it works in this case, Max, because your technique relies on
the WHERE clause being a literal, whereas here, if I'm reading between the
lines correctly, the poster wants the equivalent of Oracle's TABLE()
operator.

In this case, a vtable can still help, but one specific to the source
table, with only the source table's PK columns plus the one to "un-nest" /
parse. Basically xNext behaves like a compound iterator, with the outer
iterator scanning the source table (using normal SQL and the SQLite API)
and the inner iterator returning the CSV values one at a time from the
current outer row. That's basically normalizing on the fly. The vtable
should ALSO declare an index on the PK columns to avoid full scans with a
where clause or a join. --DD
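
A sketch of just that compound xNext (illustrative names; the outer scan
is an ordinary prepared SELECT over the source table, and xFilter, not
shown, prepares it and calls xNext once to land on the first row):

#include <string.h>
#include <sqlite3.h>

typedef struct {
  sqlite3_vtab_cursor base;
  sqlite3_stmt *pOuter;   /* SELECT pk, csv_col FROM source_table */
  const char *zElem;      /* current element of the current row's CSV */
  int nElem;              /* its length */
  int eof;
} unnest_cursor;

static int unnestNext(sqlite3_vtab_cursor *cur){
  unnest_cursor *p = (unnest_cursor*)cur;
  const char *z = p->zElem + p->nElem;
  if( *z==',' ){                       /* inner iterator: next CSV element */
    p->zElem = z+1;
    p->nElem = (int)strcspn(p->zElem, ",");
    return SQLITE_OK;
  }
  /* inner exhausted: advance the outer scan to the next source row */
  if( sqlite3_step(p->pOuter)==SQLITE_ROW ){
    p->zElem = (const char*)sqlite3_column_text(p->pOuter, 1);
    if( p->zElem==0 ) p->zElem = "";
    p->nElem = (int)strcspn(p->zElem, ",");
  }else{
    p->eof = 1;
  }
  return SQLITE_OK;
}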
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users