Re: [sqlite] Reading a SharePoint file

2019-11-08 Thread Jose Isaias Cabrera

Jose Isaias Cabrera, on Friday, November 8, 2019 03:32 PM, wrote...
>
> Jens Alfke, on Friday, November 8, 2019 03:25 PM, wrote...
>
> > > On Nov 8, 2019, at 12:21 PM, Jose Isaias Cabrera, on
> > >
> > > Yeah, that is what I am doing now.  I was trying to save time
> [clip]
> > SharePoint might support WebDAV, and most OSs support (or used to support) 
> > mounting
> > WebDAV as a filesystem. In that case you could point SQLite at the database 
> > in
> > WebDAV mount. At that point it depends on whether the filesystem handles 
> > file reads
> > by issuing HTTP byte-range requests, or if it just downloads the entire 
> > file to a
> > local cache file first. In the former case it'd be faster than downloading, 
> > in the
> > latter case not.
>
> H, this is a good lead. Thanks.

In case somebody every wants to do this, here is a nice site to look at [1]: I 
got it to work on windows 10, which I didn't think was going to work, using,

NET USE * https://some.site.com/fs/more/folders/etc/

Which connected to the highest available drive (z:\), and then I just open the 
file:

z:\some.db.sqlite3

and BAA! It works.  Thanks again.

josé

[1] https://fuguhub.com/tutorials/mapping_windows_drive.lsp
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLITE_DETERMINISTIC and custom function optimization

2019-11-08 Thread Keith Medcalf

>But this makes me think of the upcoming virtual column feature. If you
>define a virtual table column whose value is equal to
>EXPENSIVE_FUNCTION(), do multiple references to that column in a query
>cause multiple calls to the function, or is it computed only once per
>row?

In the present case this will not help since expensive_function is dependent on 
a paramter and a table column ...

However, at present, a virtual column in a table will be calculated whenever it 
is used AND/OR a record is stored to the table containing the column, 
notwithstanding that the column in not stored.  For a stored virtual column it 
will only be calculated when a record is stored.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume. 



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Reading a SharePoint file

2019-11-08 Thread Jose Isaias Cabrera

Jens Alfke, on Friday, November 8, 2019 03:25 PM, wrote...

> > On Nov 8, 2019, at 12:21 PM, Jose Isaias Cabrera, on
> >
> > Yeah, that is what I am doing now.  I was trying to save time
[clip]
> SharePoint might support WebDAV, and most OSs support (or used to support) 
> mounting
> WebDAV as a filesystem. In that case you could point SQLite at the database in
> WebDAV mount. At that point it depends on whether the filesystem handles file 
> reads
> by issuing HTTP byte-range requests, or if it just downloads the entire file 
> to a
> local cache file first. In the former case it'd be faster than downloading, 
> in the
> latter case not.

H, this is a good lead. Thanks.

josé
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLITE_DETERMINISTIC and custom function optimization

2019-11-08 Thread Keith Medcalf
I meant of course the following -- "result" is really ef in the where clause ...

with t(rowid, ef)
  as (
  select distinct rowid,
  expensive_function(?99, vdata)
from some_table
 ),
 u(rowid, vdist, oid, flags)
  as (
  select some_table.rowid,
 ef as vdist,
 ?1 as oid,
 flags | (case when ef < ?2 then ?3 else 0 end)
from some_table
join t 
  on t.rowid == some_table.rowid
   where (flags & ?3) == 0
 and (oid is null and ef < 0.6)
  or ef < vdist
 )
update some_table
   set (vdist, oid, flags) = (select vdist, oid, flags
from u
   where u.rowid = some_table.rowid)
  where some_table.rowid in (select rowid from u)


example, which materializes u, but requires more than one SQL statement, but 
still only requires you provide parameters to one statement:

create temporary table u(rowid integer primary key, vdist, oid, flags);
insert into temp.u
with t(rowid, ef)
  as (
  select distinct rowid,
  expensive_function(?99, vdata)
from some_table
 ),
 u(rowid, vdist, oid, flags)
  as (
  select t.rowid,
 ef as vdist,
 ?1 as oid,
 flags | (case when ef < ?2 then ?3 else 0 end)
from some_table
join t 
  on t.rowid == some_table.rowid
   where (flags & ?3) == 0
 and (oid is null and ef < 0.6)
  or ef < vdist
 )
select rowid, vdist, oid, flags
  from u;
update some_table
   set (vdist, oid, flags) = (select vdist, oid, flags
from temp.u
   where u.rowid = some_table.rowid)
  where some_table.rowid in (select rowid from temp.u)
drop table temp.u;

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: Keith Medcalf 
>Sent: Friday, 8 November, 2019 13:21
>To: 'SQLite mailing list' 
>Subject: RE: [sqlite] SQLITE_DETERMINISTIC and custom function
>optimization
>
>
>SQLITE_DETERMINISTIC does not mean that the function is only called once
>for each unique set of arguments, only that when called with a unique set
>of arguments that it returns the same result.  This means that if it is a
>constant it can be factored out of being called more than once.  In your
>case, the parameters are not constant so it must be called once per row.
>
>The optimizer is prone to calculating things more often than it needs to,
>and is difficult to force to "materialize" things.  Since your expensive
>function needs to be calculated for every row of the table anyway, it
>would be better to just create a table that has it calculated once, then
>compute the updates table, then perform the update, then get rid of the
>extra tables.  You could do this in one statement if anyone could figure
>out how to get the optimizer to "materialize" table u in the following
>query (table t is materialized by the distinct, which does nothing other
>than force t to be materialized -- u is not so easy).
>
>with t(rowid, ef)
>  as (
>  select distinct rowid,
>  expensive_function(?99, vdata)
>from some_table
> ),
> u(rowid, vdist, oid, flags)
>  as (
>  select some_table.rowid,
> ef as vdist,
> ?1 as oid,
> flags | (case when ef < ?2 then ?3 else 0 end)
>from some_table
>join t
>  on t.rowid == some_table.rowid
>   where (flags & ?3) == 0
> and (oid is null and ef < 0.6)
>  or result < vdist
> )
>update some_table
>   set (vdist, oid, flags) = (select vdist, oid, flags
>from u
>   where u.rowid = some_table.rowid)
>  where some_table.rowid in (select rowid from u)
>
>
>example, which materializes u, but requires more than one SQL statement,
>but still only requires you provide parameters to one statement:
>
>create temporary table u(rowid integer primary key, vdist, oid, flags);
>insert into temp.u
>with t(rowid, ef)
>  as (
>  select distinct rowid,
>  expensive_function(?99, vdata)
>from some_table
> ),
> u(rowid, vdist, oid, flags)
>  as (
>  select t.rowid,
> ef as vdist,
> ?1 as oid,
> flags | (case when ef < ?2 then ?3 else 0 end)
>from some_table
>join t
>  on t.rowid == some_table.rowid
>   where (flags & ?3) == 0
> and (oid is null and ef < 0.6)
>  or result < vdist
> )
>select rowid, vdist, oid, flags
>  from u;
>update some_table
>   set (vdist, oid, flags) = (select vdist, oid, flags
>from temp.u
>   where u.rowid = some_table.rowid)
>  where some_table.rowid in (select rowid from temp.u)
>drop table temp.u;
>

Re: [sqlite] Reading a SharePoint file

2019-11-08 Thread Jens Alfke


> On Nov 8, 2019, at 12:21 PM, Jose Isaias Cabrera  wrote:
> 
> Yeah, that is what I am doing now.  I was trying to save time to just be able 
> to read a few tables and see if I needed to update it, so then, download it 
> and upload it.  But now, I have to download it, and read it, and delete the 
> local one. Thanks.

SharePoint might support WebDAV, and most OSs support (or used to support) 
mounting WebDAV as a filesystem. In that case you could point SQLite at the 
database in WebDAV mount. At that point it depends on whether the filesystem 
handles file reads by issuing HTTP byte-range requests, or if it just downloads 
the entire file to a local cache file first. In the former case it'd be faster 
than downloading, in the latter case not.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLITE_DETERMINISTIC and custom function optimization

2019-11-08 Thread Keith Medcalf

SQLITE_DETERMINISTIC does not mean that the function is only called once for 
each unique set of arguments, only that when called with a unique set of 
arguments that it returns the same result.  This means that if it is a constant 
it can be factored out of being called more than once.  In your case, the 
parameters are not constant so it must be called once per row.

The optimizer is prone to calculating things more often than it needs to, and 
is difficult to force to "materialize" things.  Since your expensive function 
needs to be calculated for every row of the table anyway, it would be better to 
just create a table that has it calculated once, then compute the updates 
table, then perform the update, then get rid of the extra tables.  You could do 
this in one statement if anyone could figure out how to get the optimizer to 
"materialize" table u in the following query (table t is materialized by the 
distinct, which does nothing other than force t to be materialized -- u is not 
so easy).

with t(rowid, ef)
  as (
  select distinct rowid,
  expensive_function(?99, vdata)
from some_table
 ),
 u(rowid, vdist, oid, flags)
  as (
  select some_table.rowid,
 ef as vdist,
 ?1 as oid,
 flags | (case when ef < ?2 then ?3 else 0 end)
from some_table
join t 
  on t.rowid == some_table.rowid
   where (flags & ?3) == 0
 and (oid is null and ef < 0.6)
  or result < vdist
 )
update some_table
   set (vdist, oid, flags) = (select vdist, oid, flags
from u
   where u.rowid = some_table.rowid)
  where some_table.rowid in (select rowid from u)


example, which materializes u, but requires more than one SQL statement, but 
still only requires you provide parameters to one statement:

create temporary table u(rowid integer primary key, vdist, oid, flags);
insert into temp.u
with t(rowid, ef)
  as (
  select distinct rowid,
  expensive_function(?99, vdata)
from some_table
 ),
 u(rowid, vdist, oid, flags)
  as (
  select t.rowid,
 ef as vdist,
 ?1 as oid,
 flags | (case when ef < ?2 then ?3 else 0 end)
from some_table
join t 
  on t.rowid == some_table.rowid
   where (flags & ?3) == 0
 and (oid is null and ef < 0.6)
  or result < vdist
 )
select rowid, vdist, oid, flags
  from u;
update some_table
   set (vdist, oid, flags) = (select vdist, oid, flags
from temp.u
   where u.rowid = some_table.rowid)
  where some_table.rowid in (select rowid from temp.u)
drop table temp.u;

This will calculate expensive_function only once per row in some_table.  It 
assumes and requires that "some_table" is a rowid table.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Mario M. Westphal
>Sent: Friday, 8 November, 2019 12:08
>To: sqlite-users@mailinglists.sqlite.org
>Subject: [sqlite] SQLITE_DETERMINISTIC and custom function optimization
>
>Hi all,
>
>
>
>I have a table with matrices stored as blobs. Each matrix has about 800
>bytes.
>
>This table has between 20,000 and 500,000 rows.
>
>
>
>I use a custom function "EXPENSIVE_FUNCTION" which performs a calculation
>using a matrix supplied via sqlite3_bind_pointer() as ?99 and the matrix
>in
>the vdata column.
>
>I create the function using the SQLITE_DETERMINISTIC flag. My hope was
>that
>the EXPENSIVE_FUNCTION is called only once per row. But that's not the
>case.
>
>
>
>The query looks like this:
>
>
>
>UPDATE some_table
>
>
>
>SET
>
>vdist = EXPENSIVE_FUNCTION(?99,vdata),
>
>oid = ?1,
>
>flags = flags | (CASE WHEN EXPENSIVE_FUNCTION(?99,vdata) < ?2 THEN ?3
>ELSE 0
>END)
>
>
>
>WHERE
>
>(flags & ?3) = 0  AND
>
>(oid IS NULL AND EXPENSIVE_FUNCTION(?99,vdata) < 0.6) OR
>
>(EXPENSIVE_FUNCTION(?99,vdata) < vdist)
>
>
>
>The EXPENSIVE_FUNCTION function is referred multiple times in the update
>statement. But it always returns the same result (for any given row).
>
>
>
>My stats report that SQLite calls EXPENSIVE_FUNCTION 91,806 times for a
>table with 45,775 rows.
>
>256 rows are modified. This takes (only) 3.3 seconds.
>
>
>
>The profiler tells me that sqlite3VdbeExec() spends 47% in
>vdbeMemFromBtreeResize and 36% in EXPENSIVE_FUNCTION.
>
>
>
>Can I change something so SQLite calls EXPENSIVE_FUNCTION only once per
>row?
>
>
>
>Thanks in advance.
>
>
>
>-- Mario
>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org

Re: [sqlite] Reading a SharePoint file

2019-11-08 Thread Jose Isaias Cabrera

Jens Alfke, on Friday, November 8, 2019 03:16 PM, wrote...
>
> > On Nov 8, 2019, at 11:57 AM, Jose Isaias Cabrera, on
> >
> > Is there any way that SQLite can read a file on a Sharepoint site?
>
> Download the file over HTTP and then open the local file with SQLite, is the 
> obvious answer.

Yeah, that is what I am doing now.  I was trying to save time to just be able 
to read a few tables and see if I needed to update it, so then, download it and 
upload it.  But now, I have to download it, and read it, and delete the local 
one.  Thanks.

josé
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Reading a SharePoint file

2019-11-08 Thread Jose Isaias Cabrera


Simon Slavin, on Friday, November 8, 2019 03:15 PM, wrote...
>
> On 8 Nov 2019, at 7:57pm, Jose Isaias Cabrera, on
>
> > Is there any way that SQLite can read a file on a Sharepoint site?
>
> Not without downloading it to a local drive first.

Thanks, Simon.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Reading a SharePoint file

2019-11-08 Thread Jens Alfke


> On Nov 8, 2019, at 11:57 AM, Jose Isaias Cabrera  wrote:
> 
> Is there any way that SQLite can read a file on a Sharepoint site? ie 
> https://some.site.com/sites/ROC/Docs%20comp/Shared.Projects_DB.sqlite3
>  
> 
>  Thanks.

Download the file over HTTP and then open the local file with SQLite, is the 
obvious answer.

If you really wanted to you could write custom code to download from HTTP into 
memory and then open it as an in-memory database, but that doesn't seem to have 
any real advantages.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Reading a SharePoint file

2019-11-08 Thread Simon Slavin
On 8 Nov 2019, at 7:57pm, Jose Isaias Cabrera  wrote:

> Is there any way that SQLite can read a file on a Sharepoint site?

Not without downloading it to a local drive first.

You could write a Virtual File System for SQLite which handled files on a 
Sharepoint site.  But it would be difficult and as far as I know nobody has 
done it yet.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLITE_DETERMINISTIC and custom function optimization

2019-11-08 Thread Jens Alfke


> On Nov 8, 2019, at 11:08 AM, Mario M. Westphal  wrote:
> 
> The EXPENSIVE_FUNCTION function is referred multiple times in the update
> statement. But it always returns the same result (for any given row).

There was a similar thread (that I started, I think) from two years ago with 
subject "Common subexpression optimization of deterministic functions". It's 
worth a read.

The takeaway is that SQLite's query optimizer does not remove redundant calls 
to a deterministic function. IIRC, there weren't any good workarounds for this.

… But this makes me think of the upcoming virtual column feature. If you define 
a virtual table column whose value is equal to EXPENSIVE_FUNCTION(), do 
multiple references to that column in a query cause multiple calls to the 
function, or is it computed only once per row?

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Reading a SharePoint file

2019-11-08 Thread Jose Isaias Cabrera

Is there any way that SQLite can read a file on a Sharepoint site? ie 
https://some.site.com/sites/ROC/Docs%20comp/Shared.Projects_DB.sqlite3
 Thanks.

josé


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite with branching

2019-11-08 Thread Bernardo Ramos


I included WAL mode and mmap on the LiteTree simple benchmark.

It turns out that WAL mode is as fast as LiteTree on Linux (with a hard 
disk) for writes and a little slower on reads.


On MacBook Pro (with SSD) LiteTree is faster on both writing and 
reading.


SQLite's mmap make it slightly faster than just with WAL. It is faster 
than LiteTree on reads (no page data copy on both cases). But this 
depends on the benchmark code. Sometimes it is slower than using just 
WAL mode (apparently with small dbs).


Sometimes mmap is way faster than all others (on a virtual machine: 
Windows hosting Linux).


But honestly, I do not know the reason of these differences.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLITE_DETERMINISTIC and custom function optimization

2019-11-08 Thread Simon Slavin
On 8 Nov 2019, at 7:08pm, Mario M. Westphal  wrote:

> The query looks like this:

Try using

EXPLAIN QUERY PLAN on your query.



You might be able to see the two places where your function is called.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deterministic random sampling via SELECT

2019-11-08 Thread Merijn Verstraaten
On 7 Nov 2019, at 20:47, Chris Peachment  wrote:
>  1. generate a list of pseudo-random numbers, using a pre-defined
> seed value, over the range 1 .. count(*) of records in table,
> 
>  2. use that list as record id values to select the desired subset
> of the data in the table.
> 
> This would be done in two separate operations, possibly with a
> storage of the generated numbers in a separate table which could
> be used in the query of the main data.

Yeah, this and some of the other ideas were some things I considered as 
fallback ideas if things weren't possible within the query, although it does 
complicate things a bit.

I actually just had another idea: How is the behaviour of window functions 
defined? i.e. if there is an ORDER BY clause are rows added/removed from the 
window in the order of the order by clause, or is the behaviour of 
row_number/rank/dense_rank special cased and only those functions guarantee the 
order?

Because if window functions do follow there own order by, you could easily 
recover a deterministic evaluation order via the window specification.

If not, I guess I'll have to give up and do it the "hard way".

- Merijn
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLITE_DETERMINISTIC and custom function optimization

2019-11-08 Thread Mario M. Westphal
Hi all,

 

I have a table with matrices stored as blobs. Each matrix has about 800
bytes.

This table has between 20,000 and 500,000 rows.

 

I use a custom function "EXPENSIVE_FUNCTION" which performs a calculation
using a matrix supplied via sqlite3_bind_pointer() as ?99 and the matrix in
the vdata column.

I create the function using the SQLITE_DETERMINISTIC flag. My hope was that
the EXPENSIVE_FUNCTION is called only once per row. But that's not the case.

 

The query looks like this:

 

UPDATE some_table 

 

SET 

vdist = EXPENSIVE_FUNCTION(?99,vdata), 

oid = ?1, 

flags = flags | (CASE WHEN EXPENSIVE_FUNCTION(?99,vdata) < ?2 THEN ?3 ELSE 0
END)

 

WHERE

(flags & ?3) = 0  AND

(oid IS NULL AND EXPENSIVE_FUNCTION(?99,vdata) < 0.6) OR 

(EXPENSIVE_FUNCTION(?99,vdata) < vdist)

 

The EXPENSIVE_FUNCTION function is referred multiple times in the update
statement. But it always returns the same result (for any given row).

 

My stats report that SQLite calls EXPENSIVE_FUNCTION 91,806 times for a
table with 45,775 rows. 

256 rows are modified. This takes (only) 3.3 seconds.

 

The profiler tells me that sqlite3VdbeExec() spends 47% in
vdbeMemFromBtreeResize and 36% in EXPENSIVE_FUNCTION.

 

Can I change something so SQLite calls EXPENSIVE_FUNCTION only once per row?

 

Thanks in advance.

 

-- Mario

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Feature request: (VALUES (1), (2), (3)) AS t(n)

2019-11-08 Thread Ainar Garipov
(Hopefully this works.)

> How about something like:
>
> with t(a, b) as (values (1, 1), (2, 2)) select a, b from t;

Yeah, CTEs are an obvious alternative.  I mostly request this AS t(n)
feature because I have had some otherwise-portable PostgreSQL queries
that I needed to tweak for SQLite.  The thing with PostgreSQL is that
until very recently CTEs would generally perform worse than sub-queries,
so a lot of people avoided it.  My SQLite version used CTEs with no
issues, but still I thought I might as well request the feature, to
improve compatibility.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database disk image is malformed

2019-11-08 Thread Jukka Marin
On Thu, Nov 07, 2019 at 09:26:46AM -0800, Shawn Wagner wrote:
> This line stood out:
> 
> > The main process opens the databases and then forks the other processes
> which can then perform database operations using the already opened
> databases.
> 
> From
> https://sqlite.org/howtocorrupt.html#_carrying_an_open_database_connection_across_a_fork_
> :
> 
> > Do not open an SQLite database connection, then fork(), then try to use
> that database connection in the child process. All kinds of locking
> problems will result and you can easily end up with a corrupt database.
> SQLite is not designed to support that kind of behavior. Any database
> connection that is used in a child process must be opened in the child
> process, not inherited from the parent.
> 
> In this kind of situation, I usually use pthread_atfork() callbacks to
> automate closing databases and then re-opening them in the parent and child.

Okay, thanks!  I suspected it could be something like this, but couldn't
find anything in the SQLite docs.

I'll change my code and see what happens :-)

Thanks again for the quick reply!

  Jukka Marin
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users