Re: [sqlite] Cost/viability of using "layered" views as filters?

2013-10-02 Thread Richard Hipp
On Wed, Oct 2, 2013 at 6:47 PM, Stephan Beal  wrote:

>
> a) is this a gross/unconscionable inefficiency or is this a viable
> strategy, provided the number of filters stays small (say, 2-3)?
>

The query optimizer in SQLite will probably flatten the nested views into a
single simple query.  So it probably doesn't matter.

You can verify this by running EXPLAIN QUERY PLAN.


>
> b) how much is this costing me (abstractly speaking), in terms of search
> performance? Is the search hit cost here linear (which should still be
> okay... except maybe for the core TCL repo) or is it worse?
>

I'm guessing that sqlite3_step() runs at the same speed.
sqlite3_prepare_v2() might take a couple extra microseconds, but not enough
extra time that you could measure it.


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How to use my own python function in a SQLite WHERE clause?

2013-10-02 Thread Bao Niu
I coded a module that has a ChineseDate class, which automatically parse
Chinese dates characters into a python-dateutil object(which actually is a
datetime.datetime object as well).
I used detect_types=sqlite3.PARSE_DECLTYPES to tell the connection object
to store this ChineseDate type directly into one column in my schema. When
inserted into that column, the python ChineseDate object will be
automatically adapted into a complex string, containing the first occuring
date and the interval, etc. Now I want to select all the records with their
MyDate column later than a certain 兔年八月十五, so I built this query:
SELECT * FROM myTable WHERE MyDate > MyModule.ChineseDate("兔年八月十五")
This comparison is legal in the pure Python context because both sides are
a ChineseDate instance. But for some reason I got an
sqlite3.operationalError from running it.
Am I in the wrong path to achieve my goal?
Or more generally, *is this kind of in-place comparison of two native
python objects supported in sqlite3 WHERE clause*?
Or it is not supported at all. WHERE A >= B, A and B must be BOTH dates or
integers or strings ONLY and no other custom types?
Many thanks.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Cost/viability of using "layered" views as filters?

2013-10-02 Thread Stephan Beal
Hi, all,

(My apologies in advance for the length of this post. 90% of it is just
setting the stage for a couple relatively simple questions.)

As part of the Fossil SCM, earlier this evening i did something kind of
strange with sqlite and now i'm looking for feedback on whether or not this
is a generically a valid implementation strategy (it's fine for this
particular context, but i'd like to re-use/expand it a bit if it's not
inherently evil or grossly inefficient)...

In short, we have a table (called 'event') of historical data which holds
everything needed by the so-called timeline view. Anyone who uses Fossil
will know what the 'timeline' is, and everyone else can see examples here:

http://www.fossil-scm.org/fossil/timeline
http://sqlite.org/src/timeline/

We use that same data to generate the /reports pages:

http://www.fossil-scm.org/fossil/reports

The task at hand was the filter this report data based on event type
(checkin, wiki change, ticket change, etc.). Historically speaking, we've
supported such flags by generating different SQL depending on the contents
of the appropriate client-provided flags. This time, however, i decided
that i didn't want to touch the 3 or 4 routines which generate/run the SQL,
and instead went for the proverbial Extra Level of Indirection and shoved a
temporary view between the reporting code and the event table to perform
the filtering. The implementation is rather trivial and, all things
considered, turned out quite nicely (IMO):

http://www.fossil-scm.org/fossil/vdiff?from=e3a2d8f381a1fd59=68e3e0a5dcc5f4ab=1

(Pedantic note: that unused return value will be used in the up-coming step
2 of this change.)

The interesting parts are really the purpose ones, where "event" has been
replaced by "v_reports".

After reading through that once, please take special note of lines
1914-1915. Those are functionally a no-op which is _guaranteed_ to be less
efficient than implementing this "the hard way," but such is the generic
price of having the Extra Level of Indirection. i do not know how _much_ of
a cost that entails, in terms of sqlite internals and the effect on search
performance, but for this case a few extra milliseconds or a few more KB of
RAM isn't a concern (maybe it will be if i run this against the core TCL
repo, which is particularly large... i need to try that out later).

So, now i'm curious...

Let's say i want to add a similar filter for 'user' (we already have this
filter, implemented without a view, but let's just pretend for a
moment that we don't)...

is there a tremendous efficiency penalty if i implement several layers of
filtering based on views, each view filtering out one more part of the
equation from the next view down the chain?

e.g. for the above we have:

a) event table
b) v_reports view which filters event based on event.type

And then, hypothetically:

c) v_reports2 which filters v_reports by v_reports.user.

and so on, and so forth.

Obviously, this doesn't support arbitrary combinations of filters all that
well because each view has to derive from some other view (for this
particular case), but for this use case we're not likely to ever have/need
more than about 3 or 4 filters.

Why do it this way at all? Because (IMO) it's much nicer-looking (in code)
than modifying the SQL for each particular filter case. e.g. compare the
above diff with blocks like this one which have to be repeated across
several different functions:

http://www.fossil-scm.org/fossil/artifact/4a5030799da3ead5fb5f13cd49aebc8860f9e933?ln=2168-2171

So... my questions are, basically:

a) is this a gross/unconscionable inefficiency or is this a viable
strategy, provided the number of filters stays small (say, 2-3)?

b) how much is this costing me (abstractly speaking), in terms of search
performance? Is the search hit cost here linear (which should still be
okay... except maybe for the core TCL repo) or is it worse?

Your insights are much appreciated,

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tcl interface: array not created when ...

2013-10-02 Thread Rob Sciuk



To: sqlite-users@sqlite.org
Subject: [sqlite] Bug : Tcl interface: array not created when call "db 
eval {...} ?array-name? ?script?"


[ The following text is in the "ISO-8859-1" character set. ]
[ Your display is set for the "US-ASCII" character set.  ]
[ Some characters may be displayed incorrectly. ]

Hi All!
I use sqlite3 ver 3.7.15.1 and simple test:

1. package require sqlite3
2. sqlite3 db :memory:
3. db eval {CREATE TABLE x(a,b)}
4. db eval {INSERT INTO x VALUES(1,2)}
5. #set tempvar ""
6. db eval {SELECT a,b FROM x} tempvar {
7.   puts "a = $tempvar(a)"
8. }
9. db close

When line 5 commented this test run ok. If we uncomment line 5 then
tcl interpreter
fail with error "can't read "tempvar(a)": variable isn't array". Looks 
like

sqlite engine don't check/unset variable "array-name" (like most other tcl
commands, thats accept "varName"/"arrayName").


--
dixi.

Dixi,

try replacing "set tempvar """ with:

 array set tempvar {}

Cheers,
Rob.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Hints for the query planner

2013-10-02 Thread Nico Williams
On Wednesday, October 2, 2013, Richard Hipp wrote:
>
> In the current beta (at http://www.sqlite.org/download.html) the function
> is either unlikely(X) or likelihood(X,Y).  In the second form, Y must be a
> floating point constant between 0.0 and 1.0, inclusive.  The first form is
> equivalent to likelihood(X, 0.0625).


 Perfect.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Implementing a new index

2013-10-02 Thread Richard Hipp
On Wed, Oct 2, 2013 at 12:22 AM, Charles Moyes  wrote:

>
> Would you be open to accepting a patch for a new spatial index extension
> (namely: M-tree as described in [1])? This may become a new personal
> project.
>

(1) I'll not make that decision without first seeing the patch.
(2) There is paperwork involved in dedicating the code to the public domain
(3) Test scripts and documentation are expected to be part of the patch.
(4) Estimated long-term maintenance costs will play a big role in the
decision.
(5) The coding style should be the same as the rest of the SQLite code.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Version 3.8.1 beta

2013-10-02 Thread Ryan Johnson

On 02/10/2013 5:13 AM, Joe Mistachkin wrote:

Jan Nijtmans wrote:

Well, I did some digging as well, and it looks like the libtool upgrade
is the coolpit:

See:
Classic. The error message is completely inscrutable to someone who's 
never seen it before, but technically makes sense after you know what 
the problem is. Maybe somebody could nudge the autotools guys to make 
the message a little more helpful?


Ryan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Implementing a new index

2013-10-02 Thread Charles Moyes
Hi all,

Would you be open to accepting a patch for a new spatial index extension
(namely: M-tree as described in [1])? This may become a new personal
project.

Best wishes,
Charles

[1] Ciaccia, Paolo; Patella, Marco; Zezula, Pavel (1997). "M-tree An
Efficient Access Method for Similarity Search in Metric
Spaces".
*Proceedings of the 23rd VLDB Conference Athens, Greece, 1997*. IBM Almaden
Research Center: Very Large Databases Endowment Inc. pp. 426–435. p426.
Retrieved 2010-09-07.

-- 
Charles W. Moyes III
Cornell University Engineering '12
http://www.charlesmoyes.com/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] import failing

2013-10-02 Thread Hoover, Jeffrey
I am trying to import a tab delimited file into a table containing 21 columns.

It fails on this record
Sapurv10037426m 176 RPS-BLAST(BLASTP)   cdd gnl|CDD|239293  
72  159 1   89  33.756.5174 70.792  
cd02995, PDI_a_PDI_a'_C, PDIa family, C-terminal TRX domain (a') subfamily; 
composed of the C-terminal redox active a' domains of PDI, ERp72, ERp57 (or 
ERp60) and EFP1. PDI, ERp72 and ERp57 are endoplasmic reticulum (ER)-resident 
eukaryotic proteins involved in oxidative protein folding. They are oxidases, 
catalyzing the formation of disulfide bonds of newly synthesized polypeptides 
in the ER. They also exhibit reductase activity in acting as isomerases to 
correct any non-native disulfide bonds, as well as chaperone activity to 
prevent protein aggregation and facilitate the folding of newly synthesized 
proteins. PDI and ERp57 have the abb'a' domain structure (where a and a' are 
redox active TRX domains while b and b' are redox inactive TRX-like domains). 
PDI also contains an acidic region (c domain) after the a' domain that is 
absent 
 in ERp57. ERp72 has an additional a domain at the N-terminus (a"aabb'a' domain 
structure). ERp57 interacts with the lectin chaperones, calnexin and 
calreticulin, and specifically promotes the oxidative folding of glycoproteins, 
while PDI shows a wider substrate specificity. ERp72 associates with several ER 
chaperones and folding factors to form complexes in the ER that bind nascent 
proteins. EFP1 is a binding partner protein of thyroid oxidase, which is 
responsible for the generation of hydrogen peroxide, a crucial substrate of 
thyroperoxidase, which functions to iodinate thyroglobulin and synthesize 
thyroid hormones. 0   null104 1e-16


With this message
expected 21 columns of data but found 16

If I change the substring a"aabb'a' to a''aabb'a' (two consecutive 's instead 
of ") the record loads.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bug : Tcl interface: array not created when call "db eval {...} ?array-name? ?script?"

2013-10-02 Thread Konstantin Kushnir
Hi All!
I use sqlite3 ver 3.7.15.1 and simple test:

1. package require sqlite3
2. sqlite3 db :memory:
3. db eval {CREATE TABLE x(a,b)}
4. db eval {INSERT INTO x VALUES(1,2)}
5. #set tempvar ""
6. db eval {SELECT a,b FROM x} tempvar {
7.   puts "a = $tempvar(a)"
8. }
9. db close

When line 5 commented this test run ok. If we uncomment line 5 then
tcl interpreter
fail with error "can't read "tempvar(a)": variable isn't array". Looks like
sqlite engine don't check/unset variable "array-name" (like most other tcl
commands, thats accept "varName"/"arrayName").


-- 
dixi.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Hints for the query planner

2013-10-02 Thread Ryan Johnson

On 02/10/2013 2:19 AM, Baruch Burstein wrote:

On Fri, Sep 13, 2013 at 5:29 AM, Ryan Johnson
wrote:


-- Join cardinality: Bach was a *very* prolific composer whose output
likely dwarfs the (surviving) output of his contemporaries
select p.title, c.name, p.year from composers c join pieces p on p.c_id =
c.id where c.name like '%bach%' and p.year between 1700 and 1750


How would you mark this even using the suggested function syntax? The
likelihood of "c.name like '%bach%' " being true depends on the order the
query optimizer decides to evaluate the 2 predicates in, which in turn
depends on the likelihood of the predicate!
Predicate order doesn't matter; the optimizer will push both down to 
their respective source tables before the join, where they will execute 
independently of each other. The problem would come if the above join 
were input to some outer query and the optimizer needed to reason about 
the cardinality of the join based on the selectivity of the inputs: the 
inputs would be highly filtered but the join would impose unexpectedly 
little additional filtering due to the correlation between famous 
composes named Bach and the half century of interest.


Ryan






___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Hints for the query planner

2013-10-02 Thread Richard Hipp
On Wed, Oct 2, 2013 at 1:40 AM, Nico Williams  wrote:

> On Tuesday, September 10, 2013, Richard Hipp wrote:
>
> > SURVEY QUESTION:
> >
> > The question for today is what to call this magic hint function:
> >
> > (1)  unlikely(EXPR)
> > (2)  selective(EXPR)
> > (3)  seldom(EXPR)
> > (4)  seldom_true(EXPR)
> > (5)  usually_not_true(EXPR)
>
>
>  (1), on account of: it's already used in various programming languages.
>  However, because o your point below i prefer "likely(,
> )", perhaps with the probability being required (can the
> probability be a non-constant expression?).
>

In the current beta (at http://www.sqlite.org/download.html) the function
is either unlikely(X) or likelihood(X,Y).  In the second form, Y must be a
floating point constant between 0.0 and 1.0, inclusive.  The first form is
equivalent to likelihood(X, 0.0625).

Additional links:

(1) http://www.sqlite.org/draft/releaselog/3_8_1.html
(2) http://www.sqlite.org/draft/lang_corefunc.html#unlikely
(3) http://www.sqlite.org/draft/lang_corefunc.html#likelihood

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Version 3.8.1 beta

2013-10-02 Thread Jan Nijtmans
2013/10/2 Joe Mistachkin :
> Thanks a lot for looking into this.

Your' welcome. This patch appears to work (but maybe -no-undefined
should come in through @LDFLAGS@ or $(LTLINK_EXTRAS))

Regards,
   Jan Nijtmans

Index: Makefile.in
==
--- Makefile.in
+++ Makefile.in
@@ -146,11 +146,11 @@
 LIBTOOL = ./libtool
 ALLOWRELEASE = @ALLOWRELEASE@

 # libtool compile/link/install
 LTCOMPILE = $(LIBTOOL) --mode=compile --tag=CC $(TCC) $(LTCOMPILE_EXTRAS)
-LTLINK = $(LIBTOOL) --mode=link $(TCC) $(LTCOMPILE_EXTRAS) @LDFLAGS@
$(LTLINK_EXTRAS)
+LTLINK = $(LIBTOOL) --mode=link $(TCC) $(LTCOMPILE_EXTRAS) @LDFLAGS@
$(LTLINK_EXTRAS) -no-undefined
 LTINSTALL = $(LIBTOOL) --mode=install $(INSTALL)

 # nawk compatible awk.
 NAWK = @AWK@
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Version 3.8.1 beta

2013-10-02 Thread Joe Mistachkin

Jan Nijtmans wrote:
>
> Well, I did some digging as well, and it looks like the libtool upgrade
> is the coolpit:
> 
> See:
> 

Thanks a lot for looking into this.

--
Joe Mistachkin

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Version 3.8.1 beta

2013-10-02 Thread Jan Nijtmans
2013/10/2 Warren Young :
> I imagine I could find it with fossil bisect, but I was hoping that just
> mentioning the problem here would cause someone close to the
> development effort to smack their heads and say "Ah, it must be
>sqlite3_foobie_bletch(), which we just added."

Well, I did some digging as well, and it looks like the libtool upgrade
is the coolpit:

See:

>Historically, the default behaviour of Libtool was as if `-no-undefined' was
>always passed on the command line, but it proved to be annoying to
>developers who had to constantly turn it off so that their ELF libraries could
>be featureful. Now it has to be defined explicitly if you need it.
>
>There are is a tradeoff:
>
>If you don't specify `-no-undefined', then Libtool will not build shared 
>libraries
>on platforms which don't allow undefined symbols at link time for such a 
>library.
>
>It is only safe to specify this flag when you know for certain that all of the
>libraries symbols are defined at link time, otherwise the `-no-undefined'
>link will appear to work until it is tried on a platform which requires all
>symbols to be defined. Libtool will try to link the shared library in this
>case (because you told it that you have not left any undefined symbols),
>but the link will fail, because there are undefined symbols in spite of
>what you told Libtool.

It looks like "-no-undefined" must be added somewhere.

Regards,
   Jan Nijtmans
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Version 3.8.1 beta

2013-10-02 Thread Warren Young
On Oct 1, 2013, at 7:49 PM, Joe Mistachkin  wrote:

> Warren Young wrote:
>> 
>> It can't build a shared library (DLL) on Cygwin any more:
>> 
>> ./libtool --mode=link gcc   -g -O2 -DSQLITE_OS_WIN=1 -I. -I./src 
>> -I./ext/rtree -D_HAVE_SQLITE_CONFIG_H -DBUILD_sqlite -DNDEBUG 
>> -I/usr/include -DSQLITE_THREADSAFE=1 -DSQLITE_OMIT_LOAD_EXTENSION=1 
>> -o libsqlite3.la sqlite3.lo  \
>>  -rpath "/usr/local/lib" -version-info "8:6:8"
>> libtool: link: warning: undefined symbols not allowed in i686-pc-cygwin 
>> shared libraries
>> 
> 
> Are there any further details you could provide that would point us in the
> direction of which symbols are "undefined"?

I tried Googling to see if there was a mode for libtool that would make it tell 
you, but I don't see it.

I imagine I could find it with fossil bisect, but I was hoping that just 
mentioning the problem here would cause someone close to the development effort 
to smack their heads and say "Ah, it must be sqlite3_foobie_bletch(), which we 
just added."

Let me know if enlightenment fails to occur; I will begin a-bisecting.

> It looks like you are compiling the amalgamation.  What were the command
> line options used to generate it?

I checked tip out of the fossil repo, and just did "./configure && make".

I actually started with a much more complex build setup, but pared it back 
piece by piece hoping to find out which of my additions was to blame.  Turned 
out, none of them were.

It's possible I missed backing some change out.  Tomorrow I will try re-opening 
the repo in a fresh directory.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Hints for the query planner

2013-10-02 Thread Baruch Burstein
On Fri, Sep 13, 2013 at 5:29 AM, Ryan Johnson
wrote:

> -- Join cardinality: Bach was a *very* prolific composer whose output
> likely dwarfs the (surviving) output of his contemporaries
> select p.title, c.name, p.year from composers c join pieces p on p.c_id =
> c.id where c.name like '%bach%' and p.year between 1700 and 1750
>

How would you mark this even using the suggested function syntax? The
likelihood of "c.name like '%bach%' " being true depends on the order the
query optimizer decides to evaluate the 2 predicates in, which in turn
depends on the likelihood of the predicate!


-- 
˙uʍop-ǝpısdn sı ɹoʇıuoɯ ɹnoʎ 'sıɥʇ pɐǝɹ uɐɔ noʎ ɟı
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users