On 9/12/15, Aurel Wisse <aurel.wisse at gmail.com> wrote:
> @Luuk : I tried your query and I cancelled after 9 minutes when it wasn't
> finished.
>
> @Richard : The query isn't done after 15 minutes. I added a small
> modification, reinserting the WITH clause inside the recursive WITH (and
> using dates instead of securities in the remainder of the query) :

Did you create the index on securities(calc_date) first, as shown in
my example.  I should have highlighted that point - the index is
important for performance.

>
> CREATE TEMP TABLE offsets AS
>
>   WITH RECURSIVE dtoff(calc_date, tmw) AS (
>
> WITH dates AS (SELECT DISTINCT calc_date FROM securities)
>
>     SELECT NULL, (SELECT min(calc_date) FROM dates)
> ...
> This is inevitable as "securities" is a table with 6.2 million rows. This
> is the schema:
>
> CREATE TABLE security (calc_date TEXT, security_code INT, mx_full REAL,
> mx_adj REAL, mx_std REAL, fu_full REAL, fu_adj REAL, few_full REAL, few_adj
> REAL);
>
>
> There are about 4000 distinct calc_date in securities. I want to avoid an
> intermediate table with distinct dates. If I were to use an intermediate
> table, the offset would be trivial to create on an ordered version of
> dates, self joined with the JOIN clause d1.rowid+1=d2.rowid.
>
>
> The expected output is (first five rows):
>
> 2000-11-30|2000-12-01
>
> 2000-12-01|2000-12-04
>
> 2000-12-04|2000-12-05
>
> 2000-12-05|2000-12-06
>
> 2000-12-06|2000-12-07
>
>
> I have the impression that nontrivial SELECT queries in result columns are
> quite expensive.
>
>
> The fact of the matter is that the original recursive query, up to this
> point, is the most efficient solution with the non recursive query second.
> Richard: Any chance aggregate recursive queries can be reactivated ?
>
>
> Thanks,
>
>
> Aurel
>
>
>
> On Sat, Sep 12, 2015 at 12:25 PM, Richard Hipp <drh at sqlite.org> wrote:
>
>> On 9/12/15, Aurel Wisse <aurel.wisse at gmail.com> wrote:
>> > On 9/11/15, Richard Hipp <drh at sqlite.org> wrote:
>> >
>> > Here is the example.
>>
>> Have you considered using code like the following instead:
>>
>> -- Create a table "securities(calc_date)" and fill it with lots
>> -- with lots of dates in a crazy order.
>> --
>> CREATE TABLE securities(calc_date);
>> WITH RECURSIVE c(x) AS (VALUES(1) UNION SELECT x+1 FROM c WHERE x<1000)
>> INSERT INTO securities(calc_date)
>>    SELECT datetime(2440587.5 + ((x*123456789012345)%16700000)/1000.0)
>> FROM
>> c;
>> CREATE INDEX securities_calcdate ON securities(calc_date);
>>
>>
>> -- Compute the "offsets(calc_date, tmw)" which has one entry for each
>> -- calc_date in securities and for which tmw is the first non-Sunday date
>> -- in securities that comes after calc_date.
>> --
>> CREATE TABLE offsets AS
>>   WITH RECURSIVE dtoff(calc_date, tmw) AS (
>>     SELECT NULL, (SELECT min(calc_date) FROM securities)
>>     UNION
>>     SELECT dtoff.tmw, (SELECT min(securities.calc_date) FROM securities
>>                         WHERE securities.calc_date>dtoff.tmw
>>                           AND strftime('%w',securities.calc_date)<>'0')
>>       FROM dtoff
>>   )
>>   SELECT * from dtoff WHERE calc_date NOTNULL ORDER BY calc_date;
>> SELECT * FROM offsets;
>>
>>
>> --
>> D. Richard Hipp
>> drh at sqlite.org
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
>
> --
> Aurel Wisse
> (514) 814-5421
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
D. Richard Hipp
drh at sqlite.org

Reply via email to