[sqlite] sqlite window functions extension?

2015-10-16 Thread Don V Nielsen
I'm disappointed.  I killed it after 1.5hrs, and suspected everything you
explained in the other thread.  It is always true.

Que Sera Sera

dvn

On Fri, Oct 16, 2015 at 2:33 PM, Igor Tandetnik  wrote:

> On 10/16/2015 3:23 PM, Don V Nielsen wrote:
>
>>  limit (
>>SELECT net_non_pieces
>>FROM crrt_net_non [b]
>>WHERE [b].zip = zip AND [b].crrt = crrt
>>
>
> Again, I don't think this does what you think it does. Test with at least
> two rows in crrt_net_non, with different values for net_non_pieces, and
> confirm that as many rows are being updated as you've hoped. I suspect you
> may be disappointed.
>
> --
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] sqlite window functions extension?

2015-10-16 Thread Igor Tandetnik
On 10/16/2015 3:23 PM, Don V Nielsen wrote:
>  limit (
>SELECT net_non_pieces
>FROM crrt_net_non [b]
>WHERE [b].zip = zip AND [b].crrt = crrt

Again, I don't think this does what you think it does. Test with at 
least two rows in crrt_net_non, with different values for 
net_non_pieces, and confirm that as many rows are being updated as 
you've hoped. I suspect you may be disappointed.
-- 
Igor Tandetnik



[sqlite] sqlite window functions extension?

2015-10-16 Thread Don V Nielsen
Wrapping up this thread.

What I am going to go with is the following.  I'm using the initial WITH to
reduce as much as possible the number of records to be repeatedly searched:
maybe 25/30%.  The addresses table currently has 4.4mm rows; it will go up
to 11.5mm when this goes to prod.  I'm testing performance right now.
Might not be too good.

begin transaction;
with ver_addresses AS (
  SELECT rowid as id,zip,crrt
  from addresses
  where version_id = '0060'
)
update addresses
set segment = '60'
where
  rowid in (
select id
from ver_addresses [a]
where [a].zip = [addresses].zip and [a].crrt = [addresses].crrt
limit (
  SELECT net_non_pieces
  FROM crrt_net_non [b]
  WHERE [b].zip = zip AND [b].crrt = crrt
)
  );
commit;


On Fri, Oct 16, 2015 at 9:24 AM, Don V Nielsen 
wrote:

> The final solution at this point is Ruby.  I really want to push
> everything I can into Sqlite because *it is so freakin fast!*
> Unfortunately. I I am just not getting it.  Also unfortunately, iterating,
> getting data, and updating data in scripting languages is not efficient.
> I've implemented a transaction for each crrt_net_non iteration, but that is
> still painfully slow.  I will have to be more creative in my data handling
> (move to dbi versus activerecord) and partitioning of transactions
> (committing after every 100,000 updates, for example.)
>
> NetNon.all.each {|nn|
>   ActiveRecord::Base.transaction {
> addrs = Address.
> where(zip:nn.zip,crrt:nn.crrt,version_id:non_sel_ver.ver_id).
> select(:id,:segment).
> take(nn.net_non_pieces)
> addrs.each {|addr| addr.update(segment:'xx') }
>   }
>   pb.inc
> }
>
> In English..get all rows from crrt_net_non: it has the control quantity
> net_non_pieces.  Select from addresses all rows with the same zip and crrt
> values, and matching the version to be updated.  I only need columns id and
> segment, and take only the number of records as calculated in
> net_non_pieces.  Update the segment code.  Loop.
>
> On Thu, Oct 15, 2015 at 9:47 AM, Don V Nielsen 
> wrote:
>
>> I'm surprised that and extension for this type of functionality has not
>> been been developed by someone with the c/c++.  It's seems like a natural
>> fit.  I wish I had the kind of ability & smarts to do it.
>>
>> On Thu, Oct 15, 2015 at 9:04 AM, Igor Tandetnik 
>> wrote:
>>
>>> On 10/15/2015 9:36 AM, Don V Nielsen wrote:
>>>
  limit ifnull( (
select net_non_pieces from crrt_net_non net
where net.zip=zip and net.crrt=crrt

>>>
>>> I suspect this WHERE clause is equivalent to "where net.zip=net.zip and
>>> net.crrt=net.crrt" - that is, always true. The actual limit value comes
>>> from whichever row accidentally happens to be first.
>>>
>>> --
>>> Igor Tandetnik
>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users at mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>
>>
>


[sqlite] sqlite window functions extension?

2015-10-16 Thread Don V Nielsen
The final solution at this point is Ruby.  I really want to push everything
I can into Sqlite because *it is so freakin fast!*  Unfortunately. I I am
just not getting it.  Also unfortunately, iterating, getting data, and
updating data in scripting languages is not efficient.  I've implemented a
transaction for each crrt_net_non iteration, but that is still painfully
slow.  I will have to be more creative in my data handling (move to dbi
versus activerecord) and partitioning of transactions (committing after
every 100,000 updates, for example.)

NetNon.all.each {|nn|
  ActiveRecord::Base.transaction {
addrs = Address.
where(zip:nn.zip,crrt:nn.crrt,version_id:non_sel_ver.ver_id).
select(:id,:segment).
take(nn.net_non_pieces)
addrs.each {|addr| addr.update(segment:'xx') }
  }
  pb.inc
}

In English..get all rows from crrt_net_non: it has the control quantity
net_non_pieces.  Select from addresses all rows with the same zip and crrt
values, and matching the version to be updated.  I only need columns id and
segment, and take only the number of records as calculated in
net_non_pieces.  Update the segment code.  Loop.

On Thu, Oct 15, 2015 at 9:47 AM, Don V Nielsen 
wrote:

> I'm surprised that and extension for this type of functionality has not
> been been developed by someone with the c/c++.  It's seems like a natural
> fit.  I wish I had the kind of ability & smarts to do it.
>
> On Thu, Oct 15, 2015 at 9:04 AM, Igor Tandetnik 
> wrote:
>
>> On 10/15/2015 9:36 AM, Don V Nielsen wrote:
>>
>>>  limit ifnull( (
>>>select net_non_pieces from crrt_net_non net
>>>where net.zip=zip and net.crrt=crrt
>>>
>>
>> I suspect this WHERE clause is equivalent to "where net.zip=net.zip and
>> net.crrt=net.crrt" - that is, always true. The actual limit value comes
>> from whichever row accidentally happens to be first.
>>
>> --
>> Igor Tandetnik
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>


[sqlite] sqlite window functions extension?

2015-10-15 Thread Igor Tandetnik
On 10/15/2015 9:36 AM, Don V Nielsen wrote:
>  limit ifnull( (
>select net_non_pieces from crrt_net_non net
>where net.zip=zip and net.crrt=crrt

I suspect this WHERE clause is equivalent to "where net.zip=net.zip and 
net.crrt=net.crrt" - that is, always true. The actual limit value comes 
from whichever row accidentally happens to be first.
-- 
Igor Tandetnik



[sqlite] sqlite window functions extension?

2015-10-15 Thread Don V Nielsen
I'm surprised that and extension for this type of functionality has not
been been developed by someone with the c/c++.  It's seems like a natural
fit.  I wish I had the kind of ability & smarts to do it.

On Thu, Oct 15, 2015 at 9:04 AM, Igor Tandetnik  wrote:

> On 10/15/2015 9:36 AM, Don V Nielsen wrote:
>
>>  limit ifnull( (
>>select net_non_pieces from crrt_net_non net
>>where net.zip=zip and net.crrt=crrt
>>
>
> I suspect this WHERE clause is equivalent to "where net.zip=net.zip and
> net.crrt=net.crrt" - that is, always true. The actual limit value comes
> from whichever row accidentally happens to be first.
>
> --
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] sqlite window functions extension?

2015-10-15 Thread Don V Nielsen
The correlated query was not accepted in the select statement.  However, I
modified your original UPDATE query, as follows, and it did execute, and
passed the proper quantity of rows for the first zip/crrt combination,
only.  The rows were not updated.  It counted correctly, but did not change
the values.

sql> UPDATE tmp_addresses
  set segment='xx'
  where rowid in (
select x2.rowid from tmp_addresses x2 where x2.zip=zip and x2.crrt=crrt
order by
  CASE
  WHEN x2.version_id = '0060' THEN 0
  WHEN x2.version_id = '0064' THEN 2
  WHEN x2.version_id = '0061' THEN 3
  ELSE 99
  END ASC
limit ifnull( (
  select net_non_pieces from crrt_net_non net
  where net.zip=zip and net.crrt=crrt
), 0)
  )
[2015-10-15 06:54:18] *42 row(s)* affected in 189ms

I have not yet attempted your most recent suggestion.  I will get to that
some time today.



On Wed, Oct 14, 2015 at 9:21 PM, Igor Tandetnik  wrote:

> On 10/14/2015 9:29 PM, Don V Nielsen wrote:
>
>> But I am having problems with the LIMIT statement.  It throws an exception
>> no matter what table alias is used: X or x2.  It says "no such column".
>>
>
> Ah, interesting. LIMIT clause doesn't appear to allow correlated
> subqueries; only self-contained expressions. Here goes that idea.
>
> Something like this should work:
>
> select X.*
> from addresses X join crrt_net_non_pieces using (zip, crrt)
> where net_non_pieces > (
>   select count(*) from addresses x2
>   where X.zip=x2.zip and X.crrt=x2.crrt and
>   (x2.version_id < X.version_id or (x2.version_id = X.version_id and
> x2.rowid < X.rowid))
> );
>
> I took the liberty to simplify the ordering expression, for purposes of
> exposition (it's very long and would need to be repeated four times).
> Replace all occurrences of T.version_id with your CASE clause, calculated
> against table T. Also, I'm breaking ties by rowid; your original problem
> statement is underspecified unless there's a total order on Addresses.
>
> --
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] sqlite window functions extension?

2015-10-15 Thread R.Smith


On 2015-10-14 11:20 PM, Don V Nielsen wrote:
> X has columns zip & crrt, just like crrt_net_non.  These form a composite
> key identifying groups within x.  A value "53001.R501" would be an
> example...53001 being the zip code and R501 being the carrier route.  There
> are 52 rows in X that have the key 53001.R501.  A calculation determined
> that I need 42 rows from that key and saved the result in crrt_net_non, the
> row looking like "53001.R501.52.6.46.42".  What I need is a sql function
> that can iterate over crrt_net_non, then grab the rows from X, "53001.R501"
> being first key, sort them into an internal group sequence, then update a
> code of the first 42 rows of that sorted group, and then doing this until
> crrt_net_non is exhausted.

Hi Don, usually a CTE can be pressed into service to make things work. 
Maybe others know exactly what you mean here but it's a bit Greek to me.

As Igor suggested - perhaps some data to accompany this schema and an 
example result-set (or resulting updated table) from it will make it 
clear what you want to achieve with the query. Feel free to simplify so 
the essence of what is needed remain, but be sure to give an example 
that cannot be achieved by other arbitrary means.




[sqlite] sqlite window functions extension?

2015-10-14 Thread Igor Tandetnik
On 10/14/2015 9:29 PM, Don V Nielsen wrote:
> But I am having problems with the LIMIT statement.  It throws an exception
> no matter what table alias is used: X or x2.  It says "no such column".

Ah, interesting. LIMIT clause doesn't appear to allow correlated 
subqueries; only self-contained expressions. Here goes that idea.

Something like this should work:

select X.*
from addresses X join crrt_net_non_pieces using (zip, crrt)
where net_non_pieces > (
   select count(*) from addresses x2
   where X.zip=x2.zip and X.crrt=x2.crrt and
   (x2.version_id < X.version_id or (x2.version_id = X.version_id and 
x2.rowid < X.rowid))
);

I took the liberty to simplify the ordering expression, for purposes of 
exposition (it's very long and would need to be repeated four times). 
Replace all occurrences of T.version_id with your CASE clause, 
calculated against table T. Also, I'm breaking ties by rowid; your 
original problem statement is underspecified unless there's a total 
order on Addresses.
-- 
Igor Tandetnik



[sqlite] sqlite window functions extension?

2015-10-14 Thread Don V Nielsen
The only reason NUM exists in the result set is so I could use it to limit
the outer most select.

But I am having problems with the LIMIT statement.  It throws an exception
no matter what table alias is used: X or x2.  It says "no such column".

Thanks for your time, by the way.

On Wed, Oct 14, 2015 at 6:08 PM, Igor Tandetnik  wrote:

> On 10/14/2015 6:24 PM, Don V Nielsen wrote:
>
>> zip crrt   version_id   NUM   segment
>> [truncated num 1..38]
>> 53001   R501   0060 39xx
>> 53001   R501   0060 40xx
>> 53001   R501   0060 41xx
>> 53001   R501   0060 42xx
>> [truncated num 1..24]
>> 53001   R502   0060 21xx
>> 53001   R502   0060 22xx
>> 53001   R502   0060 23xx
>> 53001   R502   0060 24xx
>>
>
> select * from addresses X where rowid in (
>   select x2.rowid from addresses x2 where X.zip=x2.zip and X.crrt=x2.crrt
>   order by (CASE ...)
>   limit ifnull( (
>  select net_non_pieces from crrt_net_non net
>  where X.zip=net.zip and X.crrt=net.crrt
>   ), 0)
> );
>
> Do you actually need NUM column in the resultset? That one would be tricky
> to pull off.
> --
> Igor Tandetnik
>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] sqlite window functions extension?

2015-10-14 Thread Igor Tandetnik
On 10/14/2015 6:24 PM, Don V Nielsen wrote:
> zip crrt   version_id   NUM   segment
> [truncated num 1..38]
> 53001   R501   0060 39xx
> 53001   R501   0060 40xx
> 53001   R501   0060 41xx
> 53001   R501   0060 42xx
> [truncated num 1..24]
> 53001   R502   0060 21xx
> 53001   R502   0060 22xx
> 53001   R502   0060 23xx
> 53001   R502   0060 24xx

select * from addresses X where rowid in (
   select x2.rowid from addresses x2 where X.zip=x2.zip and X.crrt=x2.crrt
   order by (CASE ...)
   limit ifnull( (
  select net_non_pieces from crrt_net_non net
  where X.zip=net.zip and X.crrt=net.crrt
   ), 0)
);

Do you actually need NUM column in the resultset? That one would be 
tricky to pull off.
-- 
Igor Tandetnik



[sqlite] sqlite window functions extension?

2015-10-14 Thread Igor Tandetnik
On 10/14/2015 5:20 PM, Don V Nielsen wrote:
> X has columns zip & crrt, just like crrt_net_non.  These form a composite
> key identifying groups within x.  A value "53001.R501" would be an
> example...53001 being the zip code and R501 being the carrier route.  There
> are 52 rows in X that have the key 53001.R501.  A calculation determined
> that I need 42 rows from that key and saved the result in crrt_net_non, the
> row looking like "53001.R501.52.6.46.42".  What I need is a sql function
> that can iterate over crrt_net_non, then grab the rows from X, "53001.R501"
> being first key, sort them into an internal group sequence, then update a
> code of the first 42 rows of that sorted group, and then doing this until
> crrt_net_non is exhausted.

Are you looking for something like this?

UPDATE X set code=whatever where rowid in (
   select x2.rowid from X x2 where X.zip=x2.zip and X.crrt=x2.crrt
   order by someOrder
   limit ifnull( (
  select net_non_pieces from crrt_net_non net
  where X.zip=net.zip and X.crrt=net.crrt
   ), 0)
);

-- 
Igor Tandetnik



[sqlite] sqlite window functions extension?

2015-10-14 Thread Don V Nielsen
Table crrt_net_non_pieces:

zip crrt   pkg_pieces   sel_pieces   non_pieces   net_non_pieces
53001   R501   52   646   42
53001   R502   34   727   24


Addresses (as I have been referring to as X) is going to have too many rows
(86 for this example, 52 rows for 53001.R501 and 34 rows for 53001.R502.
The calculation result says I need 42 of the 52 X rows from 53001.R501 to
be recoded and 24 of the 34 X rows from 53001.R502.

Addresses table:

zip crrt   segment   version_id
53001   R501   060060
53001   R501   060060
53001   R501   060060
53001   R501   060060
53001   R501   060060

In sql server, I would be doing something like this:

  select x.* ,'xx' as segment from ( select -- count(1) a.zip ,a.crrt
,a.version_id ,ROW_NUMBER() OVER ( PARTITION BY a.zip,a.crrt ORDER BY CASE WHEN
a.version_id = '0060' THEN 0 WHEN a.version_id = '0064' THEN 2 WHEN
a.version_id = '0061' THEN 3 ELSE 99 END ASC ) AS NUM from
dbwork..addresses a )x join dbwork..crrt_net_non y on y.zip = x.zip and
y.crrt = x.crrt where x.num <= y.net_non_pieces ;
And the results would come out something like the following:  I truncate
the rows with num values 1..whatever.  Notice NUM is the sequential value
applied to the input (1..52 in the case of R501 and 1..34 in the case of
R502).  I then only kept the number of rows as identified by the
corresponding y.net_non_pieces value.

zip crrt   version_id   NUM   segment
[truncated num 1..38]
53001   R501   0060 39xx
53001   R501   0060 40xx
53001   R501   0060 41xx
53001   R501   0060 42xx
[truncated num 1..24]
53001   R502   0060 21xx
53001   R502   0060 22xx
53001   R502   0060 23xx
53001   R502   0060 24xx


Is this better?  Sorry that I imagined/assumed that you guys just picture
this stuff in your head.  I get that impression reading this list.

dvn


These window functions seem to be some sort of Holy Grail that I just can't
imagine duplicating without it.


On Wed, Oct 14, 2015 at 4:27 PM, R.Smith  wrote:

>
>
> On 2015-10-14 11:20 PM, Don V Nielsen wrote:
>
>> X has columns zip & crrt, just like crrt_net_non.  These form a composite
>> key identifying groups within x.  A value "53001.R501" would be an
>> example...53001 being the zip code and R501 being the carrier route.
>> There
>> are 52 rows in X that have the key 53001.R501.  A calculation determined
>> that I need 42 rows from that key and saved the result in crrt_net_non,
>> the
>> row looking like "53001.R501.52.6.46.42".  What I need is a sql function
>> that can iterate over crrt_net_non, then grab the rows from X,
>> "53001.R501"
>> being first key, sort them into an internal group sequence, then update a
>> code of the first 42 rows of that sorted group, and then doing this until
>> crrt_net_non is exhausted.
>>
>
> Hi Don, usually a CTE can be pressed into service to make things work.
> Maybe others know exactly what you mean here but it's a bit Greek to me.
>
> As Igor suggested - perhaps some data to accompany this schema and an
> example result-set (or resulting updated table) from it will make it clear
> what you want to achieve with the query. Feel free to simplify so the
> essence of what is needed remain, but be sure to give an example that
> cannot be achieved by other arbitrary means.
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] sqlite window functions extension?

2015-10-14 Thread Igor Tandetnik
On 10/14/2015 4:49 PM, Don V Nielsen wrote:
> What am I looking to do?  Using a table X, I've built a summary table,
> calculating a value called net_non_pieces.  net_not_pieces is a qty of rows
> from a group rows that needs to be recoded to a different value.  So what I
> want to do is for each group in X, look up the calculated value in
> crrt_net_non, then grab the first net_non_pieces (sequenced, of course) and
> apply a code to those rows, leaving the remaining rows alone.

It might help if you show the definition of X, a sample of data in it, 
and the desired result of the query over that data. I, for one, have 
difficulty following your description.
-- 
Igor Tandetnik



[sqlite] sqlite window functions extension?

2015-10-14 Thread Don V Nielsen
X has columns zip & crrt, just like crrt_net_non.  These form a composite
key identifying groups within x.  A value "53001.R501" would be an
example...53001 being the zip code and R501 being the carrier route.  There
are 52 rows in X that have the key 53001.R501.  A calculation determined
that I need 42 rows from that key and saved the result in crrt_net_non, the
row looking like "53001.R501.52.6.46.42".  What I need is a sql function
that can iterate over crrt_net_non, then grab the rows from X, "53001.R501"
being first key, sort them into an internal group sequence, then update a
code of the first 42 rows of that sorted group, and then doing this until
crrt_net_non is exhausted.

On Wed, Oct 14, 2015 at 4:03 PM, Igor Tandetnik  wrote:

> On 10/14/2015 4:49 PM, Don V Nielsen wrote:
>
>> What am I looking to do?  Using a table X, I've built a summary table,
>> calculating a value called net_non_pieces.  net_not_pieces is a qty of
>> rows
>> from a group rows that needs to be recoded to a different value.  So what
>> I
>> want to do is for each group in X, look up the calculated value in
>> crrt_net_non, then grab the first net_non_pieces (sequenced, of course)
>> and
>> apply a code to those rows, leaving the remaining rows alone.
>>
>
> It might help if you show the definition of X, a sample of data in it, and
> the desired result of the query over that data. I, for one, have difficulty
> following your description.
> --
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] sqlite window functions extension?

2015-10-14 Thread Don V Nielsen
Is there an extension to sqlite the provide window functions such as
row_number over and partition?  There are plenty of googles seeking said
functionality.

What am I looking to do?  Using a table X, I've built a summary table,
calculating a value called net_non_pieces.  net_not_pieces is a qty of rows
from a group rows that needs to be recoded to a different value.  So what I
want to do is for each group in X, look up the calculated value in
crrt_net_non, then grab the first net_non_pieces (sequenced, of course) and
apply a code to those rows, leaving the remaining rows alone.

I'm not sure how to accomplish such a thing

CREATE TABLE crrt_net_non -- developed from table X
(
zip TEXT,   --composite key
crrt TEXT,  --composite key
pkg_pieces integer,  --total row in group
sel_pieces integer,
non_pieces integer,
net_non_pieces integer  --qty of this group to recode
);

Its a pretty innocent thing to do with window functions.  It is not trivial
to do it without.  I can't even figure out how.

Why am I not dumping Sqlite and going with prostgres or the like?  Because
sqlite is awesome.  It is a portable file, easily created as I need for
each project, that can be moved around or deleted, making it super easy to
maintenance in my production environment.  Love the thing.  But this
problem has me flummuxed.

Thanks for your time and consideration,
dvn