Re: [SQL] SQL problem: bank account

Dmitry Tkach Tue, 03 Jun 2003 01:01:14 -0700

I am afraid, this looks even uglier then your second solution, but should work, and be quicker...

-- You need this to avoid having to rescan the whole table for each customerid every time and resort the results create index customer_txstamp_idx on bank_account (customer_id, ts);

select ba.* from bank_account ba where transaction_id = (select transaction_id from bank_account where customer_id = ba.customer_id order by customer_id desc, ts desc limit 1);

Now, note that, if you have lots of different customers in that table, it will still take a while to fetch them all (although, it should still be a lot quicker then half an hour) - in that case, you may consider either getting them one-by-one (by adding ... and customer_id=? to the above query) or using cursors...

I hope, it helps...

Dima

Erik G. Burrows wrote:

It seems to me this is a simple problem, but the solution eludes me.

I have a table:

bank_account (
 transaction_id int not null serial,
 customer_id int not null references customer(id),
 ts timestamp not null default now(),
 amount float not null,
 balance float not null,
 primary key(transaction_id)
)

I need to get the most recent transaction for each customer. I need only
the transaction ID, but the entire row would be best.

I have two solutions, both of which are too slow for use in my
interactive web-based interface:

Solution1: Outer left self join:
SELECT
 ba1.*
FROM
 bank_account ba1
 LEFT OUTER JOIN bank_account ba2
   ON ba1.customer_id = ba2.customer_id AND ba1.ts < ba2.ts
WHERE
 ba2.ts IS NULL;

This query works great on tables of less than a few thousand rows. For
my 300k row table, it takes several hours.


Solution2: max-concat trick
SELECT
 split_part(max(
     extract(EPOCH from ts)::VARCHAR || '    ' ||
     transaction_id::VARCHAR), '    ', 2)::INT
FROM
 bank_account
GROUP BY
 customer_id

This is an ugly and obviously inefficient solution, but it does the job
in about 1/2 hour. Still too long though.

I've been working on this problem for days, and consulting friends. No
elegant, fast solution is presenting itself. As I said, I feel I'm not
seeing the obvious solution in front of my face.

In the mean-time I can use this query to do the job on a per-customer
basis:

select
 *
from
 bank_account
where
 id = <the customer's ID>
 and ts = (select max(ts) from bank_account ba2 where ba2.customer_id =
bank_account.customer_id);


However, doing this for all 40,000 customers is not workable as a manual
process. My last resort is to do it this way to pre-generate the report,
but I'd far rather do it real-time.

Help! My brain hurts!


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Re: [SQL] SQL problem: bank account

Reply via email to