On Sat, Jul 26, 2014 at 3:46 AM, Jiří Nádvorník <nadvornik...@gmail.com>
wrote:



> The reason why I solve the performance issues here is that the table of
> observations currently has about 3e8 rows after 1.5 years of gathering the
> data. The growth is linear.
>

So about 500,000 new records a day.



>     (UPDATE \schema.objcat
>      SET ipix_cat = q3c_ang2ipix(
>              (raj2000 * weight + curr_raj2000) / (weight + 1),
>              (dej2000 * weight + curr_dej2000) / (weight + 1)),
>          raj2000 = (raj2000 * weight + curr_raj2000) / (weight + 1),
>          dej2000 = (dej2000 * weight + curr_dej2000) / (weight + 1),
>          weight = weight + 1
>      WHERE q3c_join(curr_raj2000, curr_dej2000,
>                     raj2000, dej2000, radius)
>      RETURNING *),
>


Doing all of this (above, plus the other parts I snipped) as a single query
seems far too clever.  How can you identify the slow component when you
have them all munged up like that?

Take the select portion of the above (the q3c_join lookup) and run it on a
random smattering of records with 'explain (analyze, buffers)',
periodically while the load process is going on.
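
For example, something along these lines (a sketch only: table and column
names are taken from the quoted query, and the psql variables stand in for
one incoming observation and your match radius):

```sql
-- Profile just the spatial lookup in isolation, outside the big UPDATE.
-- :curr_raj2000, :curr_dej2000, :radius are placeholders for one sample
-- observation's coordinates and the match radius.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM   schema.objcat
WHERE  q3c_join(:curr_raj2000, :curr_dej2000,
                raj2000, dej2000,
                :radius);
```

If the buffer counts grow over time, the working set is falling out of
cache; if the lookup itself stays fast, the cost is elsewhere in the query.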



>
> Results: When I run the query only once (1 client thread), it runs cca 1
> mil rows per hour.
>

Is that 1 million, or 1 thousand?  I'm assuming million, but...


> Which is days for the whole dataset. When I run it in parallel with that
> lock to ensure pessimistic synchronization, it runs sequentially too :)
> the other threads just waiting. When I delete that lock and hope to solve
> the resulting conflicts later, the ssd disk serves up to 4 threads
> relatively effectively – which can divide my days of time by 4 – still
> unacceptable.
>
>

It is processing new records roughly 192 times faster than you are
generating them (4 threads × ~1 million rows/hour × 24 hours ≈ 96 million
rows/day, against ~500,000 new rows/day).  Why is that not acceptable?


>
>
> The reason is quite clear here – I’m trying to write something in one
> cycle of the script to a table – then in the following cycle I need to read
> that information.
>

That explains the concurrency issues, but it is not clear that it is the
reason the performance is not what you desire.  If you first partition your
data into stripes that are a few arc minutes wide, each stripe should not
interact with anything other than itself and its two neighbors.  That
should parallelize nicely.
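
A coarse way to build such stripes (a sketch; the staging table name is
illustrative, and 0.1 degree = 6 arc minutes is just one plausible stripe
height):

```sql
-- Bucket incoming observations into declination stripes ~6 arc minutes
-- tall.  Each worker can then process one stripe_id, consulting only that
-- stripe and its two neighbors for cross-matches.
SELECT floor(curr_dej2000 / 0.1)::int AS stripe_id,
       count(*)                        AS n_obs
FROM   schema.observations   -- illustrative name for the staging table
GROUP  BY 1
ORDER  BY 1;
```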


>
>
> Questions for you:
>
> 1.       The first question is if you can think of a better way how to do
> this and maybe if SQL is even capable of doing such thing – or do I have to
> do it in C? Would rewriting the SQL function to C help?
>

Skillfully hand-crafted C will always be faster than SQL, if you don't
count the time needed to write and debug it.



> 2.       Could I somehow bend the committing during the algorithm for my
> case? Ensure that inside one cycle, the whole relevant part of the
> identifiers table would be kept in memory for faster lookups?
>
Is committing a bottleneck?  It looks like you are doing everything in
large transactional chunks already, so it probably isn't.  If the
identifier table fits in memory, it should stay in memory on its own just
through usage.  If it doesn't fit, there isn't much you can do other than
pre-cluster the data in a coarse-grained way such that only a few parts of
the table (and its index) are active at any one time, so that those active
parts stay in memory.
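
One coarse-grained option (a sketch; the index name is illustrative) is to
physically reorder the catalog along its q3c index, so that rows that are
close on the sky share disk pages:

```sql
-- Physically cluster the catalog by its q3c spatial key, so lookups in
-- one region of the sky touch only a few pages at a time.
CREATE INDEX objcat_q3c_idx
    ON schema.objcat (q3c_ang2ipix(raj2000, dej2000));
CLUSTER schema.objcat USING objcat_q3c_idx;
ANALYZE schema.objcat;
```

Note that CLUSTER is a one-time reorder and takes an exclusive lock, so it
belongs between load batches, not during them.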



> 3.       Why is this so slow? :) It is comparable to the quadratic
> algorithm in terms of speed – only it does not use any memory.
>
>

Use 'explain (analyze, buffers)', preferably with track_io_timing on.
Also use top, strace, gprof, or perf to see where the time is going.
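
For example (session-level setting; depending on your configuration,
track_io_timing may require superuser to change):

```sql
-- Add per-node I/O read/write times to the EXPLAIN output for this
-- session, then profile the suspect query.
SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM schema.objcat;   -- substitute the real lookup here
```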

Cheers,

Jeff
