On 03.05.2006 10:53, Bealach Na Bo wrote:

Thank you all for your comments. I think a temporary table is
unavoidable if I want to control the way the query is done here (as
opposed to letting the planner do its job).

The @names array is the result of an earlier DBI query.


So why don't you just combine the earlier query and this query into a single query?

If @names comes from something like "SELECT something as names FROM a lot of tables WHERE a lot of conditions match", merge it with the query you posted earlier:


    SELECT j.job_log_id
    FROM job_log j, node n
    WHERE n.name in ( SELECT something as names
                      FROM a lot of tables
                      WHERE a lot of conditions match )
    AND   n.type = ?
    AND   n.usage = ?
    AND   n.node_id = j.node_id

This seems to be the cleanest and most natural way to me. It even allows the database's query optimizer to do its job better than with two queries that, as far as the optimizer can tell, are unrelated to each other.
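
For what it's worth, a minimal Perl/DBI sketch of the combined query -- assuming $dbh is an open DBD::Pg handle, "some_source" is a hypothetical stand-in for whatever query currently produces @names, and 'B' / 'LIVE' are the bind values from your prep_sql:

my $sth = $dbh->prepare(<<'EOSQL');
    SELECT j.job_log_id
    FROM job_log j, node n
    WHERE n.name IN ( SELECT s.name
                      FROM some_source s )  -- hypothetical stand-in for the
                                            -- query that currently fills @names
    AND   n.type    = ?
    AND   n.usage   = ?
    AND   n.node_id = j.node_id
EOSQL
$sth->execute('B', 'LIVE');
while ( my ($job_log_id) = $sth->fetchrow_array ) {
    # process each job_log_id here
}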

Temporary tables are not needed here (and cause more trouble than they are worth) -- unless @names comes from another database that has no way to link to the database you want to query "now". If, for example, we are talking about two Oracle DBs, even on different machines, even with different versions of Oracle, even with different operating systems, and even on different platforms, there is a way to connect the two databases so that the contents of one appear to be in the other. In that case, you should ask an Oracle expert for more details about "database links". (I know *this* setup from my former job, but I never had to work with it, so I lack some knowledge here.) I think MSSQL has a similar capability. For MySQL and PostgreSQL this should at least not be impossible; I would recommend reading their manuals.
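
For PostgreSQL in particular, the contrib module "dblink" provides something like this: it runs a query on another PostgreSQL database and lets you treat the result as a local row source. A rough sketch only, assuming dblink is installed; the connection string and the remote query are hypothetical placeholders:

    SELECT j.job_log_id
    FROM job_log j, node n
    WHERE n.name IN ( SELECT t.name
                      FROM dblink('dbname=otherdb host=otherhost',
                                  'SELECT name FROM some_remote_table')
                           AS t(name text) )
    AND   n.type = ?
    AND   n.usage = ?
    AND   n.node_id = j.node_id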


I agree, a Perl snippet to generate scalar(@names) placeholders would
probably push the query over the maximum query size, as well as various
other limits. Probably a bad idea.

And iterating over the array, issuing one query per element of @names,
is not something I want to try.

So I'm left with temporary tables.

No. Rethink the way you issue the queries. You would be stuck with temporary tables if @names came from a plain file or a device, but not if @names comes from the same or a similar database.

Alexander


Bealach



On 5/2/06, Alexander Foken <[EMAIL PROTECTED]> wrote:

If @names held only a few records, you could write something like
    select ... where name in (?,?,?,?,?)
using some Perl code to generate the placeholders ( join(',', ('?') x
scalar @names) ). But with 100,000 records, I think this would bring
DBI, DBD::Pg and the Pg SQL parser to their limits. Usually, I strongly
oppose using temporary tables. But in your case, I would create a
temporary table (using some unique identifier, like hostname + PID +
random number, as the table name), insert @names into that table, and
then let Pg do the job:

    create table temptable2744143 ....;

    insert into temptable2744143  (name) values (?);
       (for each record in @names)

     SELECT j.job_log_id
     FROM job_log j, node n , temptable2744143 t
     WHERE n.name = t.name
     AND   n.type = ?
     AND   n.usage = ?
     AND   n.node_id = j.node_id;

     drop table temptable2744143;
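
Driven from Perl/DBI, the whole round trip might look roughly like this -- a sketch only: it assumes $dbh is a connected DBD::Pg handle, @names holds plain strings, and the "text" column type and the PID-plus-random table name are my assumptions:

# Rough sketch of the temporary-table approach above, driven from DBI.
my $table = sprintf 'temptable%d_%d', $$, int rand 1_000_000;
    # unique-ish name from PID + random number (add the hostname if needed)

$dbh->do("CREATE TABLE $table (name text)");

$dbh->begin_work;                        # batch the inserts in one transaction
my $ins = $dbh->prepare("INSERT INTO $table (name) VALUES (?)");
$ins->execute($_) for @names;            # one execute per record in @names
$dbh->commit;

my $sth = $dbh->prepare(<<EOSQL);
    SELECT j.job_log_id
    FROM job_log j, node n, $table t
    WHERE n.name    = t.name
    AND   n.type    = ?
    AND   n.usage   = ?
    AND   n.node_id = j.node_id
EOSQL
$sth->execute('B', 'LIVE');
while ( my ($job_log_id) = $sth->fetchrow_array ) {
    # process each job_log_id here
}

$dbh->do("DROP TABLE $table");

(DBI's execute_array, or PostgreSQL's COPY support in DBD::Pg, could speed up the bulk insert; the plain loop is just the simplest way to show the idea.)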


Where does @names come from?

Alexander

On 01.05.2006 14:21, Bealach Na Bo wrote:

> Hi folks,
>
> Apologies if this has been asked and resolved - I've spent half a day
> searching various archives, but have not found anything.
>
> My problem is the following:
>
> I have a very large PostgreSQL table (ca. 3 million records) and need to
> replace a nested SQL query with an "iterative" one, so I end up with:
>
>
>
> @names is a very large array - of the order of 100000 elements.
>
>
> sub prep_sql {
>    my $stmt = <<EOSQL;
>      SELECT j.job_log_id
>      FROM job_log j, node n
>      WHERE n.name in ?       <== (@names) should be here, but the DBI
>                                  documentation says the bind value
>                                  must be a scalar
>      AND   n.type = ?
>      AND   n.usage = ?
>      AND   n.node_id = j.node_id
> EOSQL
>
>    my @bind = (@names,'B','LIVE'); <=========can't use @names here
>    return ($stmt,@bind);
> } # end sub prep_sql
>
>
> I know how to use SQL subqueries to do this in one go in PostgreSQL,
> but I think this iterative approach might be a lot faster?
>
> Has anyone tried the same sort of thing?
>
> Regards,
>
> Bealach
>


--
Alexander Foken
mailto:[EMAIL PROTECTED]  http://www.foken.de/alexander/





--
Alexander Foken
mailto:[EMAIL PROTECTED]  http://www.foken.de/alexander/
