Hi kmx: What if you make a temporary table by selecting the subset of the table you want and then use pg_getcopydata to dump this entire temp table?
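
Roughly what I have in mind; $dbh here is an already-connected DBD::Pg handle, and the
table/column names are just placeholders:

    # materialize only the subset you want ...
    $dbh->do(q{
        CREATE TEMPORARY TABLE tmp_subset AS
        SELECT col1, col2
        FROM   big_table
        WHERE  some_condition
    });

    # ... then dump the whole temp table and read it back row by row
    $dbh->do('COPY tmp_subset TO STDOUT');
    my $row = '';
    while ($dbh->pg_getcopydata($row) != -1) {
        # $row is one tab-separated line of the temp table, ending in "\n"
    }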

Just a thought...

Regards,

  Doug Hunt

[email protected]
Software Engineer
UCAR - COSMIC, Tel. (303) 497-2611

On Fri, 14 Nov 2014, kmx wrote:

I have tried pg_getcopydata, but I was not able to make it better than my old
approach. After many tries it was still 15-20% slower.

My guess is that pg_getcopydata(..) might be significantly faster when dumping a
whole table (which I was not able to test, as the table in question was too big).
When dumping the result of an SQL query there seems to be no advantage.
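
For the record, dumping a query result this way just means wrapping the SELECT in the
COPY statement itself; a rough sketch (not the gist code, and $dbh / $sql_query are
placeholders):

    # COPY accepts an arbitrary query, not only a table name (PostgreSQL 8.2+)
    $dbh->do("COPY ($sql_query) TO STDOUT");
    # ... rows are then read back with the usual $dbh->pg_getcopydata($row) loop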

I have also slightly updated my "maybe module" at
https://gist.github.com/kmx/6f1234478828e7960fbd

--
kmx

On 12.11.2014 23:54, kmx wrote:
      Thanks, pg_getcopydata sounds very promising.

      I'll try to implement an alternative solution based on pg_getcopydata
      and compare it with my current approach.

      --
      kmx

      On 12.11.2014 16:48, Vikas N Kumar wrote:
      On 11/12/2014 07:43 AM, kmx wrote:
            my $dbh = DBI->connect($dsn);
            my $pdl = pdl($dbh->selectall_arrayref($sql_query));

            But it does not scale well for very large data (millions of rows).


      Hi KMX

      If you're using PostgreSQL you should use DBD::Pg's pg_getcopydata method with the
      "COPY mytable TO STDOUT" functionality for accessing millions of rows. You can do
      this in async or sync mode. This will get you there faster than using
      selectall_arrayref(), and it lets you get the rows without having to redesign your DB.
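
      A minimal synchronous sketch of that, assuming purely numeric columns ($dsn and
      mytable are placeholders):

            use PDL;
            use DBI;

            my $dbh = DBI->connect($dsn, undef, undef, { RaiseError => 1 });

            $dbh->do('COPY mytable TO STDOUT');

            my ($line, @data) = ('');
            while ($dbh->pg_getcopydata($line) != -1) {
                chomp $line;
                push @data, [ split /\t/, $line ];   # one tab-separated row per call
            }

            my $pdl = pdl(\@data);   # same shape as the selectall_arrayref result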

      SQLite has a stream API but I am not familiar with it.

      --Vikas




_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
