Hello using a "window" implemented via LIMIT OFFSET is not good - it is solution on some systems where cursors are not available, but it is bad solution on PostgreSQL. Use a cursor instead - it is significantly more efective with less memory requests.
Regards Pavel Stehule 2011/7/8 Viktor Bojović <viktor.bojo...@gmail.com>: > Thanx Wayne, > at the end i did it that way and it works. > The code is below. > CREATE FUNCTION pattern_counter1("patLength" integer) RETURNS character > varying > LANGUAGE plperl > AS $_X$ > my $rvCnt = spi_exec_query("select count(1) as cnt from entry"); > #my $rowCountAll = $rvCnt->{processed}; > my $row = $rvCnt->{rows}[0]; > my $rowCountAll = $row->{cnt}; > my $windowSize = 500000; > my %patterns=(); > for (my $p=0;$p<$rowCountAll;$p+=$windowSize){ > my $sql="select sequence from entry limit $windowSize offset $p"; > > my $rv = spi_exec_query($sql); > my $rowCount = $rv->{processed}; > my $patLen = $_[0]; > my $patt = ''; > > foreach my $rn (0 .. $rowCount -1){ > my $row = $rv->{rows}[$rn]; > my $seq = $row->{sequence}; > for (my $x = 1;$x<=length($seq) - $patLen;$x++){ > $patt=substr($seq,$x,$patLen); > if (! defined $patterns{$patt}) { > $patterns{$patt}=1; > }else{ > $patterns{$patt}++; > } > } > } > } > > foreach $patt (keys %patterns){ > my $sql="insert into patterns values('".$patt."',".$patterns{$patt}.")"; > spi_exec_query($sql); > } > return $tmp; > $_X$; > > > On Fri, Jul 8, 2011 at 8:50 PM, <lists-pg...@useunix.net> wrote: >> >> I'm have the same situation with large tables. Take a look at using a >> cursor to fetch several thousand rows at a time. I presume what's >> happening is that perl is attempting to create a massive list/array in >> memory. If you use a cursor the list should only contain X number of >> rows where X in the number specified at each fetch execution. You'll >> need to define the cursor inside a transaction block. >> >> - begin transaction >> - define the cursor >> - fetch rows from cursor >> - while row count from previous step > 0, execute previous step >> - terminate transaction >> >> Or you could use plpgsql instead of plperl, FOR loops over result sets in >> plpgsql implicitly use cursors... it's just a little less code. >> >> Hope that helps, >> Wayne >> >> On Tue, Jul 05, 2011 at 10:29:03PM +0200, Viktor Bojovi?? wrote: >> > Hi, >> > while reading 20GB table through PL/PERL function , it constantly grows >> > in >> > RAM. >> > I wanted to ask you which is the best way to read table inside that >> > function without such memory consumption. >> > Thanks in advance >> > >> > Code is here: >> > >> > CREATE FUNCTION pattern_counter("patLength" integer) >> > RETURNS varchar AS >> > $BODY$ >> > my $rv = spi_exec_query("select sequence from entry"); >> > my $rowCount = $rv->{processed}; >> > my $patLen = $_[0]; >> > my $patt = ''; >> > my %patterns=(); >> > foreach my $rn (0 .. $rowCount -1){ >> > my $row = $rv->{rows}[$rn]; >> > my $seq = $row->{sequence}; >> > for (my $x = 1;$x<=length($seq) - $patLen;$x++){ >> > $patt=substr($seq,$x,$patLen); >> > if (! defined $patterns{$patt}) { >> > $patterns{$patt}=1; >> > }else{ >> > $patterns{$patt}++; >> > } >> > } >> > } >> > foreach $patt (keys %patterns){ >> > my $sql="insert into patterns >> > values('".$patt."',".$patterns{$patt}.")"; >> > spi_exec_query($sql); >> > } >> > return ''; >> > $BODY$ >> > LANGUAGE plperl VOLATILE >> > COST 100; >> > >> > >> > >> > -- >> > --------------------------------------- >> > Viktor Bojovi?? >> > --------------------------------------- >> > Wherever I go, Murphy goes with me > > > > -- > --------------------------------------- > Viktor Bojović > --------------------------------------- > Wherever I go, Murphy goes with me > -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql