On May 22, 12:00 pm, valgog <[EMAIL PROTECTED]> wrote:
> I have rewritten the code like
>
>       existing_words_array :=     ARRAY( select word
>                                            from WORD_COUNTS
>                                           where word = ANY
> ( array_of_words ) );
>       not_existing_words_array := ARRAY( select distinct_word
>                                            from ( select distinct
> (array_of_words)[s.index] as distinct_word
>                                                     from
> generate_series(1, array_upper( array_of_words, 1 ) ) as s(index)
>                                                  ) as distinct_words
>                                           where distinct_word <> ALL
> ( existing_words_array ) );
>       -- insert the missing words
>       if not_existing_words_array is not null then
>         insert into WORD_COUNTS
>                     ( word, count )
>                     ( select word, 1
>                         from ( select
> not_existing_words_array[s.index] as word
>                                  from generate_series( 1,
> array_upper( not_existing_words_array, 1 ) ) as s(index) ) as
> distinct_words
>                      );
>       end if;
>       -- update the counts
>       if existing_words_array is not null then
>         update WORD_COUNTS
>            set count = COALESCE( count, 0 ) + 1
>          where sw_word = ANY ( existing_words_array );
>       end if;
>
> Now it processes a million records in 14 seconds... so it was probably
> the problem of looking up NOT IN WORD_COUNTS was way too expencive

Sorry... this code did not update anythig at all, as I forgot about
the NULL values... had to COALASCE practically everything and use
array_upper()... do not have the performance numbers of the insert,
updates yet...


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly

Reply via email to