I have a query that produce a different query plan if I put a trim in one
of the columns in the order by.


When i put the trim in any column it use hashaggregate and took 3 seconds
against 30 when not.


Is wear because the columns is clean not need to be trimmed, I have check
that.


The problem is that I can't change the query because is generate by the
mondrian.


I do research and found in postgres list that I need to crank work_mem up
high but don't work for me.



My postgresql.conf


# Add settings for extensions here


default_statistics_target = 50 # pgtune wizard 2014-06-04


maintenance_work_mem = 1GB # pgtune wizard 2014-06-04


constraint_exclusion = on # pgtune wizard 2014-06-04


checkpoint_completion_target = 0.9 # pgtune wizard 2014-06-04


effective_cache_size = 44GB # pgtune wizard 2014-06-04


work_mem = 1536MB # pgtune wizard 2014-06-04


#work_mem = 16GB # I have try this but don't work


wal_buffers = 32MB # pgtune wizard 2014-06-04


checkpoint_segments = 16 # pgtune wizard 2014-06-04


shared_buffers = 15GB # pgtune wizard 2014-06-04


max_connections = 20 # pgtune wizard 2014-06-04




___________________________________________________




Query with trim


SELECT "dim_cliente"."tipocliente" AS "c0",


       "dim_cliente"."a1_ibge" AS "c1",


       "dim_cliente"."a1_cod" AS "c2",


       "dim_cliente"."a1_nome" AS "c3",


       "dim_vendedor"."a3_nome" AS "c4"


  FROM "public"."dim_cliente" AS "dim_cliente",


       "public"."fato_ventas_productos" AS "fato_ventas_productos",


       "public"."dim_vendedor" AS "dim_vendedor"


 WHERE "fato_ventas_productos"."key_cliente" = "dim_cliente"."key_cliente"


   AND "fato_ventas_productos"."key_vendedor" =
"dim_vendedor"."key_vendedor"


 GROUP


    BY "dim_cliente"."tipocliente" ,


       "dim_cliente"."a1_ibge",


       "dim_cliente"."a1_cod",


       "dim_cliente"."a1_nome",


       "dim_vendedor"."a3_nome"


 ORDER


    BY trim("dim_cliente"."tipocliente") ASC NULLS LAST,


       "dim_cliente"."a1_ibge" ASC NULLS LAST, -- the same result if I put
the trim here


       "dim_cliente"."a1_cod" ASC NULLS LAST, -- or here


       "dim_cliente"."a1_nome" ASC NULLS LAST; -- or here


-- this query took 3845.895 ms




___________________________________________________




Query Plan when using trim



 QUERY PLAN



---------------------------------------------------------------------------------------------------------------------------------------------------------------------


 Sort  (cost=193101.41..195369.80 rows=907357 width=129) (actual
time=3828.176..3831.261 rows=43615 loops=1)


   Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome,
(btrim((dim_cliente.tipocliente)::text))


   Sort Key: (btrim((dim_cliente.tipocliente)::text)), dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome


   Sort Method: quicksort  Memory: 13121kB


   ->  HashAggregate  (cost=91970.52..103312.49 rows=907357 width=129)
(actual time=2462.690..2496.729 rows=43615 loops=1)


         Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome,
btrim((dim_cliente.tipocliente)::text)


         ->  Hash Join  (cost=856.30..80628.56 rows=907357 width=129)
(actual time=29.524..1533.880 rows=907357 loops=1)


               Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome


               Hash Cond: (fato_ventas_productos.key_vendedor =
dim_vendedor.key_vendedor)


               ->  Hash Join  (cost=830.02..68126.13 rows=907357 width=86)
(actual time=28.746..1183.691 rows=907357 loops=1)


                     Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome, fato_ventas_productos.key_vendedor


                     Hash Cond: (fato_ventas_productos.key_cliente =
dim_cliente.key_cliente)


                     ->  Seq Scan on public.fato_ventas_productos
 (cost=0.00..46880.57 rows=907357 width=16) (actual time=0.004..699.779
rows=907357 loops=1)


                           Output: fato_ventas_productos.key_cliente,
fato_ventas_productos.key_vendedor


                     ->  Hash  (cost=618.90..618.90 rows=16890 width=86)
(actual time=28.699..28.699 rows=16890 loops=1)


                           Output: dim_cliente.tipocliente,
dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome,
dim_cliente.key_cliente


                           Buckets: 2048  Batches: 1  Memory Usage: 1980kB


                           ->  Seq Scan on public.dim_cliente
 (cost=0.00..618.90 rows=16890 width=86) (actual time=0.008..16.537
rows=16890 loops=1)


                                 Output: dim_cliente.tipocliente,
dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome,
dim_cliente.key_cliente


               ->  Hash  (cost=18.90..18.90 rows=590 width=59) (actual
time=0.747..0.747 rows=590 loops=1)


                     Output: dim_vendedor.a3_nome, dim_vendedor.key_vendedor


                     Buckets: 1024  Batches: 1  Memory Usage: 56kB


                     ->  Seq Scan on public.dim_vendedor  (cost=0.00..18.90
rows=590 width=59) (actual time=0.026..0.423 rows=590 loops=1)


                           Output: dim_vendedor.a3_nome,
dim_vendedor.key_vendedor


 Total runtime: 3845.895 ms


(25 filas)



___________________________________________________


Query without trim

SELECT "dim_cliente"."tipocliente" AS "c0",

       "dim_cliente"."a1_ibge" AS "c1",

       "dim_cliente"."a1_cod" AS "c2",

       "dim_cliente"."a1_nome" AS "c3",

       "dim_vendedor"."a3_nome" AS "c4"

  FROM "public"."dim_cliente" AS "dim_cliente",

       "public"."fato_ventas_productos" AS "fato_ventas_productos",

       "public"."dim_vendedor" AS "dim_vendedor"

 WHERE "fato_ventas_productos"."key_cliente" = "dim_cliente"."key_cliente"

   AND "fato_ventas_productos"."key_vendedor" =
"dim_vendedor"."key_vendedor"

 GROUP

    BY "dim_cliente"."tipocliente" ,

       "dim_cliente"."a1_ibge",

       "dim_cliente"."a1_cod",

       "dim_cliente"."a1_nome",

       "dim_vendedor"."a3_nome"

 ORDER

    BY "dim_cliente"."tipocliente" ASC NULLS LAST,

       "dim_cliente"."a1_ibge" ASC NULLS LAST,

       "dim_cliente"."a1_cod" ASC NULLS LAST,

       "dim_cliente"."a1_nome" ASC NULLS LAST;

-- this query took 37249.268 ms


___________________________________________________


Query Plan when not using trim


QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------------------------------------

 Group  (cost=170417.48..184027.84 rows=907357 width=129) (actual
time=36649.329..37235.158 rows=43615 loops=1)

   Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome

   ->  Sort  (cost=170417.48..172685.88 rows=907357 width=129) (actual
time=36649.315..36786.760 rows=907357 loops=1)

         Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome

         Sort Key: dim_cliente.tipocliente, dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome

         Sort Method: quicksort  Memory: 265592kB

         ->  Hash Join  (cost=856.30..80628.56 rows=907357 width=129)
(actual time=26.719..1593.693 rows=907357 loops=1)

               Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome

               Hash Cond: (fato_ventas_productos.key_vendedor =
dim_vendedor.key_vendedor)

               ->  Hash Join  (cost=830.02..68126.13 rows=907357 width=86)
(actual time=25.980..1203.775 rows=907357 loops=1)

                     Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
dim_cliente.a1_cod, dim_cliente.a1_nome, fato_ventas_productos.key_vendedor

                     Hash Cond: (fato_ventas_productos.key_cliente =
dim_cliente.key_cliente)

                     ->  Seq Scan on public.fato_ventas_productos
 (cost=0.00..46880.57 rows=907357 width=16) (actual time=0.004..680.283
rows=907357 loops=1)

                           Output: fato_ventas_productos.key_cliente,
fato_ventas_productos.key_vendedor

                     ->  Hash  (cost=618.90..618.90 rows=16890 width=86)
(actual time=25.931..25.931 rows=16890 loops=1)

                           Output: dim_cliente.tipocliente,
dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome,
dim_cliente.key_cliente

                           Buckets: 2048  Batches: 1  Memory Usage: 1980kB

                           ->  Seq Scan on public.dim_cliente
 (cost=0.00..618.90 rows=16890 width=86) (actual time=0.005..13.736
rows=16890 loops=1)

                                 Output: dim_cliente.tipocliente,
dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome,
dim_cliente.key_cliente

               ->  Hash  (cost=18.90..18.90 rows=590 width=59) (actual
time=0.715..0.715 rows=590 loops=1)

                     Output: dim_vendedor.a3_nome, dim_vendedor.key_vendedor

                     Buckets: 1024  Batches: 1  Memory Usage: 56kB

                     ->  Seq Scan on public.dim_vendedor  (cost=0.00..18.90
rows=590 width=59) (actual time=0.024..0.405 rows=590 loops=1)

                           Output: dim_vendedor.a3_nome,
dim_vendedor.key_vendedor

 Total runtime: 37249.268 ms

(25 filas)


___________________________________________________


Is anything that I can do to solve this problem, is that a bug or a config
problem?


Here the link with a dump of the tables

https://drive.google.com/file/d/0Bwupj61i9BtWZ1NiVXltaWc0dnM/view?usp=sharing


I appreciate your help

Reply via email to