>>Achei sua pergunta sobre ferramentas externas um pouco agressiva para o
momento e isso não ajuda nada.
Não foi minha intenção ser agressivo, sou muito grato pelas ajudas e venho
aprendendo bastante com essa lista,
não tenho experiência com TEXT SEARCH, por isso toquei no assunto do solr,
para saber qual a opnião de vocês.

>>1) Qual a versão do PostgreSQL
9.3.2

>>Ainda não foi (ou eu perdi?) as informações do hardware (principalmente
memória disponível) e qual SO
>>e o Sistema Operacional? Hardware?
Windows 8.1 64 bits, intel i7 2,1Ghz, 8GB ram, hd sata (é apenas um
ambiente de testes)

>>O PostgreSQL depende muito do cache de SO, tanto que devemos configurar o
effective_cache_size corretamente também.
effective_cache_size 3584MB
shared_buffers 3584MB

>>2) O EXPLAIN ANALYZE da consulta, nos dois momentos, em que ela é boa e
que ela é ruim (na sua visão).
Primeira consulta:
"Limit  (cost=27797.55..27797.60 rows=20 width=1161) (actual
time=130309.614..130309.616 rows=20 loops=1)"
"  ->  Sort  (cost=27797.55..27817.55 rows=8000 width=1161) (actual
time=130309.612..130309.612 rows=20 loops=1)"
"        Sort Key: (ts_rank(a.texto_search, query.query)), b.nome"
"        Sort Method: top-N heapsort  Memory: 35kB"
"        ->  Hash Join  (cost=144.67..27584.68 rows=8000 width=1161)
(actual time=132.512..130197.345 rows=97361 loops=1)"
"              Hash Cond: (a.codigo_bairro = b.codigo)"
"              ->  Nested Loop  (cost=110.00..27420.00 rows=8000
width=1142) (actual time=132.034..129042.422 rows=97361 loops=1)"
"                    ->  Function Scan on query  (cost=0.00..0.01 rows=1
width=32) (actual time=0.003..0.005 rows=1 loops=1)"
"                    ->  Bitmap Heap Scan on anuncio a
 (cost=110.00..27339.99 rows=8000 width=1110) (actual
time=132.021..126777.915 rows=97361 loops=1)"
"                          Recheck Cond: (texto_search @@ query.query)"
"                          ->  Bitmap Index Scan on idx_gin_texto_search
 (cost=0.00..108.00 rows=8000 width=0) (actual time=108.097..108.097
rows=97361 loops=1)"
"                                Index Cond: (texto_search @@ query.query)"
"              ->  Hash  (cost=20.41..20.41 rows=1141 width=23) (actual
time=0.456..0.456 rows=1141 loops=1)"
"                    Buckets: 1024  Batches: 1  Memory Usage: 63kB"
"                    ->  Seq Scan on bairro b  (cost=0.00..20.41 rows=1141
width=23) (actual time=0.008..0.201 rows=1141 loops=1)"
"Total runtime: 130310.497 ms"
Segunda consulta:
"Limit  (cost=27797.55..27797.60 rows=20 width=1161) (actual
time=343.553..343.556 rows=20 loops=1)"
"  ->  Sort  (cost=27797.55..27817.55 rows=8000 width=1161) (actual
time=343.551..343.552 rows=20 loops=1)"
"        Sort Key: (ts_rank(a.texto_search, query.query)), b.nome"
"        Sort Method: top-N heapsort  Memory: 35kB"
"        ->  Hash Join  (cost=144.67..27584.68 rows=8000 width=1161)
(actual time=57.589..286.643 rows=97361 loops=1)"
"              Hash Cond: (a.codigo_bairro = b.codigo)"
"              ->  Nested Loop  (cost=110.00..27420.00 rows=8000
width=1142) (actual time=57.107..211.125 rows=97361 loops=1)"
"                    ->  Function Scan on query  (cost=0.00..0.01 rows=1
width=32) (actual time=0.004..0.005 rows=1 loops=1)"
"                    ->  Bitmap Heap Scan on anuncio a
 (cost=110.00..27339.99 rows=8000 width=1110) (actual time=57.089..180.745
rows=97361 loops=1)"
"                          Recheck Cond: (texto_search @@ query.query)"
"                          ->  Bitmap Index Scan on idx_gin_texto_search
 (cost=0.00..108.00 rows=8000 width=0) (actual time=32.931..32.931
rows=97361 loops=1)"
"                                Index Cond: (texto_search @@ query.query)"
"              ->  Hash  (cost=20.41..20.41 rows=1141 width=23) (actual
time=0.461..0.461 rows=1141 loops=1)"
"                    Buckets: 1024  Batches: 1  Memory Usage: 63kB"
"                    ->  Seq Scan on bairro b  (cost=0.00..20.41 rows=1141
width=23) (actual time=0.008..0.198 rows=1141 loops=1)"
"Total runtime: 344.407 ms"

>>Nesse caso acho melhor mesmo usar `EXPLAIN (ANALYZE, VERBOSE, BUFFERS)`
(dependendo da versão), assim podemos analisar melhor o efeito de cache do
PostgreSQL.
Primeira consulta:
"Limit  (cost=27797.55..27797.60 rows=20 width=1161) (actual
time=46109.061..46109.064 rows=20 loops=1)"
"  Output: a.tipo, a.inicio, a.texto, b.nome, (ts_rank(a.texto_search,
query.query))"
"  Buffers: shared hit=50820 read=2055"
"  ->  Sort  (cost=27797.55..27817.55 rows=8000 width=1161) (actual
time=46109.059..46109.060 rows=20 loops=1)"
"        Output: a.tipo, a.inicio, a.texto, b.nome,
(ts_rank(a.texto_search, query.query))"
"        Sort Key: (ts_rank(a.texto_search, query.query)), b.nome"
"        Sort Method: top-N heapsort  Memory: 36kB"
"        Buffers: shared hit=50820 read=2055"
"        ->  Hash Join  (cost=144.67..27584.68 rows=8000 width=1161)
(actual time=92.188..46046.735 rows=58844 loops=1)"
"              Output: a.tipo, a.inicio, a.texto, b.nome,
ts_rank(a.texto_search, query.query)"
"              Hash Cond: (a.codigo_bairro = b.codigo)"
"              Buffers: shared hit=50820 read=2055"
"              ->  Nested Loop  (cost=110.00..27420.00 rows=8000
width=1142) (actual time=91.696..45035.989 rows=58844 loops=1)"
"                    Output: a.tipo, a.inicio, a.texto, a.texto_search,
a.codigo_bairro, query.query"
"                    Buffers: shared hit=50811 read=2055"
"                    ->  Function Scan on query  (cost=0.00..0.01 rows=1
width=32) (actual time=0.003..0.004 rows=1 loops=1)"
"                          Output: query.query"
"                          Function Call: '''propri'''::tsquery"
"                    ->  Bitmap Heap Scan on public.anuncio a
 (cost=110.00..27339.99 rows=8000 width=1110) (actual
time=91.682..43901.170 rows=58844 loops=1)"
"                          Output: a.id, a.id_usuario, a.codigo_bairro,
a.tipo, a.texto, a.inicio, a.texto_search"
"                          Recheck Cond: (a.texto_search @@ query.query)"
"                          Buffers: shared hit=50811 read=2055"
"                          ->  Bitmap Index Scan on idx_gin_texto_search
 (cost=0.00..108.00 rows=8000 width=0) (actual time=77.595..77.595
rows=58844 loops=1)"
"                                Index Cond: (a.texto_search @@
query.query)"
"                                Buffers: shared hit=3 read=45"
"              ->  Hash  (cost=20.41..20.41 rows=1141 width=23) (actual
time=0.471..0.471 rows=1141 loops=1)"
"                    Output: b.nome, b.codigo"
"                    Buckets: 1024  Batches: 1  Memory Usage: 63kB"
"                    Buffers: shared hit=9"
"                    ->  Seq Scan on public.bairro b  (cost=0.00..20.41
rows=1141 width=23) (actual time=0.007..0.191 rows=1141 loops=1)"
"                          Output: b.nome, b.codigo"
"                          Buffers: shared hit=9"
"Total runtime: 46109.779 ms"
Segunda consulta:
"Limit  (cost=27797.55..27797.60 rows=20 width=1161) (actual
time=232.074..232.077 rows=20 loops=1)"
"  Output: a.tipo, a.inicio, a.texto, b.nome, (ts_rank(a.texto_search,
query.query))"
"  Buffers: shared hit=52875"
"  ->  Sort  (cost=27797.55..27817.55 rows=8000 width=1161) (actual
time=232.071..232.072 rows=20 loops=1)"
"        Output: a.tipo, a.inicio, a.texto, b.nome,
(ts_rank(a.texto_search, query.query))"
"        Sort Key: (ts_rank(a.texto_search, query.query)), b.nome"
"        Sort Method: top-N heapsort  Memory: 36kB"
"        Buffers: shared hit=52875"
"        ->  Hash Join  (cost=144.67..27584.68 rows=8000 width=1161)
(actual time=35.060..195.143 rows=58844 loops=1)"
"              Output: a.tipo, a.inicio, a.texto, b.nome,
ts_rank(a.texto_search, query.query)"
"              Hash Cond: (a.codigo_bairro = b.codigo)"
"              Buffers: shared hit=52875"
"              ->  Nested Loop  (cost=110.00..27420.00 rows=8000
width=1142) (actual time=34.546..143.135 rows=58844 loops=1)"
"                    Output: a.tipo, a.inicio, a.texto, a.texto_search,
a.codigo_bairro, query.query"
"                    Buffers: shared hit=52866"
"                    ->  Function Scan on query  (cost=0.00..0.01 rows=1
width=32) (actual time=0.004..0.006 rows=1 loops=1)"
"                          Output: query.query"
"                          Function Call: '''propri'''::tsquery"
"                    ->  Bitmap Heap Scan on public.anuncio a
 (cost=110.00..27339.99 rows=8000 width=1110) (actual time=34.528..123.412
rows=58844 loops=1)"
"                          Output: a.id, a.id_usuario, a.codigo_bairro,
a.tipo, a.texto, a.inicio, a.texto_search"
"                          Recheck Cond: (a.texto_search @@ query.query)"
"                          Buffers: shared hit=52866"
"                          ->  Bitmap Index Scan on idx_gin_texto_search
 (cost=0.00..108.00 rows=8000 width=0) (actual time=20.198..20.198
rows=58844 loops=1)"
"                                Index Cond: (a.texto_search @@
query.query)"
"                                Buffers: shared hit=48"
"              ->  Hash  (cost=20.41..20.41 rows=1141 width=23) (actual
time=0.478..0.478 rows=1141 loops=1)"
"                    Output: b.nome, b.codigo"
"                    Buckets: 1024  Batches: 1  Memory Usage: 63kB"
"                    Buffers: shared hit=9"
"                    ->  Seq Scan on public.bairro b  (cost=0.00..20.41
rows=1141 width=23) (actual time=0.008..0.218 rows=1141 loops=1)"
"                          Output: b.nome, b.codigo"
"                          Buffers: shared hit=9"
"Total runtime: 232.811 ms"


Atenciosamente,

Alessandro Lima
email [email protected]
_______________________________________________
pgbr-geral mailing list
[email protected]
https://listas.postgresql.org.br/cgi-bin/mailman/listinfo/pgbr-geral

Responder a