[PERFORM] optimization join on random value

2015-05-03 Thread Anton Bushmelev
 Hello guru of postgres,  it's possoble to tune query with join on random
string ?
i know that it is not real life example, but i need it for tests.

soe=# explain
soe-#  SELECT   ADDRESS_ID,
soe-#   CUSTOMER_ID,
soe-#   DATE_CREATED,
soe-#   HOUSE_NO_OR_NAME,
soe-#   STREET_NAME,
soe-#   TOWN,
soe-#   COUNTY,
soe-#   COUNTRY,
soe-#   POST_CODE,
soe-#   ZIP_CODE
soe-# FROM ADDRESSES
soe-# WHERE customer_id = trunc( random()*45000) ;
QUERY
PLAN
---
 Seq Scan on addresses  (cost=0.00..165714.00 rows=22500 width=84)
   Filter: ((customer_id)::double precision = trunc((random() *
45000::double precision)))
(2 rows)

soe=# \d addresses;
soe=# \d addresses;
  Table
public.addresses

  Column  |Type |
Modifiers

--+-+---

 address_id   | bigint  | not
null
 customer_id  | bigint  | not
null
 date_created | timestamp without time zone | not
null
 house_no_or_name | character varying(60)
|

 street_name  | character varying(60)
|

 town | character varying(60)
|

 county   | character varying(60)
|

 country  | character varying(60)
|

 post_code| character varying(12)
|

 zip_code | character varying(12)
|

Indexes:

addresses_pkey PRIMARY KEY, btree
(address_id)

addresses_cust_ix btree
(customer_id)

Foreign-key
constraints:

add_cust_fk FOREIGN KEY (customer_id) REFERENCES
customers(customer_id) DEFERRABLE



same query in oracle same query use index access path:

00:05:23 (1)c##bushmelev_aa@orcl explain plan for
 SELECT   ADDRESS_ID,
  CUSTOMER_ID,
  DATE_CREATED,
  HOUSE_NO_OR_NAME,
  STREET_NAME,
  TOWN,
  COUNTY,
  COUNTRY,
  POST_CODE,
  ZIP_CODE
FROM soe.ADDRESSES
* WHERE customer_id = dbms_random.value ();*

Explained.

Elapsed: 00:00:00.05
00:05:29 (1)c##bushmelev_aa@orcl @utlxpls

PLAN_TABLE_OUTPUT
--
Plan hash value: 317664678

---
| Id  | Operation   | Name| Rows  | Bytes |
Cost (%CPU)| Time |
---
|   0 | SELECT STATEMENT| | 2 |   150 |
5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ADDRESSES   | 2 |   150 |
5   (0)| 00:00:01 |
|*  2 |   *INDEX RANGE SCAN * | ADDRESS_CUST_IX | 2 |
| 3   (0)| 00:00:01 |
---

Predicate Information (identified by operation id):
---
   2 - access(CUSTOMER_ID=DBMS_RANDOM.VALUE())


Re: [PERFORM] optimization join on random value

2015-05-03 Thread Heikki Linnakangas

On 05/04/2015 12:23 AM, Anton Bushmelev wrote:

  Hello guru of postgres,  it's possoble to tune query with join on random
string ?
i know that it is not real life example, but i need it for tests.

soe=# explain
soe-#  SELECT   ADDRESS_ID,
soe-#   CUSTOMER_ID,
soe-#   DATE_CREATED,
soe-#   HOUSE_NO_OR_NAME,
soe-#   STREET_NAME,
soe-#   TOWN,
soe-#   COUNTY,
soe-#   COUNTRY,
soe-#   POST_CODE,
soe-#   ZIP_CODE
soe-# FROM ADDRESSES
soe-# WHERE customer_id = trunc( random()*45000) ;
 QUERY
PLAN
---
  Seq Scan on addresses  (cost=0.00..165714.00 rows=22500 width=84)
Filter: ((customer_id)::double precision = trunc((random() *
45000::double precision)))
(2 rows)



There are two problems here that prohibit the index from being used:

1. random() is volatile, so it's recalculated for each row.
2. For the comparison, customer_id is cast to a float, and the index is 
on the bigint value.


To work around the first problem, put the random() call inside a 
subquery. And for the second problem, cast to bigint.


SELECT ... FROM addresses
WHERE customer_id = (SELECT random()*45000)::bigint

- Heikki



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance