I have the following index created:
CELL_CREATE_INDEX = '''
CREATE INDEX IF NOT EXISTS idx_cell_enodeb_id ON erilte01_cell
(PERIOD_START_TIME,REGION_ID)'''
So I used now the created INDEX and the EXPLAIN tells me I used it:
['PLAN']
["SELECT\n P__Z0.PERIOD_START_TIME AS __C0_0,\n P__Z0.REGION_ID AS
__C0_1,\n P__Z0.MARKET_ID AS __C0_2,\n
SUM(P__Z0.EUTRANCELLFDD_PMUETHPTIMEDL) AS __C0_3,\n
SUM(P__Z0.EUTRANCELLFDD_PMPDCPVOLDLDRB) AS __C0_4,\n
SUM(P__Z0.EUTRANCELLFDD_PMPDCPVOLDLDRBLASTTTI) AS __C0_5\nFROM
PUBLIC.ERILTE01_CELL P__Z0 USE INDEX (IDX_CELL_ENODEB_ID)\n /*
PUBLIC.IDX_CELL_ENODEB_ID: PERIOD_START_TIME < TIMESTAMP '2020-08-04
00:00:00'\n AND REGION_ID = 'NORTHEAST'\n AND
PERIOD_START_TIME >= TIMESTAMP '2020-08-03 00:00:00'\n */\nWHERE
(UPPER(P__Z0.MEASUREMENTNAME) = 'EUTRANCELLFDD')\n AND ((P__Z0.DATALEVEL
= 'RAW')\n AND ((P__Z0.DATATYPE = 'RAW')\n AND
((P__Z0.PERIOD_START_TIME < TIMESTAMP '2020-08-04 00:00:00')\n AND
((P__Z0.REGION_ID = 'NORTHEAST')\n AND (P__Z0.PERIOD_START_TIME >=
TIMESTAMP '2020-08-03 00:00:00')))))\nGROUP BY P__Z0.PERIOD_START_TIME,
P__Z0.REGION_ID, P__Z0.MARKET_ID"]
['SELECT\n EUTRANCELLFDD__Z1.PERIOD_START_TIME AS PERIOD_START_TIME,\n
EUTRANCELLFDD__Z1.REGION,\n EUTRANCELLFDD__Z1.MARKET,\n
ROUND(DECODE((NVL(EUTRANCELLFDD__Z1.PMUETHPTIMEDL, 0) / 1000), 0, 0,
((NVL(EUTRANCELLFDD__Z1.PMPDCPVOLDLDRB, 0) -
NVL(EUTRANCELLFDD__Z1.PMPDCPVOLDLDRBLASTTTI, 0)) /
(NVL(EUTRANCELLFDD__Z1.PMUETHPTIMEDL, 0) / 1000))), 6) AS DLTHROUGHPU\nFROM
(\n SELECT\n __C0_0 AS PERIOD_START_TIME,\n __C0_1 AS
REGION,\n __C0_2 AS MARKET,\n CAST(CAST(SUM(__C0_3) AS DOUBLE)
AS DOUBLE) AS PMUETHPTIMEDL,\n CAST(CAST(SUM(__C0_4) AS DOUBLE) AS
DOUBLE) AS PMPDCPVOLDLDRB,\n CAST(CAST(SUM(__C0_5) AS DOUBLE) AS
DOUBLE) AS PMPDCPVOLDLDRBLASTTTI\n FROM PUBLIC.__T0\n GROUP BY __C0_0,
__C0_1, __C0_2\n) EUTRANCELLFDD__Z1\n /* SELECT\n __C0_0 AS
PERIOD_START_TIME,\n __C0_1 AS REGION,\n __C0_2 AS MARKET,\n
CAST(CAST(SUM(__C0_3) AS DOUBLE) AS DOUBLE) AS PMUETHPTIMEDL,\n
CAST(CAST(SUM(__C0_4) AS DOUBLE) AS DOUBLE) AS PMPDCPVOLDLDRB,\n
CAST(CAST(SUM(__C0_5) AS DOUBLE) AS DOUBLE) AS PMPDCPVOLDLDRBLASTTTI\n
FROM PUBLIC.__T0\n /++ PUBLIC."merge_scan" ++/\n GROUP BY __C0_0,
__C0_1, __C0_2\n */']
Here is the query on 2.5Million rows, on a table that has ~1500 columns:
ALL_QUERY = '''select
EUTRANCELLFDD.period_start_time period_start_time,
EUTRANCELLFDD.REGION,
EUTRANCELLFDD.MARKET,
round(DECODE((NVL(EUTRANCELLFDD.PMUETHPTIMEDL,0)/1000), 0,
0,(NVL(EUTRANCELLFDD.PMPDCPVOLDLDRB,0)-NVL(EUTRANCELLFDD.PMPDCPVOLDLDRBLASTTTI,0))/(NVL(EUTRANCELLFDD.PMUETHPTIMEDL,0)/1000)),
6) DLThroughpu
from
(
select
period_start_time,
p.REGION_ID REGION,
p.MARKET_ID MARKET,
SUM(EUTRANCELLFDD_PMUETHPTIMEDL) PMUETHPTIMEDL,
SUM(EUTRANCELLFDD_PMPDCPVOLDLDRB) PMPDCPVOLDLDRB,
SUM(EUTRANCELLFDD_PMPDCPVOLDLDRBLASTTTI) PMPDCPVOLDLDRBLASTTTI
from
erilte01_cell p
USE INDEX(idx_cell_enodeb_id)
where
p.REGION_ID in ( 'NORTHEAST' ) and
period_start_time >= '2020-08-03 00:00:00' and
period_start_time < '2020-08-04 00:00:00' and
p.datatype ='RAW' and
p.datalevel ='RAW' and
UPPER(p.measurementname) = UPPER('EUTRANCELLFDD')
group by
period_start_time,
p.REGION_ID,
p.MARKET_ID) EUTRANCELLFDD'''
And it returns in about 7 seconds in a 18Host cluster with about 3TB memory.
It seems way to slow for me.
I am loading via SQL interface. Is there a performance difference reading if
I use the cache ?
And nothing is persistent.
AL
--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/