I have the following index created:
CELL_CREATE_INDEX = '''
CREATE INDEX IF NOT EXISTS idx_cell_enodeb_id ON erilte01_cell
(PERIOD_START_TIME,REGION_ID)'''

So I used now the created INDEX and the EXPLAIN tells me I used it:

['PLAN']
["SELECT\n    P__Z0.PERIOD_START_TIME AS __C0_0,\n    P__Z0.REGION_ID AS
__C0_1,\n    P__Z0.MARKET_ID AS __C0_2,\n   
SUM(P__Z0.EUTRANCELLFDD_PMUETHPTIMEDL) AS __C0_3,\n   
SUM(P__Z0.EUTRANCELLFDD_PMPDCPVOLDLDRB) AS __C0_4,\n   
SUM(P__Z0.EUTRANCELLFDD_PMPDCPVOLDLDRBLASTTTI) AS __C0_5\nFROM
PUBLIC.ERILTE01_CELL P__Z0 USE INDEX (IDX_CELL_ENODEB_ID)\n    /*
PUBLIC.IDX_CELL_ENODEB_ID: PERIOD_START_TIME < TIMESTAMP '2020-08-04
00:00:00'\n        AND REGION_ID = 'NORTHEAST'\n        AND
PERIOD_START_TIME >= TIMESTAMP '2020-08-03 00:00:00'\n     */\nWHERE
(UPPER(P__Z0.MEASUREMENTNAME) = 'EUTRANCELLFDD')\n    AND ((P__Z0.DATALEVEL
= 'RAW')\n    AND ((P__Z0.DATATYPE = 'RAW')\n    AND
((P__Z0.PERIOD_START_TIME < TIMESTAMP '2020-08-04 00:00:00')\n    AND
((P__Z0.REGION_ID = 'NORTHEAST')\n    AND (P__Z0.PERIOD_START_TIME >=
TIMESTAMP '2020-08-03 00:00:00')))))\nGROUP BY P__Z0.PERIOD_START_TIME,
P__Z0.REGION_ID, P__Z0.MARKET_ID"]
['SELECT\n    EUTRANCELLFDD__Z1.PERIOD_START_TIME AS PERIOD_START_TIME,\n   
EUTRANCELLFDD__Z1.REGION,\n    EUTRANCELLFDD__Z1.MARKET,\n   
ROUND(DECODE((NVL(EUTRANCELLFDD__Z1.PMUETHPTIMEDL, 0) / 1000), 0, 0,
((NVL(EUTRANCELLFDD__Z1.PMPDCPVOLDLDRB, 0) -
NVL(EUTRANCELLFDD__Z1.PMPDCPVOLDLDRBLASTTTI, 0)) /
(NVL(EUTRANCELLFDD__Z1.PMUETHPTIMEDL, 0) / 1000))), 6) AS DLTHROUGHPU\nFROM
(\n    SELECT\n        __C0_0 AS PERIOD_START_TIME,\n        __C0_1 AS
REGION,\n        __C0_2 AS MARKET,\n        CAST(CAST(SUM(__C0_3) AS DOUBLE)
AS DOUBLE) AS PMUETHPTIMEDL,\n        CAST(CAST(SUM(__C0_4) AS DOUBLE) AS
DOUBLE) AS PMPDCPVOLDLDRB,\n        CAST(CAST(SUM(__C0_5) AS DOUBLE) AS
DOUBLE) AS PMPDCPVOLDLDRBLASTTTI\n    FROM PUBLIC.__T0\n    GROUP BY __C0_0,
__C0_1, __C0_2\n) EUTRANCELLFDD__Z1\n    /* SELECT\n        __C0_0 AS
PERIOD_START_TIME,\n        __C0_1 AS REGION,\n        __C0_2 AS MARKET,\n      
 
CAST(CAST(SUM(__C0_3) AS DOUBLE) AS DOUBLE) AS PMUETHPTIMEDL,\n       
CAST(CAST(SUM(__C0_4) AS DOUBLE) AS DOUBLE) AS PMPDCPVOLDLDRB,\n       
CAST(CAST(SUM(__C0_5) AS DOUBLE) AS DOUBLE) AS PMPDCPVOLDLDRBLASTTTI\n   
FROM PUBLIC.__T0\n        /++ PUBLIC."merge_scan" ++/\n    GROUP BY __C0_0,
__C0_1, __C0_2\n     */']

Here is the query on 2.5Million rows, on a table that has ~1500 columns:
ALL_QUERY = '''select
        EUTRANCELLFDD.period_start_time period_start_time,
        EUTRANCELLFDD.REGION,
        EUTRANCELLFDD.MARKET,
        round(DECODE((NVL(EUTRANCELLFDD.PMUETHPTIMEDL,0)/1000), 0,
0,(NVL(EUTRANCELLFDD.PMPDCPVOLDLDRB,0)-NVL(EUTRANCELLFDD.PMPDCPVOLDLDRBLASTTTI,0))/(NVL(EUTRANCELLFDD.PMUETHPTIMEDL,0)/1000)),
6) DLThroughpu
      from

      (
      select
        period_start_time,
        p.REGION_ID REGION,
        p.MARKET_ID MARKET,
        SUM(EUTRANCELLFDD_PMUETHPTIMEDL) PMUETHPTIMEDL,
        SUM(EUTRANCELLFDD_PMPDCPVOLDLDRB) PMPDCPVOLDLDRB,
        SUM(EUTRANCELLFDD_PMPDCPVOLDLDRBLASTTTI) PMPDCPVOLDLDRBLASTTTI

      from
        erilte01_cell p
        USE INDEX(idx_cell_enodeb_id)
      where
         p.REGION_ID in ( 'NORTHEAST' )  and
        period_start_time >= '2020-08-03 00:00:00' and
        period_start_time < '2020-08-04 00:00:00'   and
        p.datatype ='RAW' and
        p.datalevel ='RAW' and
        UPPER(p.measurementname) = UPPER('EUTRANCELLFDD')
      group by
        period_start_time,
        p.REGION_ID,
        p.MARKET_ID) EUTRANCELLFDD'''

And it returns in about 7 seconds in a 18Host cluster with about 3TB memory.
It seems way to slow for me. 

I am loading via SQL interface. Is there a performance difference reading if
I use the cache ?
And nothing is persistent.
AL



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Reply via email to