Well, it uses index now: /* PUBLIC.IDX_CELL_ENODEB_ID: PERIOD_START_TIME < TIMESTAMP '2020-08-04 00:00:00'\n AND REGION_ID = 'NORTHEAST'\n AND PERIOD_START_TIME >= TIMESTAMP '2020-08-03 00:00:00'\n */
Is it better now, after using the hint for index? How much time was it taking before? Evgenii ср, 12 авг. 2020 г. в 16:23, Axel Luft <axel.l...@t-mobile.com>: > I have the following index created: > CELL_CREATE_INDEX = ''' > CREATE INDEX IF NOT EXISTS idx_cell_enodeb_id ON erilte01_cell > (PERIOD_START_TIME,REGION_ID)''' > > So I used now the created INDEX and the EXPLAIN tells me I used it: > > ['PLAN'] > ["SELECT\n P__Z0.PERIOD_START_TIME AS __C0_0,\n P__Z0.REGION_ID AS > __C0_1,\n P__Z0.MARKET_ID AS __C0_2,\n > SUM(P__Z0.EUTRANCELLFDD_PMUETHPTIMEDL) AS __C0_3,\n > SUM(P__Z0.EUTRANCELLFDD_PMPDCPVOLDLDRB) AS __C0_4,\n > SUM(P__Z0.EUTRANCELLFDD_PMPDCPVOLDLDRBLASTTTI) AS __C0_5\nFROM > PUBLIC.ERILTE01_CELL P__Z0 USE INDEX (IDX_CELL_ENODEB_ID)\n /* > PUBLIC.IDX_CELL_ENODEB_ID: PERIOD_START_TIME < TIMESTAMP '2020-08-04 > 00:00:00'\n AND REGION_ID = 'NORTHEAST'\n AND > PERIOD_START_TIME >= TIMESTAMP '2020-08-03 00:00:00'\n */\nWHERE > (UPPER(P__Z0.MEASUREMENTNAME) = 'EUTRANCELLFDD')\n AND ((P__Z0.DATALEVEL > = 'RAW')\n AND ((P__Z0.DATATYPE = 'RAW')\n AND > ((P__Z0.PERIOD_START_TIME < TIMESTAMP '2020-08-04 00:00:00')\n AND > ((P__Z0.REGION_ID = 'NORTHEAST')\n AND (P__Z0.PERIOD_START_TIME >= > TIMESTAMP '2020-08-03 00:00:00')))))\nGROUP BY P__Z0.PERIOD_START_TIME, > P__Z0.REGION_ID, P__Z0.MARKET_ID"] > ['SELECT\n EUTRANCELLFDD__Z1.PERIOD_START_TIME AS PERIOD_START_TIME,\n > > EUTRANCELLFDD__Z1.REGION,\n EUTRANCELLFDD__Z1.MARKET,\n > ROUND(DECODE((NVL(EUTRANCELLFDD__Z1.PMUETHPTIMEDL, 0) / 1000), 0, 0, > ((NVL(EUTRANCELLFDD__Z1.PMPDCPVOLDLDRB, 0) - > NVL(EUTRANCELLFDD__Z1.PMPDCPVOLDLDRBLASTTTI, 0)) / > (NVL(EUTRANCELLFDD__Z1.PMUETHPTIMEDL, 0) / 1000))), 6) AS DLTHROUGHPU\nFROM > (\n SELECT\n __C0_0 AS PERIOD_START_TIME,\n __C0_1 AS > REGION,\n __C0_2 AS MARKET,\n CAST(CAST(SUM(__C0_3) AS > DOUBLE) > AS DOUBLE) AS PMUETHPTIMEDL,\n CAST(CAST(SUM(__C0_4) AS DOUBLE) AS > DOUBLE) AS PMPDCPVOLDLDRB,\n CAST(CAST(SUM(__C0_5) AS DOUBLE) AS > DOUBLE) AS PMPDCPVOLDLDRBLASTTTI\n FROM PUBLIC.__T0\n GROUP BY > __C0_0, > __C0_1, __C0_2\n) EUTRANCELLFDD__Z1\n /* SELECT\n __C0_0 AS > PERIOD_START_TIME,\n __C0_1 AS REGION,\n __C0_2 AS > MARKET,\n > CAST(CAST(SUM(__C0_3) AS DOUBLE) AS DOUBLE) AS PMUETHPTIMEDL,\n > CAST(CAST(SUM(__C0_4) AS DOUBLE) AS DOUBLE) AS PMPDCPVOLDLDRB,\n > CAST(CAST(SUM(__C0_5) AS DOUBLE) AS DOUBLE) AS PMPDCPVOLDLDRBLASTTTI\n > FROM PUBLIC.__T0\n /++ PUBLIC."merge_scan" ++/\n GROUP BY __C0_0, > __C0_1, __C0_2\n */'] > > Here is the query on 2.5Million rows, on a table that has ~1500 columns: > ALL_QUERY = '''select > EUTRANCELLFDD.period_start_time period_start_time, > EUTRANCELLFDD.REGION, > EUTRANCELLFDD.MARKET, > round(DECODE((NVL(EUTRANCELLFDD.PMUETHPTIMEDL,0)/1000), 0, > > 0,(NVL(EUTRANCELLFDD.PMPDCPVOLDLDRB,0)-NVL(EUTRANCELLFDD.PMPDCPVOLDLDRBLASTTTI,0))/(NVL(EUTRANCELLFDD.PMUETHPTIMEDL,0)/1000)), > 6) DLThroughpu > from > > ( > select > period_start_time, > p.REGION_ID REGION, > p.MARKET_ID MARKET, > SUM(EUTRANCELLFDD_PMUETHPTIMEDL) PMUETHPTIMEDL, > SUM(EUTRANCELLFDD_PMPDCPVOLDLDRB) PMPDCPVOLDLDRB, > SUM(EUTRANCELLFDD_PMPDCPVOLDLDRBLASTTTI) PMPDCPVOLDLDRBLASTTTI > > from > erilte01_cell p > USE INDEX(idx_cell_enodeb_id) > where > p.REGION_ID in ( 'NORTHEAST' ) and > period_start_time >= '2020-08-03 00:00:00' and > period_start_time < '2020-08-04 00:00:00' and > p.datatype ='RAW' and > p.datalevel ='RAW' and > UPPER(p.measurementname) = UPPER('EUTRANCELLFDD') > group by > period_start_time, > p.REGION_ID, > p.MARKET_ID) EUTRANCELLFDD''' > > And it returns in about 7 seconds in a 18Host cluster with about 3TB > memory. > It seems way to slow for me. > > I am loading via SQL interface. Is there a performance difference reading > if > I use the cache ? > And nothing is persistent. > AL > > > > -- > Sent from: http://apache-ignite-users.70518.x6.nabble.com/ >