Re: AsterixDB Performance Tuning

Taewoo Kim Thu, 25 Jan 2018 22:37:17 -0800

Hi Rana,

The field order in the search predicate might be related to the factor.
Here are my two questions.


1) Can you paste the optimized plan?

2) Can you create an index on LABEVENTS.FLAG?

3) Can you switch the predicate order? (e.g., moving E.FLAG='abnormal' and
I.FLUID='Blood' and I.LABEL='Haptoglobin' to the front and place the join
conditions after that)




Best,
Taewoo

On Thu, Jan 25, 2018 at 8:46 PM, Rana Alotaibi <[email protected]>
wrote:

> Hi there,
>
> I have a query that takes ~12.7mins on average (I have excluded the
> warm-up time which was 30mins)!, and I would like to make sure that I
> didn't miss any performance tuning parameters ( I have run the same query
> on MongoDB, and it took ~2mins).
>
> The query asks to find all patients that have 'abnormal' haptoglobin blood
> test result. (The query result can have duplicate values).
>
> *Query:*
> USE mimiciii;
> SET `compiler.parallelism` "5";
> SET `compiler.sortmemory` "128MB";
> SET `compiler.joinmemory` "265MB";
> SELECT P.SUBJECT_ID
> FROM   LABITEMS I, PATIENTS P, P.ADMISSIONS A, A.LABEVENTS E
> WHERE I.ITEMID=E.ITEMID AND
>              E.FLAG = 'abnormal' AND
>              I.FLUID='Blood' AND
>              I.LABEL='Haptoglobin'
>
> *Datasets Schema:*
> - PATIENTS  and LABITEMS datasets have an open schema.
> - LABITEMS's  primary key is ITEMID
> - PATIENTS 's primary key is SUBJECT_ID
> - The JSON schema for both datasets is attached.
> - The DDL for both datasets is attached
>
> *Performance Tuning Parameters:*
> - 4 partitions (iodevices)
> - The total memory size is : 125GB, and I have assigned ~ 57GB to the
> buffercache (storage.buffercache.size).
> - As you can see from the query, I set the parallelism to 5,  sort-memory
> to 128MB, join-memory to 265MB.
> - The data size is 7GB
>
> Your feedback is highly appreciated!
>
> --Rana
>

Re: AsterixDB Performance Tuning

Reply via email to