*- Do you see all cores being fully utilized during the query execution?*
I have noticed that only 6 cores were utilized.

*- How much time does the query take right now and how do you measure the
query execution time? Do you wait for the result to be printed somewhere
(e.g. in the browser)?*
I'm using the HTTP APIs. The response is a JSON object that includes the
query execution time:
{
    "status": "success",
    "metrics": {
        "elapsedTime": "434.627299814s",
        "executionTime": "434.626137977s",
        "resultCount": 4943,
        "resultSize": 132293,
        "processedObjects": 46875
    }
}
I ran the query 10 times and took the average, which is ~6 minutes.
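For anyone reproducing the measurement, here is a minimal sketch (my own, not from the thread) of scripting the timing loop against the HTTP query service. The endpoint path and port 19002 are the documented defaults and may differ per deployment; the JSON field names are taken from the response shown above.

```python
# Hedged sketch: timing a SQL++ query through the AsterixDB HTTP query
# service and averaging the reported elapsedTime over several runs.
# Assumes the default query service endpoint; adjust for your cluster.
import json
import urllib.parse
import urllib.request

def run_query(statement, url="http://localhost:19002/query/service"):
    """POST a SQL++ statement and return the parsed JSON response."""
    data = urllib.parse.urlencode({"statement": statement}).encode()
    with urllib.request.urlopen(url, data) as resp:
        return json.load(resp)

def parse_seconds(duration):
    """Convert a metrics duration string like '434.627299814s' to seconds."""
    return float(duration.rstrip("s"))

def average_elapsed(statement, runs=10):
    """Average the server-reported elapsedTime across several runs."""
    times = [parse_seconds(run_query(statement)["metrics"]["elapsedTime"])
             for _ in range(runs)]
    return sum(times) / len(times)
```

Using the server-side `elapsedTime` avoids including client-side JSON rendering (e.g. the browser) in the measurement.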

*- You mentioned that you have 4 partitions, how many physical hard drives
are they mapped to?*
One physical hard drive.
*- Also, increasing the sort/join memory doesn’t necessarily lead to better
performance. Have you tried changing these values to something smaller and
seeing the effects?*
Yes, I tried the following settings:
  1) sort-memory: 32MB, join-memory: 64MB
  2) sort-memory: 64MB, join-memory: 128MB
  3) sort-memory: 128MB, join-memory: 265MB

The execution time remains ~6-6.5 minutes on average; I didn't see any
improvement. The configuration that I have now:
- compiler.parallelism: 39 (only 6 cores were utilized)
- storage.buffercache.size: 20GB
- storage.buffercache.pagesize: 1MB
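For reference, these parameters would typically be set in the node configuration file read by the NCService. This is an illustrative sketch only; the section name and exact layout may vary by AsterixDB version, so consult the ncservice documentation for your release:

```ini
; Illustrative fragment of an ini-style AsterixDB configuration file.
; Section placement ([common] here) is an assumption; verify against
; the documentation for your version.
[common]
compiler.parallelism = 39
storage.buffercache.size = 20GB
storage.buffercache.pagesize = 1MB
```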

Thanks,
Rana
On Sun, Jan 28, 2018 at 6:41 PM, Murtadha Hubail <[email protected]>
wrote:

> I have few questions if you don’t mind:
>
> Do you see all cores being fully utilized during the query execution?
>
> How much time does the query take right now and how do you measure the
> query execution time? Do you wait for the result to be printed somewhere
> (e.g. in the browser)?
>
> You mentioned that you have 4 partitions, how many physical hard drives
> are they mapped to?
>
> Also, increasing the sort/join memory doesn’t necessarily lead to a better
> performance. Have you tried changing these values to something smaller and
> seeing the effects?
>
>
>
> Cheers,
>
> Murtadha
>
>
>
> *From: *Rana Alotaibi <[email protected]>
> *Date: *Monday, 29 January 2018 at 5:21 AM
> *To: *<[email protected]>
> *Cc: *<[email protected]>, <[email protected]>
> *Subject: *Re: Hyracks Job Requirement Configuration
>
>
>
> Thanks Murtadha! The problem is solved. However, increasing the number of
> cores didn't help improve the performance of that query.
>
> On Sun, Jan 28, 2018 at 5:05 PM, Murtadha Hubail <[email protected]>
> wrote:
>
> Hi Rana,
>
> The memory used for query processing is automatically calculated as
> follows:
> JVM Max Memory - storage.buffercache.size -
> storage.memorycomponent.globalbudget
>
> The documentation defaults for these parameters are outdated. The default
> value for storage.buffercache.size is (JVM Max Memory / 4) and it's the
> same for storage.memorycomponent.globalbudget. Since your dataset is
> already loaded, you could reduce the budget of 
> storage.memorycomponent.globalbudget.
> In addition, if I recall correctly, your dataset size is way smaller than
> what's allocated for the buffer cache, so you might want to reduce the
> buffer cache budget. That should give you more than enough memory to
> execute on 39 cores.
>
> Cheers,
> Murtadha
>
>
> On 01/29/2018, 3:30 AM, "Mike Carey" <[email protected]> wrote:
>
>     + dev
>
>
>     On 1/28/18 3:37 PM, Rana Alotaibi wrote:
>     > Hi all,
>     >
>     > I would like to make AsterixDB utilizes all available CPU cores (39)
>     > that I have for the following query:
>     >
>     > USE mimiciii;
>     > SET `compiler.parallelism` "39";
>     > SET `compiler.sortmemory` "128MB";
>     > SET `compiler.joinmemory` "265MB";
>     > SELECT P.SUBJECT_ID
>     > FROM   LABITEMS I, PATIENTS P, P.ADMISSIONS A, A.LABEVENTS E
>     > WHERE E.ITEMID/*+bcast*/=I.ITEMID AND
>     >              E.FLAG = 'abnormal' AND
>     >              I.FLUID='Blood' AND
>     >              I.LABEL='Haptoglobin'
>     >
>     >
>     > The total memory size that I have is 125GB(57GB for the AsterixDB
>     > buffer cache). By running the above query, I got the following error:
>     >
>     > "msg": "HYR0009: Job requirement (memory: 10705403904 bytes, CPU
>     > cores: 39) exceeds capacity (memory: 3258744832 bytes, CPU cores: 39)"
>     >
>     > How can I change this capacity default configuration? I'm looking into
>     > this page: https://asterixdb.apache.org/docs/0.9.2/ncservice.html .
>     > Could you please point me to the appropriate configuration parameter?
>     >
>     > Thanks
>     > -- Rana
