[ https://issues.apache.org/jira/browse/GEODE-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christian Tzolov updated GEODE-2588:
------------------------------------
Description:
For a partitioned region with 1 500 000 entries running on a single Geode
member, the OQL query *SELECT DISTINCT a, b FROM /region ORDER BY b* takes
about *13x* as long (*~1300%* of the time) as the OQL query
*SELECT a, b FROM /region* followed by a manual Java sort of the result for the
same dataset.
Setup: Geode 1.0.0, partitioned region with 1 500 000 objects, 4 GB memory
1. OQL with DISTINCT/ORDER BY
{code}SELECT DISTINCT e.key,e.day FROM /partitionRegion e ORDER BY e.day{code}
OQL execution time: 64899 ms = *~65 sec*
2. OQL with manual sort
{code}SELECT e.key,e.day FROM /partitionRegion e{code}
and then
{code}
// Assumes: import static java.util.stream.Collectors.toList;
// OQL fetch of all entries -> 3058 ms
SelectResults result = (SelectResults) query.execute(bindings);
// Client-side sort by the "day" field -> 1830 ms
List<?> result2 = (List<?>) result.asList().parallelStream().sorted((o1, o2) -> {
    Struct st1 = (Struct) o1;
    Struct st2 = (Struct) o2;
    return ((Date) st1.get("day")).compareTo((Date) st2.get("day"));
}).collect(toList());
{code}
OQL execution time: 3058 ms,
Client-side sort time: 1830 ms
Total time: 4888 ms = *~5 sec*
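Note that the two runs are not strictly equivalent: the first query applies DISTINCT, while the manual variant does not. A minimal, self-contained sketch of a client-side DISTINCT followed by a sort is given below. It uses a hypothetical `Row` class in place of Geode's `Struct` and a few synthetic rows, so it illustrates only the technique and says nothing about the timings measured above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Date;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class ClientSideSortSketch {

    // Hypothetical stand-in for one projected (key, day) row; not a Geode type.
    static final class Row {
        final String key;
        final Date day;

        Row(String key, Date day) {
            this.key = key;
            this.day = day;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Row)) {
                return false;
            }
            Row other = (Row) o;
            return key.equals(other.key) && day.equals(other.day);
        }

        @Override
        public int hashCode() {
            return Objects.hash(key, day);
        }
    }

    // Drops duplicate rows (by equals/hashCode), then sorts the rest by day.
    static List<Row> distinctSortedByDay(List<Row> rows) {
        return rows.parallelStream()
                .distinct()
                .sorted(Comparator.comparing((Row r) -> r.day))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("k2", new Date(2_000L)));
        rows.add(new Row("k1", new Date(1_000L)));
        rows.add(new Row("k2", new Date(2_000L))); // duplicate of the first row
        for (Row r : distinctSortedByDay(rows)) {
            System.out.println(r.key + " " + r.day.getTime());
        }
    }
}
```

Adding `distinct()` to the stream keeps the client-side path comparable to the DISTINCT query, at the cost of hashing every row once.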
The attached [^gemfire-oql-orderby-vs-on-client-sort-test-cases.zip] reproduces
the problem (see the comments below).
Also attached are the JMC flight recording [^flight_recording_OQL_ORDER_BY.jfr],
the logs, and the VSD stats.
The profiler suggests that most of the CPU time is spent in the
*OrderByComparator#evaluateSortCriteria* method:
!oql_with_order_by_hot_methods.png!
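One plausible reading of this profile (an assumption, not something the attached recording confirms) is that the sort criteria are re-evaluated on every comparison. A comparison sort performs on the order of n log n comparisons, so work done inside the comparator is paid far more often than once per element. The sketch below is plain Java with a hypothetical `expensiveKey` function; it does not reproduce Geode's `OrderByComparator`, and only counts how often the key is evaluated with and without precomputing it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

public class SortKeyEvaluationSketch {

    // Counts how many times the (hypothetical) key function is invoked.
    static final AtomicLong evaluations = new AtomicLong();

    // Hypothetical stand-in for an expensive sort-criteria evaluation.
    static long expensiveKey(long value) {
        evaluations.incrementAndGet();
        return value % 1_000_003L;
    }

    // Evaluates the key twice per comparison; returns the evaluation count.
    static long sortEvaluatingInComparator(List<Long> data) {
        evaluations.set(0);
        List<Long> copy = new ArrayList<>(data);
        copy.sort((a, b) -> Long.compare(expensiveKey(a), expensiveKey(b)));
        return evaluations.get();
    }

    // Evaluates the key once per element, then sorts on the cached key.
    static long sortWithPrecomputedKeys(List<Long> data) {
        evaluations.set(0);
        List<long[]> keyed = new ArrayList<>(data.size());
        for (long v : data) {
            keyed.add(new long[] {expensiveKey(v), v});
        }
        keyed.sort(Comparator.comparingLong((long[] pair) -> pair[0]));
        return evaluations.get();
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        List<Long> data = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            data.add(random.nextLong());
        }
        System.out.println("evaluations in comparator: " + sortEvaluatingInComparator(data));
        System.out.println("evaluations precomputed:   " + sortWithPrecomputedKeys(data));
    }
}
```

The precomputed variant always performs exactly one evaluation per element, while the in-comparator variant grows with the number of comparisons. Whether this matches what happens inside Geode would need to be checked against the flight recording.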
was:
For a partitioned region with 1 500 000 entries running on a single Geode
member, the OQL query *SELECT DISTINCT a, b FROM /region ORDER BY b* takes
about *13x* as long (*~1300%* of the time) as the OQL query
*SELECT a, b FROM /region* followed by a manual Java sort of the result for the
same dataset.
Setup: Geode 1.0.0, partitioned region with 1 500 000 objects, 4 GB memory
1. OQL with DISTINCT/ORDER BY
{code}SELECT DISTINCT e.key,e.day FROM /partitionRegion e ORDER BY e.day{code}
OQL execution time: 64899 ms = *~65 sec*
2. OQL with manual sort
{code}SELECT e.key,e.day FROM /partitionRegion e{code}
and then
{code}
// Assumes: import static java.util.stream.Collectors.toList;
// OQL fetch of all entries -> 3058 ms
SelectResults result = (SelectResults) query.execute(bindings);
// Client-side sort by the "day" field -> 1830 ms
List<?> result2 = (List<?>) result.asList().parallelStream().sorted((o1, o2) -> {
    Struct st1 = (Struct) o1;
    Struct st2 = (Struct) o2;
    return ((Date) st1.get("day")).compareTo((Date) st2.get("day"));
}).collect(toList());
{code}
OQL execution time: 3058 ms,
Client-side sort time: 1830 ms
Total time: 4888 ms = *~5 sec*
> OQL's ORDER BY takes 13x (1300%) more time compared to plain java sort for
> the same amount of data and same resources
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: GEODE-2588
> URL: https://issues.apache.org/jira/browse/GEODE-2588
> Project: Geode
> Issue Type: Bug
> Components: querying
> Reporter: Christian Tzolov
> Attachments: flight_recording_OQL_ORDER_BY.jfr,
> gemfire_OQL_ORDER_BY.log,
> gemfire-oql-orderby-vs-on-client-sort-test-cases.zip,
> myStats_OQL_ORDER_BY.gfs, oql_with_order_by_hot_methods.png
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)