Wenhai created ASTERIXDB-1777:
---------------------------------
Summary: Budget does not consider the runfile frame that should be
temporarily cached in massive memory.
Key: ASTERIXDB-1777
URL: https://issues.apache.org/jira/browse/ASTERIXDB-1777
Project: Apache AsterixDB
Issue Type: Improvement
Environment: MAC/Linux
Reporter: Wenhai
Assignee: Wenhai
So far, we have identified two cases that should consider caching frames in
memory before we write (syncwrite) them to the Runfile:
1. Replicate: In the parallel sort case, if we have massive memory, we should
cache the frames in memory before forwarding them to the distributed range
partitions.
2. ExternalSort: The current Sorter caches frames up to the limit set by
compiler.sortmemory in asterix-configuration.xml. In other words, we sort that
many frames in one shot. In fact, we can run faster if we configure a
smaller sortmemory budget (in our memory-resident experiment, 64MB saves 20% of
the sort time compared to 320MB), but the frames sorted in each round are
still written to the Runfile, 1:1 with the total data size. We can treat
this case like the Replicate case above.
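
To make the idea in the two cases concrete, here is a minimal sketch in Python
(an illustration only, not Hyracks code). It assumes fixed-size frames and a
separate cache budget on top of the sort budget. The names CachingRunWriter,
frame_size and cache_budget_bytes are hypothetical and do not exist in
AsterixDB. Frames stay in memory while the budget allows, and only the overflow
is written to a temporary file that stands in for the Runfile.

```python
import tempfile


class CachingRunWriter:
    """Hypothetical run writer: keeps frames in memory up to a byte budget,
    then spills every later frame to a temporary file (the stand-in Runfile)."""

    def __init__(self, frame_size, cache_budget_bytes):
        self.frame_size = frame_size
        self.cache_budget_bytes = cache_budget_bytes
        self.cached_frames = []   # frames held in memory
        self.cached_bytes = 0
        self.spill_file = None    # created lazily, only on overflow
        self.spilled_bytes = 0

    def write_frame(self, frame):
        if len(frame) != self.frame_size:
            raise ValueError("this sketch assumes fixed-size frames")
        fits = self.cached_bytes + len(frame) <= self.cache_budget_bytes
        # Once spilling has started, keep spilling so frame order is preserved.
        if self.spill_file is None and fits:
            self.cached_frames.append(frame)
            self.cached_bytes += len(frame)
            return
        if self.spill_file is None:
            self.spill_file = tempfile.TemporaryFile()
        self.spill_file.write(frame)
        self.spilled_bytes += len(frame)

    def read_frames(self):
        """Yield all frames in write order: cached ones first, then spilled ones."""
        yield from self.cached_frames
        if self.spill_file is not None:
            self.spill_file.seek(0)
            while True:
                frame = self.spill_file.read(self.frame_size)
                if not frame:
                    break
                yield frame


# Usage: a budget of two 4-byte frames, three frames written.
writer = CachingRunWriter(frame_size=4, cache_budget_bytes=8)
for frame in (b"aaaa", b"bbbb", b"cccc"):
    writer.write_frame(frame)
print(writer.cached_bytes, writer.spilled_bytes)
```

With a large enough cache budget, nothing reaches the file at all, which is the
behavior both the Replicate and ExternalSort cases are asking for.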
We are still thinking about general cases like the above ...