[
https://issues.apache.org/jira/browse/TEZ-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168008#comment-14168008
]
Prakash Ramachandran commented on TEZ-1652:
-------------------------------------------
[~jeagles] I can look at optimizing the counters deserialization code. However, I
was wondering whether the swimlane view needs the counters at all. If not, we can
avoid deserializing them altogether, which would save deserializing around 40k
objects. I am thinking of something along the lines of a separate task attempt
model for swimlanes:
{code}
App.SwimlaneTaskAttempt = DS.Model.extend({
  // ... only properties required by swimlanes ...
});
// translate requested model type to the path fragment in url
var typeToPathMap = {
  // ...
  'swimlaneTaskAttempt': TEZ_TASK_ATTEMPT_ID
};
App.SwimlaneTaskAttemptSerializer = {
  // ... serialize only properties required by swimlanes, delete other properties.
};
store.findQuery('swimlaneTaskAttempt', queryParams);
{code}
Will this work for swimlanes?
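As a rough illustration of the "serialize only what swimlanes need" idea, the sketch below trims raw task attempt records down to a whitelist before any model objects are built. The field names in `SWIMLANE_FIELDS` and the helper names `pickSwimlaneFields` and `normalizeForSwimlane` are hypothetical, not the actual TEZ_TASK_ATTEMPT_ID schema or existing UI code. Note that this only avoids building objects for the dropped fields; the browser still downloads and parses the full JSON.

```javascript
// Fields kept for the swimlane view. These names are illustrative assumptions,
// not the real task attempt schema.
var SWIMLANE_FIELDS = ['id', 'startTime', 'endTime', 'status', 'containerId'];

// Return a copy of one raw task attempt holding only the swimlane fields,
// so large members such as counters are never turned into model objects.
function pickSwimlaneFields(rawAttempt) {
  var slim = {};
  SWIMLANE_FIELDS.forEach(function (field) {
    if (Object.prototype.hasOwnProperty.call(rawAttempt, field)) {
      slim[field] = rawAttempt[field];
    }
  });
  return slim;
}

// Trim a whole response payload (an array of raw task attempts).
function normalizeForSwimlane(rawAttempts) {
  return rawAttempts.map(pickSwimlaneFields);
}

// Made-up sample record, only to show the shape of the transformation.
var sample = [{
  id: 'attempt_1',
  startTime: 100,
  endTime: 250,
  status: 'SUCCEEDED',
  containerId: 'container_1',
  counters: { counterGroups: [{ counterGroupName: 'g', counters: [] }] }
}];
var slim = normalizeForSwimlane(sample);
console.log(JSON.stringify(slim));
```

In an Ember Data serializer the same trimming would presumably happen in the normalize step, before records are pushed into the store.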
Also, assuming each task attempt JSON is about 8 KB, for 10k attempts the browser
would have to download and parse somewhere around 80 MB. Most likely this will
keep the JavaScript in a tight loop, with the browser reporting that the page is
not responding. If the findQuery is moved from the model to the controller, we
can then fetch in batches.
{code}
// just a rough thought.
var batchSize = 1000; // each request asks for one more than this.
fetchNextBatch(batchSize + 1, null);

fetchNextBatch: function(size, lastId) {
  var queryParams = { limit: size };
  if (lastId) queryParams.fromId = lastId;
  var promise = store.findQuery(model, queryParams);
  promise.then(handleData);
  promise.catch(errorHandler);
}

handleData: function(taskAttempts) {
  // keep at most batchSize records; the extra one only marks where the next batch starts
  allTaskAttempts.push.apply(allTaskAttempts, taskAttempts.slice(0, batchSize));
  if (taskAttempts.get('length') > batchSize) { // got the extra record, so need to fetch more
    var nextId = taskAttempts.objectAt(batchSize).get('id');
    Ember.run.once(this, fetchNextBatch, batchSize + 1, nextId);
  } else {
    // done - start rendering.
  }
}
{code}
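The fetch-one-extra scheme can be checked in isolation with a small simulation. This is only a sketch: `makeFakeServer` is a hypothetical in-memory stand-in for the Timeline Server, it assumes `fromId` is inclusive, and the loop runs synchronously for the demo, whereas a real controller would continue from the promise callback of each request.

```javascript
// In-memory stand-in for the server (an assumption for this demo): returns up
// to `limit` records, starting at and including the record whose id is `fromId`.
function makeFakeServer(total) {
  var all = [];
  for (var i = 0; i < total; i++) {
    all.push({ id: 'attempt_' + i });
  }
  return function fetchPage(limit, fromId) {
    var start = 0;
    if (fromId) {
      for (var j = 0; j < all.length; j++) {
        if (all[j].id === fromId) { start = j; break; }
      }
    }
    return all.slice(start, start + limit);
  };
}

// Split one response into the records to keep and the id to resume from.
// nextFromId is null when no extra record came back, i.e. there is no more data.
function splitBatch(records, batchSize) {
  var hasMore = records.length > batchSize;
  return {
    kept: records.slice(0, batchSize),
    nextFromId: hasMore ? records[batchSize].id : null
  };
}

// Request batchSize + 1 records at a time until a response has no extra record.
function fetchAll(fetchPage, batchSize) {
  var collected = [];
  var fromId = null;
  var requests = 0;
  do {
    var result = splitBatch(fetchPage(batchSize + 1, fromId), batchSize);
    requests++;
    Array.prototype.push.apply(collected, result.kept);
    fromId = result.nextFromId;
  } while (fromId !== null);
  return { records: collected, requests: requests };
}

var outcome = fetchAll(makeFakeServer(2500), 1000);
console.log(outcome.records.length, outcome.requests);
```

Because the extra record is never stored and is requested again as the first record of the next batch, no attempt is duplicated or skipped, and a short final response ends the loop without an additional empty request.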
I am doing something similar in ShowTasksViewController.loadEntities to fetch
paginated data.
I will look at optimizing the normalize path, but please make any necessary
changes if you need to, as it will take me a couple of days to get to it.
> Large job support for Swimlane view
> -----------------------------------
>
> Key: TEZ-1652
> URL: https://issues.apache.org/jira/browse/TEZ-1652
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: TEZ-1652-v1.patch
>
>
> The issue is that, by default, Timeline Server returns at most 100 entity
> results per query. For the query for all task attempts of a DAG, the limit
> should be high enough to get all entries and yet provide responsiveness.