[ 
https://issues.apache.org/jira/browse/TEZ-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168008#comment-14168008
 ] 

Prakash Ramachandran commented on TEZ-1652:
-------------------------------------------

[~jeagles] I can look at optimizing the counters deserialization code. However, 
I was wondering whether we need the counters for the swimlane view at all. If 
not, can we avoid deserializing them altogether? That would save deserializing 
40k objects. I am thinking of something along the lines of creating a separate 
task attempt model for swimlanes:
{code}
App.SwimlaneTaskAttempt = DS.Model.extend({
... only properties required by swimlanes..
});

//translate requested model type to the path fragment in url
var typeToPathMap = {
  ...
  'swimlaneTaskAttempt': TEZ_TASK_ATTEMPT_ID
};

App.SwimlaneTaskAttemptSerializer = {
  ... serialize only properties required by swimlanes, delete other properties.
};

store.findQuery('swimlaneTaskAttempt', queryParams);
{code} 
Will this work for swimlanes?
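
As a rough illustration of the trimming step the serializer would do, the 
sketch below keeps only a handful of fields from a raw task attempt and drops 
the rest, counters included. It is plain JavaScript with no Ember dependency. 
The function name trimForSwimlane, the field list and the sample payload are 
assumptions made for the example, not the properties the swimlane view 
actually needs.

```javascript
// Hypothetical list of the only fields a swimlane would need (assumption).
var SWIMLANE_FIELDS = ['startTime', 'endTime', 'status', 'containerId'];

// Returns a small copy of a raw task attempt object, keeping the id and the
// whitelisted fields from `otherinfo`; counters and everything else are dropped.
function trimForSwimlane(rawAttempt) {
  var info = rawAttempt.otherinfo || {};
  var trimmed = { id: rawAttempt.entity };
  SWIMLANE_FIELDS.forEach(function (field) {
    if (Object.prototype.hasOwnProperty.call(info, field)) {
      trimmed[field] = info[field];
    }
  });
  return trimmed;
}

// Usage with a made-up payload (shape assumed for illustration).
var sample = {
  entity: 'attempt_1_0001_1_00_000000_0',
  otherinfo: {
    startTime: 10,
    endTime: 25,
    status: 'SUCCEEDED',
    counters: { counterGroups: [] }
  }
};
var result = trimForSwimlane(sample);
```

The point of doing this before the records reach the store is that the 
counters are never turned into model objects in the first place.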

Also, assuming each task attempt JSON is about 8 KB, 10k attempts would mean 
downloading and parsing somewhere around 80 MB. Most likely this is going to 
keep the JavaScript in a tight loop, with the browser reporting that the page 
is not responding. If the findQuery is moved from the model to the controller, 
we can then fetch in batches.
{code}
// Just a rough thought.
var batchSize = 1000; // always fetch one more than this.
var allTaskAttempts = [];

fetchNextBatch(batchSize + 1, null);

function fetchNextBatch(size, lastId) {
   var queryParams = { limit: size };
   if (lastId) queryParams.fromId = lastId;
   var promise = store.findQuery(model, queryParams);
   promise.then(handleData);
   promise.catch(errorHandler);
}

function handleData(taskAttempts) {
   // We fetch one more than the batch size, so keep only the first batchSize.
   allTaskAttempts.push.apply(allTaskAttempts, taskAttempts.slice(0, batchSize));
   if (taskAttempts.length > batchSize) { // need to fetch more
       // The extra record is where the next batch starts.
       var nextId = taskAttempts.objectAt(batchSize).get('id');
       Ember.run.once(this, fetchNextBatch, batchSize + 1, nextId);
   } else {
      // done - start rendering.
   }
}
{code}
I am doing something similar in ShowTasksViewController.loadEntities to fetch 
paginated data. 
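
To make the stop condition concrete, here is a self-contained sketch of the 
same "fetch one more than the batch size" loop that runs without Ember or a 
Timeline Server. fakeFindQuery is a hypothetical stand-in for store.findQuery, 
and it assumes that fromId is inclusive, i.e. the record with that id is the 
first one returned. The record count and ids are made up.

```javascript
// Stand-in for the server: 2500 fake attempts with ids 'a0' .. 'a2499'.
var fakeAttempts = [];
for (var i = 0; i < 2500; i++) fakeAttempts.push({ id: 'a' + i });

// Hypothetical replacement for store.findQuery: resolves with up to `limit`
// records, starting at (and including) the record whose id is `fromId`.
function fakeFindQuery(params) {
  var start = 0;
  if (params.fromId) {
    start = fakeAttempts.findIndex(function (a) { return a.id === params.fromId; });
  }
  return Promise.resolve(fakeAttempts.slice(start, start + params.limit));
}

// Fetches everything in batches of `batchSize`, asking for one extra record
// each time; the extra record only tells us where the next batch starts.
function fetchAll(batchSize) {
  var all = [];
  function next(fromId) {
    var params = { limit: batchSize + 1 };
    if (fromId) params.fromId = fromId;
    return fakeFindQuery(params).then(function (batch) {
      Array.prototype.push.apply(all, batch.slice(0, batchSize));
      if (batch.length > batchSize) {
        return next(batch[batchSize].id); // more data remains
      }
      return all; // short batch: this was the last one
    });
  }
  return next(null);
}

var done = fetchAll(1000);
```

Because each batch is requested only after the previous promise resolves, the 
browser gets a chance to handle other events between batches instead of 
parsing one very large response in a single pass.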

I will look at optimizing the normalize path, but please go ahead and make any 
changes you need, as it will take me a couple of days to get to it.

> Large job support for Swimlane view
> -----------------------------------
>
>                 Key: TEZ-1652
>                 URL: https://issues.apache.org/jira/browse/TEZ-1652
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-1652-v1.patch
>
>
> The issue is that, by default, Timeline Server returns at most 100 entity 
> results per query. For a query for all task attempts of a DAG, the limit 
> should be high enough to get all entries and yet provide responsiveness.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
