[
https://issues.apache.org/jira/browse/TEZ-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521136#comment-14521136
]
Gopal V commented on TEZ-2390:
------------------------------
[~jeagles]: the patch looks good, except for the debug print statements.
{code}
containers = [Container(ev) for ev in self.events if ev.event == "CONTAINER_LAUNCHED"]
+ for container in containers:
+ print(container)
...
if(l.find("[HISTORY]") != -1):
m = self.MAIN_RE.match(l)
+ print(m);
{code}
The AM regexes are likely to be very fragile going forward as people switch
logging on and off (my sub-second LLAP demos need AM logging to be turned off).
The reason I was forced to parse AM logs was that ATS kept losing data.
Now that Tez AMs hang around until ATS acks all writes, I'm actually
contemplating throwing this whole thing away, since we've got TEZ-2076 (which
should work for 0.6.x as well, if backported).
It would be great if you could help review that, so that we can move swimlane
into a first-class analysis tool.
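For the debug prints and the regex fragility mentioned above, something along these lines might work as a stopgap. This is only a rough sketch: `HISTORY_RE`, `parse_history` and the logger name are hypothetical, and the pattern is an assumed shape for a history line, not the actual `MAIN_RE` from the swimlane script. It sends diagnostics through `logging` at debug level instead of `print`, and skips lines that do not match rather than failing on them.

```python
import logging
import re

log = logging.getLogger("swimlane")  # hypothetical logger name

# Assumed shape of a history line; the real MAIN_RE in the tool may differ.
HISTORY_RE = re.compile(
    r"^.*\[HISTORY\]\[DAG:(?P<dag>[^\]]+)\]\[Event:(?P<event>[^\]]+)\]: (?P<args>.*)$"
)


def parse_history(lines):
    """Return (dag, event, args) tuples for matching lines and skip the rest."""
    events = []
    for l in lines:
        if "[HISTORY]" not in l:
            continue
        m = HISTORY_RE.match(l)
        if m is None:
            # Visible only when debug logging is enabled, unlike a bare print.
            log.debug("unparsed history line: %r", l)
            continue
        events.append((m.group("dag"), m.group("event"), m.group("args")))
    return events
```

Calling `logging.basicConfig(level=logging.DEBUG)` before parsing would bring the diagnostics back when they are needed, without leaving print statements in the patch.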
> tez-tools swimlane tool fails to parse large jobs >8K containers
> ----------------------------------------------------------------
>
> Key: TEZ-2390
> URL: https://issues.apache.org/jira/browse/TEZ-2390
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: TEZ-2390.1.patch
>
>