[ https://issues.apache.org/jira/browse/FALCON-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883943#comment-13883943 ]
Srikanth Sundarrajan commented on FALCON-221:
---------------------------------------------
These are my observations so far (specific to hadoop-1).
When the JT retires a job because the user limit is exceeded, the JobInProgress
object is stripped of all its TaskCompletionEvents. At this point it is still
possible to retrieve a RunningJob handle for the job, but its
TaskCompletionEvents are empty. Subsequently the job is moved out of the
retired cache, at which point the RunningJob is no longer available either.
Falcon needs to account for workflows that run for a long duration, which can
lead to scenarios where the RunningJob / TaskCompletionEvents are
inaccessible.
Proposed fix: we can use the job's tracking URL, which stays active for a much
longer duration (the JT auto-redirects it with an HTTP 302 to the history
file), parse the contents of the history file, and move the task logs.
I am working on a patch along these lines. If there are any concerns please do
chime in.
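
To illustrate the redirect step of the proposed fix, here is a minimal sketch
(not the actual patch) of following the JT's HTTP 302 by hand to find where
the history file is served. It uses only java.net; the class name, method
names, and any URLs passed in are hypothetical, and parsing the history
contents is left out.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, for illustration only; not part of Falcon or Hadoop.
final class TrackingUrlResolver {

    // Resolves a Location header value (absolute or relative) against the
    // URL of the request that returned it.
    static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    // Follows HTTP 302 responses manually, starting at the tracking URL, and
    // returns the first URL that does not answer with a 302.
    static String followRedirects(String trackingUrl, int maxHops) throws IOException {
        String current = trackingUrl;
        for (int hop = 0; hop < maxHops; hop++) {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(current).toURL().openConnection();
            // Disable automatic redirects so each hop can be inspected.
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("GET");
            try {
                if (conn.getResponseCode() != HttpURLConnection.HTTP_MOVED_TEMP) {
                    return current;
                }
                String location = conn.getHeaderField("Location");
                if (location == null) {
                    throw new IOException("302 without Location header from " + current);
                }
                current = resolveLocation(current, location);
            } finally {
                conn.disconnect();
            }
        }
        throw new IOException("Too many redirects starting at " + trackingUrl);
    }
}
```

A caller would pass the job's tracking URL to followRedirects and then read
the history contents from the returned URL.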
> Logmover is not copying all action level logs
> ---------------------------------------------
>
> Key: FALCON-221
> URL: https://issues.apache.org/jira/browse/FALCON-221
> Project: Falcon
> Issue Type: Bug
> Components: archival
> Affects Versions: 0.3
> Reporter: Pracheer Agarwal
> Priority: Minor
>
> Log mover copies the action level logs and Oozie logs of a workflow to HDFS.
> My workflow has 6 actions. Logs of 2-3 actions are getting copied to HDFS.
> Logs for all the actions are not available at HDFS.