Github user merrimanr commented on the issue:
https://github.com/apache/metron/pull/1081
Looks good! One thing I'm trying to wrap my head around is how we get
status if we only have a job ID or some other unique identifier for a job.
JobStatus doesn't have an ID, so I'm assuming resultPath is the unique
identifier here. As far as I can tell, an instance of
org.apache.hadoop.mapreduce.Job is kept in memory and is responsible for
reporting status. I can think of a couple of scenarios where this might be
problematic.
One is if I ran a query from the CLI but then wanted to get status from
REST. How would that work? That's probably not a likely use case, so it may
not be an issue. The other is a restart: what happens if I submit a query
through REST and REST is restarted while jobs are running? Do we lose job
status information?
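
To make the restart concern concrete, here is a minimal sketch. It is not
the code in this PR: InMemoryJobTracker, RestartDemo, and the job ID are
hypothetical names, and a plain String stands in for the real status object.
It only shows that status held in one process's heap is invisible to a fresh
instance, which is what a restarted REST service (or a separate CLI process)
would be.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; none of these types exist in Metron.
class InMemoryJobTracker {
    // Status lives only in this instance's heap, keyed by job ID.
    private final Map<String, String> statusByJobId = new ConcurrentHashMap<>();

    // Record a newly submitted job as running.
    void submit(String jobId) {
        statusByJobId.put(jobId, "RUNNING");
    }

    // Look up status by job ID; empty if this instance never saw the job.
    Optional<String> getStatus(String jobId) {
        return Optional.ofNullable(statusByJobId.get(jobId));
    }
}

class RestartDemo {
    // Returns true only if a fresh tracker can still see a job that was
    // submitted to an earlier tracker instance.
    static boolean statusSurvivesRestart() {
        InMemoryJobTracker beforeRestart = new InMemoryJobTracker();
        beforeRestart.submit("job_0001"); // made-up job ID

        // Simulate a restart: the old instance and its map are gone.
        InMemoryJobTracker afterRestart = new InMemoryJobTracker();
        return afterRestart.getStatus("job_0001").isPresent();
    }

    public static void main(String[] args) {
        System.out.println("status survives restart: " + statusSurvivesRestart());
    }
}
```

If something like this is what happens today, being able to look a job up
again from a persisted job ID would cover both scenarios.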
---