Hi,

I have data that is collected and stored in HDFS throughout the day.
I have two jobs (let's call them job1 and job2).

Right now job1 is scheduled to run at 2 am each day, and it runs some analysis 
on the previous day's data. Once this job completes, it creates a Hive folder 
/user/hive/warehouse/job1/year/month/prev_day (named for the previous day, even 
though job1 runs today) and writes an output file and a _success flag into 
this directory.
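
To make the layout concrete, here is a minimal Python sketch of how the 
directory relates to the day job1 runs. The function names and the zero-padded 
year/month/day format are assumptions for illustration only; they are not part 
of job1 or of any scheduler API.

```python
from datetime import date, timedelta

# Base directory taken from the description above.
BASE_DIR = "/user/hive/warehouse/job1"


def job1_output_dir(run_date: date) -> str:
    """Return the directory job1 writes to when it runs on run_date.

    Assumption: year/month/day are zero-padded numbers (e.g. 2014/02/28).
    """
    prev_day = run_date - timedelta(days=1)  # job1 processes yesterday's data
    return f"{BASE_DIR}/{prev_day.year:04d}/{prev_day.month:02d}/{prev_day.day:02d}"


def job1_success_flag(run_date: date) -> str:
    """Return the path of the _success flag for the run on run_date."""
    return job1_output_dir(run_date) + "/_success"
```

For example, a run on 1 March would write into the directory for the last day 
of February, which is the rollover case the dataset URI has to handle.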

Now I have to schedule job2 to run at around 5 am, but it should run only if 
the _success flag has been created in 
/user/hive/warehouse/job1/year/month/prev_day by the job1 run that started 
around 2 am.

For this I need to create an input dataset that looks for the _success flag, 
so I guess I need to configure a dataset URI of the form 
/user/hive/warehouse/job1/year/month/prev_day

But I don't know how to configure the URI so that it points to the previous 
day's directory.

Any help would be appreciated!

Thanks,
Senthil

