The case is not clear :) Can you write an example like: I read A, see B, and then produce C and D?
>> ... which from the data stream categorize data according to....
These details don't help me understand what you want from Oozie, sorry.

2014-09-25 15:06 GMT+04:00 Jakub Stransky <[email protected]>:

> Hello experienced oozie users,
>
> I am new to apache oozie and I am facing the following task, which I don't
> know how to solve according to "best" practices, if it is even possible. It
> would be great to get several opinions on it. So the situation is following:
>
> I have a MR job which categorizes data from the data stream according to
> their date into several output directories. It is a finite number but it is
> huge and moving, e.g. last ten years -> 10 times 365 is the total number
> of data buckets. Then I have an archive which essentially has the same data
> folders according to date, which is kind of accumulating the data from
> various sources for a given date. The problem here is that during the MR
> run we don't know the dates we are processing beforehand and we need to
> "merge" those data to archive. We need first to check what dates the output
> has and then assemble paths for the "merge" MR job, which does merge, cleaning,
> removing possible duplicates etc. Having predefined 10*365 jobs looks
> horrible.
>
> I hope that the case is clear and I would be really grateful for any
> thoughts or suggestions.
>
> Thanks
> Jakub
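
To make the "I read A, see B and then produce C,D" question concrete, here is a minimal sketch of the "check what dates the output has and then assemble paths" step as described in the quoted message. It is only an illustration under assumptions that are not in the thread: the first MR job is assumed to write one sub-directory per date named YYYY-MM-DD, the archive is assumed to use the same names, and it walks a local directory tree with pathlib instead of HDFS. The function name plan_merges and all directory names are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Assumption: date buckets are directories named YYYY-MM-DD.
DATE_DIR = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def plan_merges(job_output: Path, archive: Path):
    """Return (date, new_data_dir, archive_dir) for every date bucket
    that actually exists in job_output, sorted by date."""
    plan = []
    for child in sorted(job_output.iterdir()):
        # Skip anything that is not a date bucket (e.g. _logs, _SUCCESS).
        if child.is_dir() and DATE_DIR.match(child.name):
            plan.append((child.name, child, archive / child.name))
    return plan


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "job-output"   # hypothetical output of the first MR job
        arch = Path(tmp) / "archive"     # hypothetical archive root
        for d in ("2014-09-24", "2014-09-25"):
            (out / d).mkdir(parents=True)
        (out / "_logs").mkdir()
        arch.mkdir()

        # Only the dates present in this run are planned for merging.
        for date, new_dir, archive_dir in plan_merges(out, arch):
            print(date, new_dir, "->", archive_dir)
```

In these terms: A is the output directory of the first job, B is the set of date buckets found in it, and C,D are the input/archive path pairs handed to the "merge" job. Only dates that really occur in a run are planned, so nothing like 10*365 predefined jobs is needed. How this step would be wired into an Oozie workflow is left open here.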
