Hi there,

In our system, we have multiple pig scripts that run against a particular HDFS 
directory.  The pig scripts can run at different times, and are scheduled to 
run regularly.  Is there a way to point a pig script at the same directory for 
multiple executions, but make sure that it only processes new files that it 
hasn't seen before?  I was thinking of using a custom PathFilter for my loader, 
but I thought I would ask to see if there is already a way to do this, rather 
than me reinventing the wheel (!).
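
To make the idea concrete, here is a rough sketch of the bookkeeping such a filter would need. It is only an illustration under my own assumptions: the class name SeenFileFilter, the ledger file, and the markProcessed step are all hypothetical, and it uses plain strings and java.nio rather than Hadoop classes so that it runs on its own. A real version would implement org.apache.hadoop.fs.PathFilter, whose accept method takes a Hadoop Path, and would probably keep the ledger on HDFS.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: accepts a file path only if no earlier run recorded it. */
public class SeenFileFilter {
    // Ledger file with one already-processed path per line (an assumption).
    private final Path ledger;
    private final Set<String> seen = new HashSet<>();

    public SeenFileFilter(Path ledger) throws IOException {
        this.ledger = ledger;
        // Load whatever earlier runs recorded, if the ledger exists yet.
        if (Files.exists(ledger)) {
            seen.addAll(Files.readAllLines(ledger, StandardCharsets.UTF_8));
        }
    }

    /** Same contract as PathFilter.accept: true means "load this file". */
    public boolean accept(String path) {
        return !seen.contains(path);
    }

    /** Record a path; intended to be called only after the job succeeds. */
    public void markProcessed(String path) throws IOException {
        if (seen.add(path)) {
            Files.write(ledger, Collections.singletonList(path),
                    StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }
}
```

Each script would need its own ledger, since the scripts run at different times and would each have seen a different set of files. Recording paths only after a successful run means a failed run picks the same files up again next time.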

Thanks,
John.
</pre>
