Ok, let me modify my requirement. I should have specified it in the beginning.

I need to get the count of records in an HDFS file created by a Pig script and then store the count in a text file. This should be done automatically on a daily basis, without manual intervention.

On Mon, May 13, 2013 at 11:13 AM, Rahul Bhattacharjee <[email protected]> wrote:
> How about the second approach: get the application/job ID which Pig
> creates and submits to the cluster, and then find the output counter for
> that job from the JT.
>
> Thanks,
> Rahul
>
>
> On Mon, May 13, 2013 at 11:37 PM, Mix Nin <[email protected]> wrote:
>> It is a text file.
>>
>> If we want to use wc, we need to copy the file from HDFS and then use
>> wc, and this may take time. Is there a way without copying the file from
>> HDFS to a local directory?
>>
>> Thanks
>>
>>
>> On Mon, May 13, 2013 at 11:04 AM, Rahul Bhattacharjee <[email protected]> wrote:
>>> A few pointers.
>>>
>>> What kind of files are we talking about? For text you can use wc; for
>>> Avro data files you can use avro-tools.
>>>
>>> Or get the job that Pig is generating, and get the counters for that
>>> job from the JT of your Hadoop cluster.
>>>
>>> Thanks,
>>> Rahul
>>>
>>>
>>> On Mon, May 13, 2013 at 11:21 PM, Mix Nin <[email protected]> wrote:
>>>> Hello,
>>>>
>>>> What is the best way to get the count of records in an HDFS file
>>>> generated by a Pig script?
>>>>
>>>> Thanks
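The "count without copying the file out of HDFS" question in the thread can be answered by streaming the file through `wc -l` instead of copying it first: `hadoop fs -cat` writes the file's bytes to stdout, so nothing lands on the local disk. A sketch, with made-up paths (the HDFS path and output filename are placeholders, not from the thread); the runnable lines below demonstrate the same pipeline on a local file so the snippet is self-contained:

```shell
# Against HDFS you would stream the Pig output through wc -l, e.g.:
#   hadoop fs -cat /data/pig_output/part-* | wc -l > /tmp/record_count.txt
# (hypothetical paths; assumes the hadoop CLI is on PATH and the output
# is a plain-text file, one record per line).
#
# Self-contained demo of the same pipeline shape on a local file:
printf 'rec1\nrec2\nrec3\n' > /tmp/part-00000      # stand-in for a part file
cat /tmp/part-00000 | wc -l | tr -d ' ' > /tmp/record_count.txt
cat /tmp/record_count.txt
```

The `tr -d ' '` strips the leading padding some `wc` implementations emit, so the text file contains just the number.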
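For the "daily, without manual intervention" part of the requirement, the thread does not settle on a mechanism; one common approach is a cron entry that runs the count pipeline on a schedule. A hypothetical crontab fragment (all paths and the 2 AM schedule are placeholders, not from the thread):

```shell
# Install with `crontab -e`. Runs every day at 02:00 and overwrites the
# count file; assumes the Pig job writing /data/pig_output has finished
# by then, and that the hadoop CLI is on cron's PATH.
# 0 2 * * * hadoop fs -cat /data/pig_output/part-* | wc -l > /var/log/pig_record_count.txt
```

If the counts should accumulate over time rather than be overwritten, `>>` with a date prefix (e.g. `date +\%F`) is a small variation on the same entry.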
