Ok, let me modify my requirement. I should have specified it in the beginning.

I need to get the count of records in an HDFS file created by a Pig script and then store the count in a text file. This should be done automatically on a daily basis, without manual intervention.

On Mon, May 13, 2013 at 11:13 AM, Rahul Bhattacharjee <[email protected]> wrote:
> How about the second approach: get the application/job ID which Pig
> creates and submits to the cluster, and then find the output counter for
> that job from the JT.
>
> Thanks,
> Rahul
>
>
> On Mon, May 13, 2013 at 11:37 PM, Mix Nin <[email protected]> wrote:
>> It is a text file.
>>
>> If we want to use wc, we need to copy the file from HDFS and then use
>> wc, and this may take time. Is there a way without copying the file from
>> HDFS to a local directory?
>>
>> Thanks
>>
>>
>> On Mon, May 13, 2013 at 11:04 AM, Rahul Bhattacharjee <[email protected]> wrote:
>>> A few pointers.
>>>
>>> What kind of files are we talking about? For text you can use wc; for
>>> Avro data files you can use avro-tools.
>>>
>>> Or get the job that Pig is generating, and get the counters for that
>>> job from the JT of your Hadoop cluster.
>>>
>>> Thanks,
>>> Rahul
>>>
>>>
>>> On Mon, May 13, 2013 at 11:21 PM, Mix Nin <[email protected]> wrote:
>>>> Hello,
>>>>
>>>> What is the best way to get the count of records in an HDFS file
>>>> generated by a Pig script?
>>>>
>>>> Thanks
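The "count without copying the file out of HDFS" question in the thread can be answered by streaming the file through `wc -l` instead of copying it first: `hadoop fs -cat` writes the file's bytes to stdout, so nothing lands on the local disk. A sketch, with made-up paths (the HDFS path and output filename are placeholders, not from the thread); the runnable lines below demonstrate the same pipeline on a local file so the snippet is self-contained:

```shell
# Against HDFS you would stream the Pig output through wc -l, e.g.:
#   hadoop fs -cat /data/pig_output/part-* | wc -l > /tmp/record_count.txt
# (hypothetical paths; assumes the hadoop CLI is on PATH and the output
# is a plain-text file, one record per line).
#
# Self-contained demo of the same pipeline shape on a local file:
printf 'rec1\nrec2\nrec3\n' > /tmp/part-00000      # stand-in for a part file
cat /tmp/part-00000 | wc -l | tr -d ' ' > /tmp/record_count.txt
cat /tmp/record_count.txt
```

The `tr -d ' '` strips the leading padding some `wc` implementations emit, so the text file contains just the number.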
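For the "daily, without manual intervention" part of the requirement, the thread does not settle on a mechanism; one common approach is a cron entry that runs the count pipeline on a schedule. A hypothetical crontab fragment (all paths and the 2 AM schedule are placeholders, not from the thread):

```shell
# Install with `crontab -e`. Runs every day at 02:00 and overwrites the
# count file; assumes the Pig job writing /data/pig_output has finished
# by then, and that the hadoop CLI is on cron's PATH.
# 0 2 * * * hadoop fs -cat /data/pig_output/part-* | wc -l > /var/log/pig_record_count.txt
```

If the counts should accumulate over time rather than be overwritten, `>>` with a date prefix (e.g. `date +\%F`) is a small variation on the same entry.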
