Thanks for your response Azuryy.

My Hadoop version: 2.0.0-cdh4.3.0
InputFormat: a custom class that extends FileInputFormat (a CSV input format).
The files are all in the same directory, as separate files.
My input path is configured through Oozie, using the property 
mapred.input.dir.


The same code and input run fine on Hadoop 2.0.0-cdh4.2.1; no records are 
discarded there.

Thanks.

From: Azuryy Yu <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Thursday, 21 November 2013 07:31
To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Subject: Re: Missing records from HDFS

What's your Hadoop version, and which InputFormat are you using?

Are these files under one directory, or are there lots of subdirectories? How 
did you configure the input path in your main?



On Thu, Nov 21, 2013 at 12:25 AM, ZORAIDA HIDALGO SANCHEZ 
<[email protected]<mailto:[email protected]>> wrote:
Hi all,

My job is not reading all the input records. In the input directory I have a 
set of files containing a total of 6000000 records, but only 5997000 are 
processed. The Map Input Records counter says 5997000.
I have tried downloading the files with getmerge to check how many records 
it would return, and the correct number is returned (6000000).
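
A minimal sketch of that kind of local check, for comparing against the Map 
Input Records counter. It assumes the input files have already been copied to 
a local directory and that each record is exactly one line; neither assumption 
comes from the job itself, and the class name RecordCount is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class RecordCount {
    // Counts lines in every regular file directly under dir (no recursion).
    // Assumption: one record per line. A last line without a trailing
    // newline still counts as a record.
    static long countRecords(Path dir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and other non-files
                }
                // ISO-8859-1 maps every byte to a character, so counting
                // lines never fails on unexpected encodings.
                try (BufferedReader reader =
                         Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
                    while (reader.readLine() != null) {
                        total++;
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java RecordCount.java <local-directory>
        System.out.println(countRecords(Paths.get(args[0])));
    }
}
```

If this count matches the expected total while the counter does not, the 
difference is introduced when the job reads the files, not in the files 
themselves.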

Do you have any suggestions?

Thanks.

________________________________

Este mensaje se dirige exclusivamente a su destinatario. Puede consultar 
nuestra política de envío y recepción de correo electrónico en el enlace 
situado más abajo.
This message is intended exclusively for its addressee. We only send and 
receive email on the basis of the terms set out at:
http://www.tid.es/ES/PAGINAS/disclaimer.aspx


