Yes, we are now trying 1.x, and having this option would be great. I have never
filed a JIRA at the ASF, but I will do it.
Thanks,
Marc

-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Monday, April 09, 2012 3:46 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: mapreduce line separator question

Marc,

The answer depends on the Hadoop version you are running. The following requires
https://issues.apache.org/jira/browse/MAPREDUCE-2254, which is currently present
in 0.23 (and eventually 2.x) and also (last I checked) in CDH3, if you use that:

Simply set "textinputformat.record.delimiter" in your Job's configuration to
the exact character string you need, and it will be used as the record/line
delimiter in TextInputFormat. The string can also be multi-character, and
records will then be read based on that sequence.
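
To illustrate what a custom record delimiter means for your data, here is a
minimal plain-Java sketch. It is not Hadoop's implementation; the class and
method names (DelimiterDemo, splitRecords) are made up for this example. It
only shows that splitting on "\n" alone leaves the "\r" inside each record,
and that a multi-character delimiter works the same way. The Hadoop setting
itself appears only as a comment, since it needs a release that includes
MAPREDUCE-2254.

```java
import java.util.ArrayList;
import java.util.List;

public class DelimiterDemo {

    // Hypothetical helper: splits text into records on the exact delimiter
    // string. The delimiter is dropped; every other character, including
    // '\r', stays in the record.
    static List<String> splitRecords(String text, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> records = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = text.indexOf(delimiter, start)) >= 0) {
            records.add(text.substring(start, idx));
            start = idx + delimiter.length();
        }
        if (start < text.length()) {
            // Trailing record that is not terminated by a delimiter.
            records.add(text.substring(start));
        }
        return records;
    }

    public static void main(String[] args) {
        // On a Hadoop release with MAPREDUCE-2254, the equivalent setting on
        // the job's org.apache.hadoop.conf.Configuration would be:
        //   conf.set("textinputformat.record.delimiter", "\n");

        // Splitting on "\n" only: the "\r" remains part of each record.
        for (String record : splitRecords("a\r\nb\r\nc", "\n")) {
            System.out.println(record.replace("\r", "\\r"));
        }

        // A multi-character delimiter is treated as one exact sequence.
        System.out.println(splitRecords("x||y||z", "||"));
    }
}
```

With the input above, the first two records come back as "a\r" and "b\r",
which is the behaviour you asked about: the carriage return is kept as data.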

It's presently unavailable in 1.x, but it appears harmless to add, and if
you can file a JIRA with a backport I can review and commit it for a future
1.x update.

On Tue, Apr 10, 2012 at 12:31 AM, Marc Sturm <mas9...@nyp.org> wrote:
> Hi,
>
> I am new to Mapreduce and I have a short question: is it possible for
> a MapReduce job to split the lines of a file with \n and ignore \r?
> Basically, in the use case I am looking into, the \r has to be
> included when reading a line.
>
> I am just "playing" with mapreduce with a standalone hadoop, not using
> hdfs, and I am looking into writing my own LineReader but I am afraid
> it is much more complicated than this. I can also update each line and
> replace the \r with a \t, but I would rather leave the file and data as is.
>
> Any insight and/or link to the correct documentation will be appreciated.
>
> Thanks,
>
> Marc
>
>
>
>
> ________________________________
> This electronic message is intended to be for the use only of the
> named recipient, and may contain information that is confidential or 
> privileged.
> If you are not the intended recipient, you are hereby notified that
> any disclosure, copying, distribution or use of the contents of this
> message is strictly prohibited. If you have received this message in
> error or are not the named recipient, please notify us immediately by
> contacting the sender at the electronic mail address noted above, and
> delete and destroy all copies of this message. Thank you.
>



--
Harsh J
