Re: Text

Jörn Franke Fri, 27 Jan 2017 05:59:26 -0800

Sorry, the message was not complete: the key is the file position (the byte 
offset at which each line starts), so if you sort by key, the lines will be in 
the same order as in the original file.
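
A minimal PySpark sketch of that idea, in case it helps. It assumes an existing 
SparkContext passed in as sc and a single input file (offsets are per file, so 
with several input files the key alone would not be unique). The helper names 
read_with_offsets, save_corrected and fix_line, and the "teh" -> "the" 
correction, are made up for illustration.

```python
# Class names of the old-API Hadoop text input format and its key/value types.
TEXT_INPUT_FORMAT = "org.apache.hadoop.mapred.TextInputFormat"
LONG_WRITABLE = "org.apache.hadoop.io.LongWritable"
TEXT = "org.apache.hadoop.io.Text"


def fix_line(line):
    """Hypothetical correction step: replace one misspelled word."""
    return line.replace("teh", "the")


def read_with_offsets(sc, path):
    """Return an RDD of (byte_offset, line) pairs instead of bare lines."""
    return sc.hadoopFile(path, TEXT_INPUT_FORMAT, LONG_WRITABLE, TEXT)


def save_corrected(sc, src, dst, fix):
    """Correct every line, restore the original order by offset, then save."""
    (read_with_offsets(sc, src)
        .mapValues(fix)      # change the text, keep the offset key
        .sortByKey()         # order by position in the original file
        .values()            # drop the offsets before writing
        .saveAsTextFile(dst))
```

Usage would be something like save_corrected(sc, filename, output, fix_line). 
The output is still written as several part files, so the original order is 
what you should get when reading them in file-name order.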


> On 27 Jan 2017, at 14:45, Jörn Franke <jornfra...@gmail.com> wrote:
> 
> I agree with the previous statements. You cannot expect any ordering 
> guarantee. This means you need to restore the ordering of the original file 
> yourself. Internally Spark uses the Hadoop client libraries, 
> even if you do not have Hadoop installed, because it is a flexible 
> transparent solution to access many file systems including the local one. In 
> the case you mentioned it is the TextInputFormat that returns a key and 
> the value. The key i
> This means you can sort by the key.
> However, to access this key you must use the hadoopFile method of Spark 
> together with the TextInputFormat.
> 
>> On 27 Jan 2017, at 10:44, Soheila S. <soheila...@gmail.com> wrote:
>> 
>> Hi All,
>> I read a text file using sparkContext.textFile(filename) and assign it to an 
>> RDD and process the RDD (replace some words) and finally write it to a text 
>> file using rdd.saveAsTextFile(output).
>> Is there any way to be sure the order of the sentences will not be changed? 
>> I need to have the same text with some corrected words.
>> 
>> thanks!
>> 
>> Soheila
