Rahul, I'm not sure what you mean by your question being "very
unprofessional". You're welcome to ask such questions here. You may or
may not receive an answer, though, and you shouldn't necessarily expect
a reply within five hours.

I've never tried anything like your case. I imagine the easiest thing
would be to read and process each file individually, since you intend to
produce a separate result for each. You could also look at
SparkContext.wholeTextFiles, which gives you an RDD of (filename, content)
pairs - that may be of some use if your files are small - but I don't know
of any corresponding save method which would generate files with different
names from within a single RDD.
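To make the "process each file individually" idea concrete, here is a
minimal sketch in plain Python, without Spark at all - a hand-rolled 1-D
k-means standing in for MLlib's KMeans, with made-up file names and sample
data. The point is only the shape of the loop: one input file in, one
clustered output file out. A real job would distribute this loop (e.g. by
parallelizing the list of file names) and use a proper clustering library.

```python
import random

def kmeans_1d(values, k, iters=20):
    """Tiny k-means on 1-D floats; returns a cluster id per value."""
    random.seed(0)  # fixed seed so the demo is reproducible
    centers = random.sample(values, k)
    assign = [0] * len(values)
    for _ in range(iters):
        # assign each value to its nearest center
        assign = [min(range(k), key=lambda c: abs(v - centers[c]))
                  for v in values]
        # recompute each center as the mean of its members
        for c in range(k):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assign

def cluster_file(in_path, out_path, k=2):
    """Read one feature-per-row file; write each row with its cluster id."""
    with open(in_path) as f:
        values = [float(line.strip()) for line in f if line.strip()]
    labels = kmeans_1d(values, k)
    with open(out_path, "w") as f:
        for v, label in zip(values, labels):
            f.write(f"{v}\t{label}\n")

# create two tiny sample inputs (stand-ins for the 100000 real files)
for name, data in [("a.txt", [1.0, 1.1, 9.0, 9.2]),
                   ("b.txt", [0.5, 5.5, 5.6])]:
    with open(name, "w") as f:
        f.write("\n".join(str(v) for v in data))

# process each input file individually: one output file per input
for name in ["a.txt", "b.txt"]:
    cluster_file(name, name + ".clustered")
```

Because each file is clustered independently, nothing here needs a nested
SparkContext - which Spark does not support from inside a task anyway.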


On Mon, Jul 14, 2014 at 2:30 PM, Rahul Bhojwani <rahulbhojwani2...@gmail.com
> wrote:

> I understand that the question is very unprofessional, but I am a newbie.
> If not here, could you share some link where I can ask such questions?
>
> But please answer.
>
>
> On Mon, Jul 14, 2014 at 6:52 PM, Rahul Bhojwani <
> rahulbhojwani2...@gmail.com> wrote:
>
>> Hey, my question is about this situation:
>> Suppose we have 100000 files, each containing a list of features in each
>> row.
>>
>> The task is, for each file, to cluster the features in that file and
>> write the corresponding cluster alongside them in a new file. So we have
>> to generate 100000 more files by applying clustering to each file
>> individually.
>>
>> Can I do it this way: get an RDD of the list of files and apply a map,
>> and inside the mapper function handling each file, get another
>> SparkContext and use MLlib KMeans to get the clustered output file?
>>
>> Please suggest the appropriate method to tackle this problem.
>>
>> Thanks,
>> Rahul Kumar Bhojwani
>> 3rd year, B.Tech
>> Computer Science Engineering
>> National Institute Of Technology, Karnataka
>> 9945197359
>>
>
>
>
> --
> Rahul K Bhojwani
> 3rd Year B.Tech
> Computer Science and Engineering
> National Institute of Technology, Karnataka
>



-- 
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: daniel.siegm...@velos.io W: www.velos.io
