I have one document per file, and each file is to be converted to a feature
vector, much like standard feature construction for document
classification.
Thanks
Rishi
Date: Sun, 5 Jul 2015 01:44:04 +1000
Subject: Re: Feature Generation On Spark
From: guha.a...@gmail.com
To: mici...@gmail.com
Hi
Thanks, I guess this will solve my problem. I will load multiple files using
wildcards like *.csv. I guess if I use wholeTextFiles instead of textFile, I
will get the whole file contents as the value, which will in turn ensure one
feature vector per file.
Thanks
Nitin
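A minimal sketch of that idea in PySpark, in case it helps. The
wholeTextFiles call is the real SparkContext method and returns
(file path, file content) pairs. Everything else here is an assumption made
for illustration: the featurize helper, the hashed bag-of-words scheme, the
NUM_FEATURES size, and the data/*.csv pattern are all hypothetical and not
from this thread.

```python
import re
import zlib

NUM_FEATURES = 16  # hypothetical vector size, chosen only for illustration


def featurize(text, num_features=NUM_FEATURES):
    """Turn one document into a hashed bag-of-words count vector."""
    vec = [0] * num_features
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # crc32 is deterministic across processes, unlike the built-in hash()
        vec[zlib.crc32(token.encode("utf-8")) % num_features] += 1
    return vec


def vectors_per_file(sc, pattern):
    """Return an RDD of (file path, feature vector), one record per file."""
    # wholeTextFiles keeps each file as a single (path, content) record,
    # whereas textFile would yield one record per line.
    return sc.wholeTextFiles(pattern).mapValues(featurize)


def run(pattern="data/*.csv"):
    """Hypothetical driver; the default pattern is a placeholder path."""
    from pyspark import SparkContext  # imported here so featurize works without Spark

    sc = SparkContext(appName="per-file-features")
    try:
        for path, vec in vectors_per_file(sc, pattern).collect():
            print(path, vec)
    finally:
        sc.stop()
```

Calling run() with a real glob should print one vector per matching file.
Note that wholeTextFiles reads each file fully into memory as one record, so
it is best suited to many small files.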
> Date: Sat, 4 Jul 2015 09:37:52 -