This is not a task for Tika; you should use Lucene to do this.

On Thu, Dec 22, 2011 at 3:32 AM, Periya.Data <[email protected]> wrote:
> Hi all,
> What is the suggested way to remove stop words from a given text
> document using Tika? I am able to extract the contents of a PDF document and
> have it in a nice text format. Now I am looking for ways to remove stop
> words and stem words.
>
> Suggestions are very much appreciated.
>
> Thanks,
> PD.
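
To make the stop-word step concrete, here is a minimal plain-Java sketch of what such a filter does to text that Tika has already extracted. It is not Lucene's implementation: the class name StopWordSketch, the method removeStopWords, and the tiny stop list are all hypothetical and chosen only for illustration. In Lucene itself this job is normally done inside an analyzer chain (for example with StopFilter, and PorterStemFilter for stemming); stemming is left out of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordSketch {
    // Hypothetical, deliberately tiny stop list; real lists are much longer.
    private static final Set<String> STOP_WORDS =
            Set.of("a", "an", "and", "in", "is", "of", "on", "the", "to");

    // Lowercases the text, splits it on runs of non-letter characters,
    // and keeps only the tokens that are not in the stop list.
    public static List<String> removeStopWords(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Prints [cat, mat]
        System.out.println(removeStopWords("The cat is on the mat"));
    }
}
```

The text passed to removeStopWords would be the string extracted by Tika; for real use, a Lucene analyzer is likely the better choice because it also handles tokenization rules and stemming.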
-- 
With best wishes, Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)
Skype: alex.ott
