The MinHash algorithm itself should work just as well with non-English text. The input phases, where the text is analyzed into tokens, are likely not to work correctly for such text, however.
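To make the distinction concrete, here is a rough sketch in plain Python. It is not Mahout's implementation, and every name in it (hash_token, minhash_signature, char_shingles, and so on) is made up for illustration. The signature code only ever sees a set of tokens, so it has no notion of language; what changes between languages is how the tokens are produced. The character-shingle tokenizer is just one assumed alternative for text without whitespace word boundaries.

```python
import hashlib


def hash_token(token: str, seed: int) -> int:
    """Seeded 64-bit hash of the token's UTF-8 bytes (illustrative choice of hash)."""
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(tokens, num_hashes=64):
    """For each seeded hash function, keep the minimum hash over the token set."""
    unique = set(tokens)
    if not unique:
        raise ValueError("cannot build a signature from an empty token set")
    return [min(hash_token(t, seed) for t in unique) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def whitespace_tokens(text):
    """Tokenizer that assumes words are separated by whitespace."""
    return text.split()


def char_shingles(text, n=2):
    """Overlapping character n-grams; an assumed alternative for unsegmented text."""
    compact = "".join(text.split())
    return [compact[i:i + n] for i in range(len(compact) - n + 1)]
```

With two similar Chinese sentences such as "我喜欢机器学习" and "我喜欢深度学习", whitespace_tokens returns each sentence as a single token, so the two signatures share nothing and the similarity estimate collapses. Feeding char_shingles output into the same minhash_signature function gives overlapping token sets and a usable estimate. The hashing step is unchanged in both cases.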
On Tue, Dec 4, 2012 at 6:05 PM, Varun Thacker <[email protected]> wrote:
> I'd tried out the MinHash algorithm in mahout using the Reuters data set.
>
> I have a doubt as to whether all these algorithms would work if the data is
> not in English.
>
> --
>
>
> Regards,
> Varun Thacker
> http://www.vthacker.in/
>
