In my classification code, building the model with Map/Reduce is easy. But
classification itself has become difficult with big datasets. For a big
dataset like Wikipedia, it is hard to load the data into memory: although it
takes only 600MB on disk, memory use shoots past 2.5GB when I use a
HashMap<String, HashMap<String, Float>> to store the weights. I wish there
were a big matrix server out there, so that all I had to do to fetch a value
was call fetch(row, column).
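
To make the lookup concrete, here is a minimal sketch of the fetch(row, column) call, backed by the same nested map described above. The class name WeightMatrix, the put helper, and the choice to return 0.0f for a missing entry are assumptions made for illustration only. This is not a server and does nothing to reduce the memory footprint.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper; the name and the 0.0f default are illustrative assumptions.
class WeightMatrix {
    // Nested map as in the post: row key -> (column key -> weight).
    private final Map<String, Map<String, Float>> weights = new HashMap<>();

    // Store a weight, creating the inner map for the row on first use.
    void put(String row, String column, float weight) {
        weights.computeIfAbsent(row, r -> new HashMap<>()).put(column, weight);
    }

    // Look up a single cell; missing rows or columns yield 0.0f (assumed default).
    float fetch(String row, String column) {
        Map<String, Float> columns = weights.get(row);
        if (columns == null) {
            return 0.0f;
        }
        Float weight = columns.get(column);
        return weight == null ? 0.0f : weight;
    }
}
```

A caller would do something like m.put("label", "feature", 0.75f) and later m.fetch("label", "feature"). Ideally the backing store could be swapped for a remote one without changing that call.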


I am trying to put the data on HBase.

Please tell me if there are simpler solutions to do this using Hadoop or
any other package.

Robin
