With such a tiny training set, you will have to go through it several times. In fact, by setting the stepOffset to 1000, you have basically said that you are going to show the algorithm several thousand examples. You should also randomize the order.
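To make the "several passes, shuffled each time" advice concrete, here is a minimal plain-Java sketch. It deliberately does not use Mahout: the class name ShuffledPasses, the fixed learning rate of 0.1, the 200 passes, the seed, and the hand-rolled gradient step are all assumptions made for illustration, standing in for the calls to OnlineLogisticRegression.train() in the quoted code. The only point is the shape of the training loop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShuffledPasses {
    // Same eight points as in the question: even indices are cluster 0, odd are cluster 1.
    static final double[][] POINTS =
        {{0, 0}, {8, 8}, {0, 1}, {9, 9}, {1, 0}, {8, 9}, {1, 1}, {9, 8}};

    // Probability of cluster 1 under a binary logistic model with weights {bias, wx, wy}.
    static double probClusterOne(double[] w, double x, double y) {
        double z = w[0] + w[1] * x + w[2] * y;
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Plain SGD (an assumption, not Mahout's update rule): many passes over the
    // data, with the visiting order reshuffled before every pass.
    static double[] train(double[][] points, int passes, long seed) {
        double[] w = new double[3];
        double rate = 0.1; // hypothetical fixed learning rate

        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            order.add(i);
        }
        Random rnd = new Random(seed);

        for (int pass = 0; pass < passes; pass++) {
            Collections.shuffle(order, rnd); // randomize the order on each pass
            for (int idx : order) {
                int label = idx % 2; // label follows the index, as in the question
                double x = points[idx][0];
                double y = points[idx][1];
                double err = label - probClusterOne(w, x, y);
                w[0] += rate * err;
                w[1] += rate * err * x;
                w[2] += rate * err * y;
            }
        }
        return w;
    }

    public static void main(String[] args) {
        double[] w = train(POINTS, 200, 42L);
        System.out.println("P(cluster 1 | (0,1)) = " + probClusterOne(w, 0, 1));
        System.out.println("P(cluster 1 | (9,9)) = " + probClusterOne(w, 9, 9));
    }
}
```

In the Mahout version, the same outer loop would wrap the existing train(i%2, v) call, with the label taken from the shuffled index rather than from a running counter.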
You should also use some held out data for testing performance.

On Mon, Jun 25, 2012 at 10:51 PM, damodar shetyo <[email protected]> wrote:

> Hi,
>
> I am trying to build a simple model that can group points in 2D space.Am
> training the model by giving few examples.After that i am using the model
> to predict the group in which the any other points may fall.But am not
> getting answer as expected.Am i missing something in my code or am i doing
> something wrong?
>
> public static void main(String[] args) {
>
>     // points at (index%2)==0 belong to cluster 0 Eg (0,0) (0,1)
>     // points at index%2 != 0 belong to cluster 1
>
>     double [][] points =
>         {{0,0},{8,8},{0,1},{9,9},{1,0},{8,9},{1,1},{9,8}};
>
>
>     OnlineLogisticRegression learningAlgo = new
>         OnlineLogisticRegression();
>     learningAlgo = new OnlineLogisticRegression(2, 2, new L1());
>     learningAlgo.alpha(1).stepOffset(1000);
>
>     int i =0;
>     System.out.println("training model \n" );
>     for(double point [] : points ){
>         Vector v = new RandomAccessSparseVector(2);
>         v.set(0, point[0]);
>         v.set(1, point[1]);
>         learningAlgo.train(i%2, v);
>         i++;
>     }
>
>     learningAlgo.close();
>
>
>     //now classify real data
>     Vector v = new RandomAccessSparseVector(2);
>     v.set(0, 0);
>     v.set(1, 1);
>
>     Vector r = learningAlgo.classifyFull(v);
>     System.out.println(r);
>
>     System.out.println("ans = " );
>     System.out.println("Probability of cluster 0 = " + r.get(0));
>     System.out.println("Probability of cluster 1 = " + r.get(1));
>
> }
>
> op =
>
> {0:0.45938608354117305,1:0.540613916458827}
> ans =
> Probability of cluster 0 = 0.45938608354117305
> Probability of cluster 1 = 0.540613916458827
>
> 99 % of times the output show more probability for cluster 1.Why?
>
> --
> Regards,
> Damodar Shetyo
>
