Okay... Let's not worry about R, BigDecimal and precision for the time being. I might have been looking at the wrong values, so let's hold that thought.

Let's take a simple example of getting Y-Hat values using multiple regression, as given in this PDF:
http://www.utdallas.edu/~herve/abdi-prc-pretty.pdf

I created a small CSV called students.csv that contains the following data:

s1 14 4 1
s2 23 4 2
s3 30 7 2
s4 50 7 4
s5 39 10 3
s6 67 10 6

Column headers: Student id, Memory span (Y), age (X1), speech rate (X2)

Now the expected results are:

yHat[0]:15.166666666666668
yHat[1]:24.666666666666668
yHat[2]:27.666666666666664
yHat[3]:46.666666666666664
yHat[4]:40.166666666666664
yHat[5]:68.66666666666667

This is based on the following equation (given in the PDF):

Y = 1.67 + X1 + 9.50 X2

I wrote the following quick and dirty code to use OLSMultipleLinearRegression. The calculateHat() method returns a RealMatrix, but I can't see the above results in there. Am I using this class correctly? Please let me know. Thanks.

    private static void regression1() {
        double[][] X = new double[6][2];  // one row per student: age, speech rate
        double[] Y = new double[6];       // memory span
        try {
            File file = new File("C:\\students.csv");
            FileReader reader = new FileReader(file);
            BufferedReader in = new BufferedReader(reader);
            String line;
            int count = 0;
            while ((line = in.readLine()) != null) {
                // System.out.println(line);
                // Each line holds four space-separated columns: id, Y, X1, X2
                Scanner scanner = new Scanner(line);
                scanner.useDelimiter(" ");
                String[] cols = new String[4];
                int col = 0;
                while (scanner.hasNext()) {
                    cols[col++] = scanner.next();
                }
                Y[count] = Double.valueOf(cols[1]);
                X[count][0] = Double.valueOf(cols[2]);
                X[count][1] = Double.valueOf(cols[3]);
                count++;
            }
            in.close();
            reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        OLSMultipleLinearRegression regression = new OLSMultipleLinearRegression();
        regression.newSampleData(Y, X);
        RealMatrix matrix = regression.calculateHat();
        System.out.println("matrix:" + matrix.getColumnDimension());
    }

On Fri, Feb 12, 2010 at 12:08 PM, Ted Dunning <[email protected]> wrote:

> It is not a precision issue. R and commons-math use different algorithms
> with the same underlying numerical implementation.
>
> It is even an open question which result is better. R has lots of
> credibility, but I have found cases where it lacked precision (and I coded
> up a patch that was accepted).
>
> Unbounded precision integers and rationals are very useful, but not usually
> for large scale numerical programming. Except in a very few cases, if you
> need more than 17 digits of precision, you have other very serious problems
> that precision won't help.
>
> On Fri, Feb 12, 2010 at 1:40 AM, Andy Turner <[email protected]> wrote:
>
> > Interesting that this is a precision issue. I'm not surprised depending on
> > what you are doing, double precision may not be enough. It depends a lot on
> > how the calculations are broken into smaller parts. BigDecimal is
> > fantastically useful...
>
> --
> Ted Dunning, CTO
> DeepDyve
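
Coming back to the students example at the top of this message: as a cross-check that does not depend on commons-math at all, here is a minimal sketch that fits Y = b0 + b1*X1 + b2*X2 to the six rows by solving the normal equations (X'X) b = X'y and then evaluates the fitted values. The class name StudentsFit and its helper methods are made up for illustration. In textbook terms the hat matrix is H = X (X'X)^-1 X', a 6x6 matrix for this data, and the fitted values are H times Y, so the hat matrix alone would not be expected to contain the yHat numbers directly. The sketch adds the column of ones for the intercept itself; whether OLSMultipleLinearRegression does the same for the double[][] input has not been verified here.

```java
public class StudentsFit {
    // Data from the students example: Y = memory span, X1 = age, X2 = speech rate.
    static final double[] Y = {14, 23, 30, 50, 39, 67};
    static final double[][] X = {{4, 1}, {4, 2}, {7, 2}, {7, 4}, {10, 3}, {10, 6}};

    /** Solves the square system a*b = c by Gaussian elimination with partial pivoting. */
    static double[] solve(double[][] a, double[] c) {
        int n = c.length;
        for (int p = 0; p < n; p++) {
            // Pick the row with the largest pivot candidate and swap it into place.
            int max = p;
            for (int r = p + 1; r < n; r++) {
                if (Math.abs(a[r][p]) > Math.abs(a[max][p])) max = r;
            }
            double[] tmpRow = a[p]; a[p] = a[max]; a[max] = tmpRow;
            double tmp = c[p]; c[p] = c[max]; c[max] = tmp;
            // Eliminate column p from the rows below.
            for (int r = p + 1; r < n; r++) {
                double f = a[r][p] / a[p][p];
                c[r] -= f * c[p];
                for (int k = p; k < n; k++) a[r][k] -= f * a[p][k];
            }
        }
        // Back substitution.
        double[] b = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double s = c[r];
            for (int k = r + 1; k < n; k++) s -= a[r][k] * b[k];
            b[r] = s / a[r][r];
        }
        return b;
    }

    /** Returns {b0, b1, ...} for y = b0 + b1*x1 + ..., via the normal equations. */
    static double[] coefficients(double[][] x, double[] y) {
        int n = y.length;
        int p = x[0].length + 1;           // +1 for the intercept column of ones
        double[][] xtx = new double[p][p]; // accumulates X'X
        double[] xty = new double[p];      // accumulates X'y
        for (int i = 0; i < n; i++) {
            double[] row = new double[p];
            row[0] = 1.0;
            System.arraycopy(x[i], 0, row, 1, p - 1);
            for (int j = 0; j < p; j++) {
                xty[j] += row[j] * y[i];
                for (int k = 0; k < p; k++) xtx[j][k] += row[j] * row[k];
            }
        }
        return solve(xtx, xty);
    }

    /** Evaluates the fitted value b0 + b1*x1 + ... for every row of x. */
    static double[] fitted(double[][] x, double[] beta) {
        double[] yHat = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            yHat[i] = beta[0];
            for (int j = 0; j < x[i].length; j++) yHat[i] += beta[j + 1] * x[i][j];
        }
        return yHat;
    }

    public static void main(String[] args) {
        double[] beta = coefficients(X, Y);
        double[] yHat = fitted(X, beta);
        System.out.println("b0=" + beta[0] + " b1=" + beta[1] + " b2=" + beta[2]);
        for (int i = 0; i < yHat.length; i++) {
            System.out.println("yHat[" + i + "]:" + yHat[i]);
        }
    }
}
```

The last digits may differ slightly from the list above because of floating-point rounding, and normal equations are used only because the example is tiny and well conditioned; for real data a QR-based solver is the more usual choice.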
