j143 commented on a change in pull request #952:
URL: https://github.com/apache/systemml/pull/952#discussion_r436232635
##########
File path: dev/docs/builtins-reference.md
##########
@@ -318,7 +320,47 @@ slicefinder(X,W, y, k, paq, S);
### Usage
```r
X = rand (rows = 50, cols = 10)
-y = X %*% rand(rows=ncol(X), 1)
+y = X %*% rand(rows = ncol(X), cols = 1)
w = lm(X = X, y = y)
ress = slicefinder(X = X,W = w, Y = y, k = 5, paq = 1, S = 2);
```
+
+## `confusionMatrix`-Function
+
+A `confusionMatrix` is a technique for summarizing the performance of a
classification algorithm.
+Calculating a confusion matrix can give you a better idea of what your
classification model is getting right and what types of errors it is making.
+This confusionMatrix function accepts two matrices with one column each, these
two matrices are vector for prediction and one-hot-encoded matrix respectively.
+Then it computes the max value of each vector and compare them, after whichit
calculates and returns the sum of classifications and the average of each true
class.
+
+### Usage
+```r
+confusionMatrix(P,Y)
+```
+
+### Arguments
+
+| Name | Type | Default | Description |
+| :------ | :------------- | :--- | :---------- |
+| P | Matrix[Double] | --- |vector of prediction |
Review comment:
Can this be made consistent? That is, exactly one space after each `|` and one
space before the closing `|`. The table syntax used for the other functions is
a good reference.
##########
File path: dev/docs/builtins-reference.md
##########
@@ -318,7 +320,47 @@ slicefinder(X,W, y, k, paq, S);
### Usage
```r
X = rand (rows = 50, cols = 10)
-y = X %*% rand(rows=ncol(X), 1)
+y = X %*% rand(rows = ncol(X), cols = 1)
w = lm(X = X, y = y)
ress = slicefinder(X = X,W = w, Y = y, k = 5, paq = 1, S = 2);
```
+
+## `confusionMatrix`-Function
+
+A `confusionMatrix` is a technique for summarizing the performance of a
classification algorithm.
+Calculating a confusion matrix can give you a better idea of what your
classification model is getting right and what types of errors it is making.
Review comment:
`you` and `your` can be avoided where possible. Without these two words, the
sentence becomes:
`a confusion matrix can give a better idea of what the classification model
is getting right.`
##########
File path: dev/docs/builtins-reference.md
##########
@@ -318,7 +320,47 @@ slicefinder(X,W, y, k, paq, S);
### Usage
```r
X = rand (rows = 50, cols = 10)
-y = X %*% rand(rows=ncol(X), 1)
+y = X %*% rand(rows = ncol(X), cols = 1)
w = lm(X = X, y = y)
ress = slicefinder(X = X,W = w, Y = y, k = 5, paq = 1, S = 2);
```
+
+## `confusionMatrix`-Function
+
+A `confusionMatrix` is a technique for summarizing the performance of a
classification algorithm.
+Calculating a confusion matrix can give you a better idea of what your
classification model is getting right and what types of errors it is making.
+This confusionMatrix function accepts two matrices with one column each, these
two matrices are vector for prediction and one-hot-encoded matrix respectively.
+Then it computes the max value of each vector and compare them, after whichit
calculates and returns the sum of classifications and the average of each true
class.
+
+### Usage
+```r
+confusionMatrix(P,Y)
+```
+
+### Arguments
+
+| Name | Type | Default | Description |
+| :------ | :------------- | :--- | :---------- |
+| P | Matrix[Double] | --- |vector of prediction |
+| Y | Matrix[Double] | --- | vector of Golden standard One
Hot Encoded|
+
+### Returns
+
+|Name | Type | Description |
+|:-----------------| :------------- | :---------- |
+|ConfusionSum | Matrix[Double] | The Confusion Matrix Sums of
classifications |
+|ConfusionAvg | Matrix[Double] | The Confusion Matrix averages of each
true class|
+
+### Example
+ #here numClasses is assigned to 1 as numClasses is directly proportional to
the
+ number of columns in the one hot data matrix, as confusion matrix accepts
only matrices with one column.
+
+```r
+numClasses = 1
Review comment:
Can this be added in the example snippet itself?
```r
# here numClasses is assigned to 1 as numClasses is directly proportional to
# the number of columns in the one hot data matrix, as confusion matrix
# accepts only matrices with one column.
```
##########
File path: dev/docs/builtins-reference.md
##########
@@ -318,7 +320,47 @@ slicefinder(X,W, y, k, paq, S);
### Usage
```r
X = rand (rows = 50, cols = 10)
-y = X %*% rand(rows=ncol(X), 1)
+y = X %*% rand(rows = ncol(X), cols = 1)
w = lm(X = X, y = y)
ress = slicefinder(X = X,W = w, Y = y, k = 5, paq = 1, S = 2);
```
+
+## `confusionMatrix`-Function
+
+A `confusionMatrix` is a technique for summarizing the performance of a
classification algorithm.
+Calculating a confusion matrix can give you a better idea of what your
classification model is getting right and what types of errors it is making.
+This confusionMatrix function accepts two matrices with one column each, these
two matrices are vector for prediction and one-hot-encoded matrix respectively.
+Then it computes the max value of each vector and compare them, after whichit
calculates and returns the sum of classifications and the average of each true
class.
+
+### Usage
+```r
+confusionMatrix(P,Y)
+```
+
+### Arguments
+
+| Name | Type | Default | Description |
+| :------ | :------------- | :--- | :---------- |
+| P | Matrix[Double] | --- |vector of prediction |
+| Y | Matrix[Double] | --- | vector of Golden standard One
Hot Encoded|
+
+### Returns
+
+|Name | Type | Description |
+|:-----------------| :------------- | :---------- |
+|ConfusionSum | Matrix[Double] | The Confusion Matrix Sums of
classifications |
+|ConfusionAvg | Matrix[Double] | The Confusion Matrix averages of each
true class|
+
+### Example
+ #here numClasses is assigned to 1 as numClasses is directly proportional to
the
+ number of columns in the one hot data matrix, as confusion matrix accepts
only matrices with one column.
Review comment:
These lines can be removed.
##########
File path: scripts/builtin/outlier.dml
##########
@@ -18,6 +18,13 @@
# under the License.
#
#-------------------------------------------------------------
+#An outlier in a probability distribution function is a number that is more
+#than 1.5 times the length of the data set away from either the lower or upper
quartiles.
+#Specifically, if a number is less than Q1−1.5×IQR or greater than Q3+1.5×IQR,
then it is an outlier.
+#
+
+
+
Review comment:
Shall we delete thi
##########
File path: dev/docs/builtins-reference.md
##########
@@ -318,7 +320,47 @@ slicefinder(X,W, y, k, paq, S);
### Usage
```r
X = rand (rows = 50, cols = 10)
-y = X %*% rand(rows=ncol(X), 1)
+y = X %*% rand(rows = ncol(X), cols = 1)
w = lm(X = X, y = y)
ress = slicefinder(X = X,W = w, Y = y, k = 5, paq = 1, S = 2);
```
+
+## `confusionMatrix`-Function
+
+A `confusionMatrix` is a technique for summarizing the performance of a
classification algorithm.
+Calculating a confusion matrix can give you a better idea of what your
classification model is getting right and what types of errors it is making.
+This confusionMatrix function accepts two matrices with one column each, these
two matrices are vector for prediction and one-hot-encoded matrix respectively.
+Then it computes the max value of each vector and compare them, after whichit
calculates and returns the sum of classifications and the average of each true
class.
Review comment:
These lines are long; can they be shortened to be more readable?
For example, line `331` can end at `what types`, with `of errors
it is making.` moving to line `332`.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]