[jira] [Created] (SPARK-10356) MLlib: Normalization should use absolute values
Carsten Schnober created SPARK-10356:

Summary: MLlib: Normalization should use absolute values
Key: SPARK-10356
URL: https://issues.apache.org/jira/browse/SPARK-10356
Project: Spark
Issue Type: Bug
Components: MLlib
Affects Versions: 1.4.1
Reporter: Carsten Schnober

The normalizer does not handle vectors with negative values properly. This can be tested with the following code:

{{
val normalized = new Normalizer(1.0).transform(v: Vector)
normalized.toArray.sum == 1.0
}}

This yields true if all values in Vector v are positive, but false when v contains one or more negative values. This is because the values in v are used directly, without applying {{abs()}}. This (probably) does not occur for {{p=2.0}} because the values are squared and hence positive anyway.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-10356) MLlib: Normalization should use absolute values
[ https://issues.apache.org/jira/browse/SPARK-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carsten Schnober updated SPARK-10356:

Description:
The normalizer does not handle vectors with negative values properly. This can be tested with the following code:

{code}
val normalized = new Normalizer(1.0).transform(v: Vector)
normalized.toArray.sum == 1.0
{code}

This yields true if all values in Vector v are positive, but false when v contains one or more negative values. This is because the values in v are used directly, without applying {{abs()}}. This (probably) does not occur for {{p=2.0}} because the values are squared and hence positive anyway.

was:
The normalizer does not handle vectors with negative values properly. This can be tested with the following code:

{{val normalized = new Normalizer(1.0).transform(v: Vector)
normalized.toArray.sum == 1.0}}

This yields true if all values in Vector v are positive, but false when v contains one or more negative values. This is because the values in v are used directly, without applying {{abs()}}. This (probably) does not occur for {{p=2.0}} because the values are squared and hence positive anyway.

MLlib: Normalization should use absolute values
---
Key: SPARK-10356
URL: https://issues.apache.org/jira/browse/SPARK-10356
Project: Spark
Issue Type: Bug
Components: MLlib
Affects Versions: 1.4.1
Reporter: Carsten Schnober
Labels: easyfix
Original Estimate: 2h
Remaining Estimate: 2h
[jira] [Updated] (SPARK-10356) MLlib: Normalization should use absolute values
[ https://issues.apache.org/jira/browse/SPARK-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carsten Schnober updated SPARK-10356:

Description:
The normalizer does not handle vectors with negative values properly. This can be tested with the following code:

{{val normalized = new Normalizer(1.0).transform(v: Vector)
normalized.toArray.sum == 1.0}}

This yields true if all values in Vector v are positive, but false when v contains one or more negative values. This is because the values in v are used directly, without applying {{abs()}}. This (probably) does not occur for {{p=2.0}} because the values are squared and hence positive anyway.

was:
The normalizer does not handle vectors with negative values properly. This can be tested with the following code:

{{
val normalized = new Normalizer(1.0).transform(v: Vector)
normalized.toArray.sum == 1.0
}}

This yields true if all values in Vector v are positive, but false when v contains one or more negative values. This is because the values in v are used directly, without applying {{abs()}}. This (probably) does not occur for {{p=2.0}} because the values are squared and hence positive anyway.

MLlib: Normalization should use absolute values
---
Key: SPARK-10356
URL: https://issues.apache.org/jira/browse/SPARK-10356
Project: Spark
Issue Type: Bug
Components: MLlib
Affects Versions: 1.4.1
Reporter: Carsten Schnober
Labels: easyfix
Original Estimate: 2h
Remaining Estimate: 2h
[jira] [Comment Edited] (SPARK-10356) MLlib: Normalization should use absolute values
[ https://issues.apache.org/jira/browse/SPARK-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721502#comment-14721502 ]

Carsten Schnober edited comment on SPARK-10356 at 8/30/15 12:00 PM:

According to https://en.wikipedia.org/wiki/Norm_%28mathematics%29#p-norm, the absolute value of each component should be used to compute the norm:

{code}
||x||_p := (sum_i |x_i|^p)^(1/p)
{code}

For p = 1, this results in:

{code}
||x||_1 := sum_i |x_i|
{code}

I suppose the issue is thus actually located in the norm() method.

was (Author: carschno):
According to [[Wikipedia][https://en.wikipedia.org/wiki/Norm_%28mathematics%29#p-norm], the absolute value of each component should be used to compute the norm:

{||x||_p := (sum_i |x_i|^p)^(1/p)}

For p = 1, this results in:

{||x||_1 := sum_i |x_i|}

I suppose the issue is thus actually located in the {norm()} method.

MLlib: Normalization should use absolute values
---
Key: SPARK-10356
URL: https://issues.apache.org/jira/browse/SPARK-10356
Project: Spark
Issue Type: Bug
Components: MLlib
Affects Versions: 1.4.1
Reporter: Carsten Schnober
Labels: easyfix
Original Estimate: 2h
Remaining Estimate: 2h
[jira] [Commented] (SPARK-10356) MLlib: Normalization should use absolute values
[ https://issues.apache.org/jira/browse/SPARK-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721502#comment-14721502 ]

Carsten Schnober commented on SPARK-10356:

According to [[Wikipedia][https://en.wikipedia.org/wiki/Norm_%28mathematics%29#p-norm], the absolute value of each component should be used to compute the norm:

{||x||_p := (sum_i |x_i|^p)^(1/p)}

For p = 1, this results in:

{||x||_1 := sum_i |x_i|}

I suppose the issue is thus actually located in the {norm()} method.

MLlib: Normalization should use absolute values
---
Key: SPARK-10356
URL: https://issues.apache.org/jira/browse/SPARK-10356
Project: Spark
Issue Type: Bug
Components: MLlib
Affects Versions: 1.4.1
Reporter: Carsten Schnober
Labels: easyfix
Original Estimate: 2h
Remaining Estimate: 2h
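The p-norm definition quoted in the comment above can be sketched in plain Scala. This is a minimal, self-contained illustration and not Spark's implementation: the object PNormSketch and its methods pNorm and normalize are hypothetical names introduced here, and no MLlib classes are used. It shows what normalization with absolute values looks like for a vector that contains negative entries.

```scala
// Hypothetical helper, not part of Spark or MLlib.
object PNormSketch {

  /** p-norm with absolute values: (sum_i |x_i|^p)^(1/p), for p >= 1. */
  def pNorm(values: Array[Double], p: Double): Double = {
    require(p >= 1.0, "p must be >= 1")
    if (p == 1.0) values.map(x => math.abs(x)).sum
    else math.pow(values.map(x => math.pow(math.abs(x), p)).sum, 1.0 / p)
  }

  /** Divide every component by the p-norm; a zero vector is returned unchanged. */
  def normalize(values: Array[Double], p: Double): Array[Double] = {
    val norm = pNorm(values, p)
    if (norm == 0.0) values.clone() else values.map(_ / norm)
  }

  def main(args: Array[String]): Unit = {
    // Entries chosen so that all intermediate values are exact in binary floating point.
    val v = Array(1.0, -1.0, 2.0, -4.0)
    val normalized = normalize(v, 1.0) // 0.125, -0.125, 0.25, -0.5

    println(normalized.sum)                        // plain sum: -0.25
    println(normalized.map(x => math.abs(x)).sum)  // sum of absolute values: 1.0
  }
}
```

With absolute values in the norm, the entries of an L1-normalized vector satisfy sum(|x_i|) = 1. Their plain sum equals 1 only when no entry is negative, so a check of the form toArray.sum == 1.0 cannot by itself show whether abs() was applied. Comparing the sum of absolute values against 1.0, within a small tolerance, is a more direct test.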