[jira] [Commented] (SPARK-23218) simplify ColumnVector.getArray

2018-01-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339297#comment-16339297 ] Apache Spark commented on SPARK-23218: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23218) simplify ColumnVector.getArray

2018-01-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23218: Assignee: Apache Spark (was: Wenchen Fan) > simplify ColumnVector.getArray >

[jira] [Assigned] (SPARK-23218) simplify ColumnVector.getArray

2018-01-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23218: Assignee: Wenchen Fan (was: Apache Spark) > simplify ColumnVector.getArray >

[jira] [Created] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23217: --- Summary: Add cosine distance measure to ClusteringEvaluator Key: SPARK-23217 URL: https://issues.apache.org/jira/browse/SPARK-23217 Project: Spark Issue Type:

[jira] [Created] (SPARK-23218) simplify ColumnVector.getArray

2018-01-25 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-23218: --- Summary: simplify ColumnVector.getArray Key: SPARK-23218 URL: https://issues.apache.org/jira/browse/SPARK-23218 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23217: Attachment: (was: SPARK-23217.pages) > Add cosine distance measure to ClusteringEvaluator >

[jira] [Updated] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23217: Attachment: SPARK-23217.pdf > Add cosine distance measure to ClusteringEvaluator >

[jira] [Updated] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23217: Attachment: SPARK-23217.pages > Add cosine distance measure to ClusteringEvaluator >

[jira] [Created] (SPARK-23216) Multiclass LogisticRegression could have methods like NCE, NEG, Hierarchical SoftMax, Blackout or IS

2018-01-25 Thread Michel Lemay (JIRA)
Michel Lemay created SPARK-23216: Summary: Multiclass LogisticRegression could have methods like NCE, NEG, Hierarchical SoftMax, Blackout or IS Key: SPARK-23216 URL:

[jira] [Assigned] (SPARK-23112) ML, Graph 2.3 QA: Programming guide update and migration guide

2018-01-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-23112: -- Assignee: Nick Pentreath > ML, Graph 2.3 QA: Programming guide update and migration

[jira] [Resolved] (SPARK-23112) ML, Graph 2.3 QA: Programming guide update and migration guide

2018-01-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-23112. Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20363

[jira] [Commented] (SPARK-8682) Range Join for Spark SQL

2018-01-25 Thread Sujith Jay Nair (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339169#comment-16339169 ] Sujith Jay Nair commented on SPARK-8682: The ticket description refers to the implementation of

[jira] [Commented] (SPARK-23213) SparkR:::textFile(sc1,"/opt/test333") can not work on spark2.2.1

2018-01-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339166#comment-16339166 ] Hyukjin Kwon commented on SPARK-23213: -- I think we should rather leave this won't fix .. I don't

[jira] [Commented] (SPARK-23213) SparkR:::textFile(sc1,"/opt/test333") can not work on spark2.2.1

2018-01-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339160#comment-16339160 ] Hyukjin Kwon commented on SPARK-23213: -- target version is usually reserved for committers. I just

[jira] [Updated] (SPARK-23213) SparkR:::textFile(sc1,"/opt/test333") can not work on spark2.2.1

2018-01-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-23213: - Target Version/s: (was: 2.2.1) > SparkR:::textFile(sc1,"/opt/test333") can not work on

[jira] [Commented] (SPARK-23201) Cannot create view when duplicate columns exist in subquery

2018-01-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339159#comment-16339159 ] Hyukjin Kwon commented on SPARK-23201: -- We usually resolve it as {{Cannot Reproduce}} if it can't be

[jira] [Commented] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-01-25 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-23173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339144#comment-16339144 ] Michał Świtakowski commented on SPARK-23173: I'm going to work on this. > from_json can

[jira] [Assigned] (SPARK-21717) Decouple the generated codes of consuming rows in operators under whole-stage codegen

2018-01-25 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21717: --- Assignee: Liang-Chi Hsieh > Decouple the generated codes of consuming rows in operators

[jira] [Resolved] (SPARK-21717) Decouple the generated codes of consuming rows in operators under whole-stage codegen

2018-01-25 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21717. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18931

[jira] [Assigned] (SPARK-23214) cached data should not carry extra hint info

2018-01-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23214: Assignee: Apache Spark (was: Wenchen Fan) > cached data should not carry extra hint info

[jira] [Commented] (SPARK-23214) cached data should not carry extra hint info

2018-01-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339024#comment-16339024 ] Apache Spark commented on SPARK-23214: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23214) cached data should not carry extra hint info

2018-01-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23214: Assignee: Wenchen Fan (was: Apache Spark) > cached data should not carry extra hint info

[jira] [Updated] (SPARK-23215) Dataset Grouping: Index out of bounds error

2018-01-25 Thread Nikhil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikhil updated SPARK-23215: --- Description: Performing groupByKey operation followed by reduceGroups on dataset results in

[jira] [Updated] (SPARK-23215) Dataset Grouping: Index out of bounds error

2018-01-25 Thread Nikhil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikhil updated SPARK-23215: --- Description: Performing groupByKey operation followed by reduceGroups on dataset results in

[jira] [Created] (SPARK-23215) Dataset Grouping: Index out of bounds error

2018-01-25 Thread Nikhil (JIRA)
Nikhil created SPARK-23215: -- Summary: Dataset Grouping: Index out of bounds error Key: SPARK-23215 URL: https://issues.apache.org/jira/browse/SPARK-23215 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-23214) cached data should not carry extra hint info

2018-01-25 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-23214: --- Summary: cached data should not carry extra hint info Key: SPARK-23214 URL: https://issues.apache.org/jira/browse/SPARK-23214 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-23213) SparkR:::textFile(sc1,"/opt/test333") can not work on spark2.2.1

2018-01-25 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338987#comment-16338987 ] Felix Cheung commented on SPARK-23213: -- Try read.text instead?

[jira] [Assigned] (SPARK-23207) Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss

2018-01-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23207: Assignee: Jiang Xingbo (was: Apache Spark) > Shuffle+Repartition on an RDD/DataFrame

[jira] [Assigned] (SPARK-23207) Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss

2018-01-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23207: Assignee: Apache Spark (was: Jiang Xingbo) > Shuffle+Repartition on an RDD/DataFrame

[jira] [Commented] (SPARK-23207) Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss

2018-01-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338950#comment-16338950 ] Apache Spark commented on SPARK-23207: -- User 'jiangxb1987' has created a pull request for this

[jira] [Updated] (SPARK-23213) SparkR:::textFile(sc1,"/opt/test333") can not work on spark2.2.1

2018-01-25 Thread Tony (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony updated SPARK-23213: -- Environment: JAVA_HOME=/opt/jdk1.8.0_161/ spark 2.2.1 R version 3.4.3 (2017-11-30) – "Kite-Eating Tree"

[jira] [Resolved] (SPARK-23208) GenArrayData produces illegal code

2018-01-25 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23208. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20391

[jira] [Created] (SPARK-23213) SparkR:::textFile(sc1,"/opt/test333") can not work on spark2.2.1

2018-01-25 Thread Tony (JIRA)
Tony created SPARK-23213: - Summary: SparkR:::textFile(sc1,"/opt/test333") can not work on spark2.2.1 Key: SPARK-23213 URL: https://issues.apache.org/jira/browse/SPARK-23213 Project: Spark Issue

[jira] [Resolved] (SPARK-23212) Casts the column to a different data type.

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23212. - Resolution: Invalid This is not the right place. For questions, please use the user mailing

[jira] [Created] (SPARK-23212) Casts the column to a different data type.

2018-01-25 Thread JIRA
黄龙龙 created SPARK-23212: --- Summary: Casts the column to a different data type. Key: SPARK-23212 URL: https://issues.apache.org/jira/browse/SPARK-23212 Project: Spark Issue Type: Question

[jira] [Updated] (SPARK-23129) Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap spill too many times

2018-01-25 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-23129: - Description: Currently, the deserializeStream in ExternalAppendOnlyMap#DiskMapIterator init when 

[jira] [Updated] (SPARK-23129) Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap spill too many times

2018-01-25 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-23129: - Summary: Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap

[jira] [Commented] (SPARK-23211) SparkR MLlib randomFroest parameter problem

2018-01-25 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-23211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338894#comment-16338894 ] 黄龙龙 commented on SPARK-23211: - I want to know the usage of parameter newData in spark.randomForest {SparkR}
