[jira] [Updated] (SPARK-21703) Why RPC message are transferred with header and body separately in TCP frame

2017-08-10 Thread neoremind (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] neoremind updated SPARK-21703: -- Description: After seeing the details of how spark leverage netty, I found one question, typically

[jira] [Created] (SPARK-21703) Why RPC message are transferred with header and body separately in TCP frame

2017-08-10 Thread neoremind (JIRA)
neoremind created SPARK-21703: - Summary: Why RPC message are transferred with header and body separately in TCP frame Key: SPARK-21703 URL: https://issues.apache.org/jira/browse/SPARK-21703 Project:

[jira] [Created] (SPARK-21702) Structured Streaming S3A SSE Encryption Not Applied when PartitionBy Used

2017-08-10 Thread George Pongracz (JIRA)
George Pongracz created SPARK-21702: --- Summary: Structured Streaming S3A SSE Encryption Not Applied when PartitionBy Used Key: SPARK-21702 URL: https://issues.apache.org/jira/browse/SPARK-21702

[jira] [Created] (SPARK-21701) Add TCP send/rcv buffer size support for RPC client

2017-08-10 Thread neoremind (JIRA)
neoremind created SPARK-21701: - Summary: Add TCP send/rcv buffer size support for RPC client Key: SPARK-21701 URL: https://issues.apache.org/jira/browse/SPARK-21701 Project: Spark Issue Type:

[jira] [Commented] (SPARK-21700) How can I get the MetricsSystem information

2017-08-10 Thread Alex Bozarth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122762#comment-16122762 ] Alex Bozarth commented on SPARK-21700: -- I would recommend taking a look at the Metrics REST API

[jira] [Updated] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-21693: - Description: We finally sometimes reach the time limit, 1.5 hours,

[jira] [Created] (SPARK-21700) How can I get the MetricsSystem information

2017-08-10 Thread LiuXiangyu (JIRA)
LiuXiangyu created SPARK-21700: -- Summary: How can I get the MetricsSystem information Key: SPARK-21700 URL: https://issues.apache.org/jira/browse/SPARK-21700 Project: Spark Issue Type: Question

[jira] [Updated] (SPARK-21699) Remove unused getTableOption in ExternalCatalog

2017-08-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21699: Fix Version/s: 2.2.1 > Remove unused getTableOption in ExternalCatalog >

[jira] [Resolved] (SPARK-21699) Remove unused getTableOption in ExternalCatalog

2017-08-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21699. - Resolution: Fixed Fix Version/s: 2.3.0 > Remove unused getTableOption in ExternalCatalog

[jira] [Commented] (SPARK-21564) TaskDescription decoding failure should fail the task

2017-08-10 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122651#comment-16122651 ] Andrew Ash commented on SPARK-21564: [~irashid] a possible fix could look roughly like this:

[jira] [Commented] (SPARK-21563) Race condition when serializing TaskDescriptions and adding jars

2017-08-10 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122549#comment-16122549 ] Andrew Ash commented on SPARK-21563: Thanks for the thoughts [~irashid] -- I submitted a PR

[jira] [Created] (SPARK-21699) Remove unused getTableOption in ExternalCatalog

2017-08-10 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21699: --- Summary: Remove unused getTableOption in ExternalCatalog Key: SPARK-21699 URL: https://issues.apache.org/jira/browse/SPARK-21699 Project: Spark Issue Type:

[jira] [Updated] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-21693: - Description: We finally sometimes reach the time limit, 1.5 hours,

[jira] [Commented] (SPARK-21685) Params isSet in scala Transformer triggered by _setDefault in pyspark

2017-08-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122492#comment-16122492 ] Joseph K. Bradley commented on SPARK-21685: --- Could you please point to more info, such as the

[jira] [Updated] (SPARK-21698) write.partitionBy() is giving me garbage data

2017-08-10 Thread Luis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luis updated SPARK-21698: - Description: Spark partitionBy is causing some data corruption. I am doing three super simple writes. Below is

[jira] [Updated] (SPARK-21698) write.partitionBy() is giving me garbage data

2017-08-10 Thread Luis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luis updated SPARK-21698: - Summary: write.partitionBy() is giving me garbage data (was: write.partitionBy() is given me garbage data) >

[jira] [Created] (SPARK-21698) write.partitionBy() is given me garbage data

2017-08-10 Thread Luis (JIRA)
Luis created SPARK-21698: Summary: write.partitionBy() is given me garbage data Key: SPARK-21698 URL: https://issues.apache.org/jira/browse/SPARK-21698 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-21638) Warning message of RF is not accurate

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21638. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18868

[jira] [Assigned] (SPARK-21638) Warning message of RF is not accurate

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-21638: - Assignee: Peng Meng > Warning message of RF is not accurate >

[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-08-10 Thread Weiqing Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122234#comment-16122234 ] Weiqing Yang commented on SPARK-21697: -- Thanks for filing this issue! > NPE &

[jira] [Comment Edited] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-08-10 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122195#comment-16122195 ] Steve Loughran edited comment on SPARK-21697 at 8/10/17 7:48 PM: - Text of

[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-08-10 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122195#comment-16122195 ] Steve Loughran commented on SPARK-21697: {code} Have u tried it in yarn-client mode? i add this

[jira] [Created] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-08-10 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-21697: -- Summary: NPE & ExceptionInInitializerError trying to load UTF from HDFS Key: SPARK-21697 URL: https://issues.apache.org/jira/browse/SPARK-21697 Project: Spark

[jira] [Resolved] (SPARK-21669) Internal API for collecting metrics/stats during FileFormatWriter jobs

2017-08-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21669. - Resolution: Fixed Assignee: Adrian Ionescu Fix Version/s: 2.3.0 > Internal API

[jira] [Commented] (SPARK-21696) State Store can't handle corrupted snapshots

2017-08-10 Thread Alexander Bessonov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122078#comment-16122078 ] Alexander Bessonov commented on SPARK-21696: {{HDFSBackedStateStoreProvider.doMaintenance()}}

[jira] [Updated] (SPARK-21696) State Store can't handle corrupted snapshots

2017-08-10 Thread Alexander Bessonov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Bessonov updated SPARK-21696: --- Description: State store's asynchronous maintenance task (generation of Snapshot

[jira] [Created] (SPARK-21696) State Store can't handle corrupted snapshots

2017-08-10 Thread Alexander Bessonov (JIRA)
Alexander Bessonov created SPARK-21696: -- Summary: State Store can't handle corrupted snapshots Key: SPARK-21696 URL: https://issues.apache.org/jira/browse/SPARK-21696 Project: Spark

[jira] [Commented] (SPARK-17419) Mesos virtual network support

2017-08-10 Thread Susan X. Huynh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122019#comment-16122019 ] Susan X. Huynh commented on SPARK-17419: SPARK-21694 allows the user to pass network labels to

[jira] [Commented] (SPARK-17419) Mesos virtual network support

2017-08-10 Thread Susan X. Huynh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122018#comment-16122018 ] Susan X. Huynh commented on SPARK-17419: SPARK-18232 adds the ability to launch containers

[jira] [Created] (SPARK-21695) Spark scheduler locality algorithm can take longer then expected

2017-08-10 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-21695: - Summary: Spark scheduler locality algorithm can take longer then expected Key: SPARK-21695 URL: https://issues.apache.org/jira/browse/SPARK-21695 Project: Spark

[jira] [Created] (SPARK-21694) Support Mesos CNI network labels

2017-08-10 Thread Susan X. Huynh (JIRA)
Susan X. Huynh created SPARK-21694: -- Summary: Support Mesos CNI network labels Key: SPARK-21694 URL: https://issues.apache.org/jira/browse/SPARK-21694 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21688: -- Priority: Minor (was: Major) > performance improvement in mllib SVM with native BLAS >

[jira] [Commented] (SPARK-21644) LocalLimit.maxRows is defined incorrectly

2017-08-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121946#comment-16121946 ] Xiao Li commented on SPARK-21644: - https://github.com/apache/spark/pull/18851 > LocalLimit.maxRows is

[jira] [Commented] (SPARK-14927) DataFrame. saveAsTable creates RDD partitions but not Hive partitions

2017-08-10 Thread Chaoyu Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121931#comment-16121931 ] Chaoyu Tang commented on SPARK-14927: - [~rajeshc] could you provide your example here? > DataFrame.

[jira] [Resolved] (SPARK-18648) spark-shell --jars option does not add jars to classpath on windows

2017-08-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-18648. -- Resolution: Duplicate I checked {{spark-shell --jars C:\test\my.jar}} works and fixed. I am

[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121888#comment-16121888 ] Hyukjin Kwon commented on SPARK-21693: -- Yes, it does build multiple times and If I have observed

[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121866#comment-16121866 ] Felix Cheung commented on SPARK-21693: -- splitting test matrix is also possible, I worry though since

[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121860#comment-16121860 ] Felix Cheung commented on SPARK-21693: -- we could certainly simplify the classification set - but

[jira] [Resolved] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-08-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-21535. Resolution: Not A Problem The new implementation will load the evaluation dataset when training

[jira] [Updated] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-21693: - Description: We finally sometimes reach the time limit, 1.5 hours,

[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121827#comment-16121827 ] Hyukjin Kwon commented on SPARK-21693: -- FYI, [~felixcheung] and [~shivaram]. > AppVeyor tests reach

[jira] [Updated] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-21693: - Description: We finally sometimes reach the time limit, 1.5 hours,

[jira] [Created] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-21693: Summary: AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests Key: SPARK-21693 URL: https://issues.apache.org/jira/browse/SPARK-21693 Project:

[jira] [Commented] (SPARK-18648) spark-shell --jars option does not add jars to classpath on windows

2017-08-10 Thread Devaraj K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121817#comment-16121817 ] Devaraj K commented on SPARK-18648: --- [~FlamingMike], It has fixed as part of SPARK-21339, can you check

[jira] [Commented] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121801#comment-16121801 ] Ruslan Dautkhanov commented on SPARK-21657: --- [~bjornjons] confirms this problem pertains to

[jira] [Created] (SPARK-21692) Modify PythonUDF to support nullability

2017-08-10 Thread Michael Styles (JIRA)
Michael Styles created SPARK-21692: -- Summary: Modify PythonUDF to support nullability Key: SPARK-21692 URL: https://issues.apache.org/jira/browse/SPARK-21692 Project: Spark Issue Type:

[jira] [Commented] (SPARK-21677) json_tuple throws NullPointException when column is null as string type.

2017-08-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121740#comment-16121740 ] Liang-Chi Hsieh commented on SPARK-21677: - As a given field name {{null}} can't be matched with

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121679#comment-16121679 ] Sean Owen commented on SPARK-21688: --- Not a good solution? How about just checking the env variables?

[jira] [Comment Edited] (SPARK-21677) json_tuple throws NullPointException when column is null as string type.

2017-08-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121658#comment-16121658 ] Hyukjin Kwon edited comment on SPARK-21677 at 8/10/17 2:12 PM: --- [~cjm], I

[jira] [Commented] (SPARK-21677) json_tuple throws NullPointException when column is null as string type.

2017-08-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121658#comment-16121658 ] Hyukjin Kwon commented on SPARK-21677: -- [~cjm], I was thinking like {code} spark.sql("""SELECT

[jira] [Updated] (SPARK-21656) spark dynamic allocation should not idle timeout executors when there are enough tasks to run on them

2017-08-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-21656: -- Description: Right now with dynamic allocation spark starts by getting the number of

[jira] [Commented] (SPARK-21677) json_tuple throws NullPointException when column is null as string type.

2017-08-10 Thread Jen-Ming Chung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121641#comment-16121641 ] Jen-Ming Chung commented on SPARK-21677: to [~hyukjin.kwon], the return {{NULL}} you mentioned

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121624#comment-16121624 ] Vincent commented on SPARK-21688: - Okay. Yes, true. It can still run without issue but we are just

[jira] [Updated] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-21656: -- Description: Right now spark lets go of executors when they are idle for the 60s (or

[jira] [Updated] (SPARK-21656) spark dynamic allocation should not idle timeout executors when there are enough tasks to run on them

2017-08-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-21656: -- Summary: spark dynamic allocation should not idle timeout executors when there are enough

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121599#comment-16121599 ] Sean Owen commented on SPARK-21688: --- I mean best case in that MKL might be a little different from

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121589#comment-16121589 ] Vincent commented on SPARK-21688: - [~srowen] Thanks for your comments. I think if user decides to use

[jira] [Created] (SPARK-21691) Accessing canonicalized plan for query with limit throws exception

2017-08-10 Thread Bjoern Toldbod (JIRA)
Bjoern Toldbod created SPARK-21691: -- Summary: Accessing canonicalized plan for query with limit throws exception Key: SPARK-21691 URL: https://issues.apache.org/jira/browse/SPARK-21691 Project:

[jira] [Commented] (SPARK-21402) Java encoders - switch fields on collectAsList

2017-08-10 Thread Paul Praet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121401#comment-16121401 ] Paul Praet commented on SPARK-21402: It seems changing the order of the fields in the struct can give

[jira] [Updated] (SPARK-21520) Improvement a special case for non-deterministic projects and filters in optimizer

2017-08-10 Thread caoxuewen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caoxuewen updated SPARK-21520: -- Description: Currently, Did a lot of special handling for non-deterministic projects and filters in

[jira] [Issue Comment Deleted] (SPARK-20971) Purge the metadata log for FileStreamSource

2017-08-10 Thread Fei Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Shao updated SPARK-20971: - Comment: was deleted (was: Hi Zhu, What does metadata logs stands for please?) > Purge the metadata log

[jira] [Comment Edited] (SPARK-21402) Java encoders - switch fields on collectAsList

2017-08-10 Thread Paul Praet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121297#comment-16121297 ] Paul Praet edited comment on SPARK-21402 at 8/10/17 9:04 AM: - I can confirm

[jira] [Comment Edited] (SPARK-21402) Java encoders - switch fields on collectAsList

2017-08-10 Thread Paul Praet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121297#comment-16121297 ] Paul Praet edited comment on SPARK-21402 at 8/10/17 9:03 AM: - I can confirm

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121299#comment-16121299 ] Sean Owen commented on SPARK-21688: --- Understood, though it potentially impacts the benchmarks. You have

[jira] [Commented] (SPARK-21402) Java encoders - switch fields on collectAsList

2017-08-10 Thread Paul Praet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121297#comment-16121297 ] Paul Praet commented on SPARK-21402: I can confirm this problem persists in Spark 2.2.0: fields get

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121284#comment-16121284 ] Peng Meng commented on SPARK-21688: --- MKL is just an example of native BLAS, if user has Openblas,

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121270#comment-16121270 ] Sean Owen commented on SPARK-21688: --- I guess my concern is that this slows things down unless people do

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121254#comment-16121254 ] Vincent commented on SPARK-21688: - and if native blas is left with default multi-threading setting, it

[jira] [Updated] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent updated SPARK-21688: Attachment: native-trywait.png > performance improvement in mllib SVM with native BLAS >

[jira] [Updated] (SPARK-21684) df.write double escaping all the already escaped characters except the first one

2017-08-10 Thread Taran Saini (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Taran Saini updated SPARK-21684: Attachment: SparkQuotesTest2.scala PFA the same. > df.write double escaping all the already

[jira] [Commented] (SPARK-21684) df.write double escaping all the already escaped characters except the first one

2017-08-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121244#comment-16121244 ] Liang-Chi Hsieh commented on SPARK-21684: - Would you mind provide a small codes to reproduce it?

[jira] [Updated] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent updated SPARK-21688: Attachment: (was: uni-test on ddot.png) > performance improvement in mllib SVM with native BLAS >

[jira] [Updated] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent updated SPARK-21688: Attachment: ddot unitest.png > performance improvement in mllib SVM with native BLAS >

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121236#comment-16121236 ] Vincent commented on SPARK-21688: - upload a data we collected before, uni-test on ddot, we can see for

[jira] [Updated] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent updated SPARK-21688: Attachment: uni-test on ddot.png > performance improvement in mllib SVM with native BLAS >

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121225#comment-16121225 ] Sean Owen commented on SPARK-21688: --- I see, so you're saying use BLAS for level 1 ops. Do we know

[jira] [Updated] (SPARK-21679) KMeans Clustering is Not Deterministic

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21679: -- Priority: Minor (was: Major) As a general statement, it's hard to get deterministic behavior out of a

[jira] [Updated] (SPARK-21679) KMeans Clustering is Not Deterministic

2017-08-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21679: Issue Type: Improvement (was: Bug) > KMeans Clustering is Not Deterministic >

[jira] [Commented] (SPARK-21679) KMeans Clustering is Not Deterministic

2017-08-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121213#comment-16121213 ] Liang-Chi Hsieh commented on SPARK-21679: - Old MLlib {{org.apache.spark.mllib.clustering.KMeans}}

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121209#comment-16121209 ] Vincent commented on SPARK-21688: - currently, there are certain places in ML/MLLib, such as in mllib/SVM,

[jira] [Commented] (SPARK-21686) spark.sql.hive.convertMetastoreOrc is causing NullPointerException while reading ORC tables

2017-08-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121187#comment-16121187 ] Liang-Chi Hsieh commented on SPARK-21686: - I saw the affect version is 1.6.1. So the more recent

[jira] [Commented] (SPARK-21680) ML/MLLIB Vector compressed optimization

2017-08-10 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121175#comment-16121175 ] Peng Meng commented on SPARK-21680: --- I mean if the user call toSparse(size), but the size is smaller

[jira] [Commented] (SPARK-21680) ML/MLLIB Vector compressed optimization

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121166#comment-16121166 ] Sean Owen commented on SPARK-21680: --- I don't get what security issue you mean here, but no the change

[jira] [Commented] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121163#comment-16121163 ] Sean Owen commented on SPARK-21688: --- Of course native BLAS is typically faster where it is used so you

[jira] [Commented] (SPARK-21689) Spark submit will not get kerberos token token when hbase class not found

2017-08-10 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121124#comment-16121124 ] zhoukang commented on SPARK-21689: -- https://github.com/apache/spark/pull/18901 i created a pr for this

[jira] [Commented] (SPARK-21660) Yarn ShuffleService failed to start when the chosen directory become read-only

2017-08-10 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121125#comment-16121125 ] Saisai Shao commented on SPARK-21660: - Will yarn NM handle this bad disk problem and return a good

[jira] [Comment Edited] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121113#comment-16121113 ] Vincent edited comment on SPARK-21688 at 8/10/17 6:13 AM: -- attach svm profiling

[jira] [Updated] (SPARK-21688) performance improvement in mllib SVM with native BLAS

2017-08-10 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent updated SPARK-21688: Attachment: svm1.png svm2.png svm-mkl-1.png svm-mkl-2.png

[jira] [Updated] (SPARK-21689) Spark submit will not get kerberos token token when hbase class not found

2017-08-10 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-21689: - Description: When use yarn cluster mode,and we need scan hbase,there will be a case which can not work:

[jira] [Created] (SPARK-21690) one-pass imputer

2017-08-10 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-21690: Summary: one-pass imputer Key: SPARK-21690 URL: https://issues.apache.org/jira/browse/SPARK-21690 Project: Spark Issue Type: Improvement
