[jira] [Commented] (SPARK-13861) TPCDS query 40 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198526#comment-15198526 ] Xiao Li commented on SPARK-13861: - Great job! I am just wondering if only cs_sales_price has a wrong

[jira] [Created] (SPARK-13943) The behavior of sum(booleantype) in Spark DataFrames is not intuitive

2016-03-19 Thread Wes McKinney (JIRA)
Wes McKinney created SPARK-13943: Summary: The behavior of sum(booleantype) in Spark DataFrames is not intuitive Key: SPARK-13943 URL: https://issues.apache.org/jira/browse/SPARK-13943 Project: Spark

[jira] [Assigned] (SPARK-14011) Enable `LineLength` Java checkstyle rule

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14011: Assignee: Apache Spark > Enable `LineLength` Java checkstyle rule >

[jira] [Issue Comment Deleted] (SPARK-13963) Add binary toggle Param to ml.HashingTF

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13963: --- Comment: was deleted (was: Sure, assigned to you.) > Add binary toggle Param to

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200821#comment-15200821 ] Xiao Li commented on SPARK-13865: - The query I posted here is downloaded from the official website. It is

[jira] [Assigned] (SPARK-13938) word2phrase feature created in ML

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13938: Assignee: (was: Apache Spark) > word2phrase feature created in ML >

[jira] [Updated] (SPARK-14010) ColumnPruning is conflict with PushPredicateThroughProject

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14010: --- Description: ColumnPruning will insert a Project before Filter, but > ColumnPruning is conflict

[jira] [Updated] (SPARK-13979) Killed executor is respawned without AWS keys in standalone spark cluster

2016-03-19 Thread Allen George (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen George updated SPARK-13979: - Description: I'm having a problem where respawning a failed executor during a job that

[jira] [Resolved] (SPARK-13816) Add parameter checks for algorithms in Graphx

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13816. - Resolution: Fixed Assignee: zhengruifeng Fix Version/s: 2.0.0 > Add parameter

[jira] [Resolved] (SPARK-13901) We get wrong logdebug information when jump to the next locality level.

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13901. --- Resolution: Fixed Fix Version/s: 1.6.2 2.0.0 Issue resolved by pull

[jira] [Commented] (SPARK-14014) Replace existing analysis.Catalog with SessionCatalog

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202318#comment-15202318 ] Apache Spark commented on SPARK-14014: -- User 'andrewor14' has created a pull request for this issue:

[jira] [Updated] (SPARK-13905) Change signature of as.data.frame() to be consistent with the R base package

2016-03-19 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sun Rui updated SPARK-13905: Description: (was: SparkR provides a method as.data.frame() to collect a SparkR DataFrame into a local

[jira] [Updated] (SPARK-13964) Feature hashing improvements

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13964: --- Priority: Minor (was: Major) > Feature hashing improvements >

[jira] [Updated] (SPARK-13963) Add binary toggle Param to ml.HashingTF

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13963: --- Assignee: Bryan Cutler > Add binary toggle Param to ml.HashingTF >

[jira] [Updated] (SPARK-12789) Support order by position in SQL

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12789: Description: This is to support order by position in SQL, e.g. {noformat} select c1, c2, c3 from

[jira] [Updated] (SPARK-14010) ColumnPruning is conflict with PushPredicateThroughProject

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14010: --- Description: ColumnPruning will insert a Project before Filter, but PushPredicateThroughProject will

[jira] [Created] (SPARK-13976) do not remove sub-queries added by user when generate SQL

2016-03-19 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-13976: --- Summary: do not remove sub-queries added by user when generate SQL Key: SPARK-13976 URL: https://issues.apache.org/jira/browse/SPARK-13976 Project: Spark

[jira] [Assigned] (SPARK-13951) PySpark ml.pipeline support export/import - nested Piplines

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13951: Assignee: Apache Spark > PySpark ml.pipeline support export/import - nested Piplines >

[jira] [Commented] (SPARK-13957) Support group by ordinal in SQL

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203089#comment-15203089 ] Apache Spark commented on SPARK-13957: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Assigned] (SPARK-13957) Support group by ordinal in SQL

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13957: Assignee: (was: Apache Spark) > Support group by ordinal in SQL >

[jira] [Assigned] (SPARK-13957) Support group by ordinal in SQL

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13957: Assignee: Apache Spark > Support group by ordinal in SQL >

[jira] [Created] (SPARK-13946) PySpark DataFrames allows you to silently use aggregate expressions derived from different table expressions

2016-03-19 Thread Wes McKinney (JIRA)
Wes McKinney created SPARK-13946: Summary: PySpark DataFrames allows you to silently use aggregate expressions derived from different table expressions Key: SPARK-13946 URL:

[jira] [Updated] (SPARK-13932) CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisException

2016-03-19 Thread Tien-Dung LE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tien-Dung LE updated SPARK-13932: - Affects Version/s: 2.0.0 > CUBE Query with filter (HAVING) and condition (IF) raises an

[jira] [Commented] (SPARK-13950) Generate code for sort merge left/right outer join

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198162#comment-15198162 ] Apache Spark commented on SPARK-13950: -- User 'davies' has created a pull request for this issue:

[jira] [Updated] (SPARK-13982) SparkR - KMeans predict: Output column name of features is an unclear, automatic genetared text

2016-03-19 Thread Narine Kokhlikyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Narine Kokhlikyan updated SPARK-13982: -- Summary: SparkR - KMeans predict: Output column name of features is an unclear,

[jira] [Assigned] (SPARK-13858) TPCDS query 21 returns wrong results compared to TPC official result set

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13858: Assignee: Apache Spark > TPCDS query 21 returns wrong results compared to TPC official

[jira] [Updated] (SPARK-7992) Hide private classes/objects in in generated Java API doc

2016-03-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7992: - Assignee: (was: Xiangrui Meng) > Hide private classes/objects in in generated Java API doc >

[jira] [Updated] (SPARK-13038) PySpark ml.pipeline support export/import - non-nested Pipelines

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13038: -- Summary: PySpark ml.pipeline support export/import - non-nested Pipelines (was:

[jira] [Created] (SPARK-14009) Fail the tests if the any catalyst rule reach max number of iteration.

2016-03-19 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14009: -- Summary: Fail the tests if the any catalyst rule reach max number of iteration. Key: SPARK-14009 URL: https://issues.apache.org/jira/browse/SPARK-14009 Project: Spark

[jira] [Resolved] (SPARK-13776) Web UI is not available after ./sbin/start-master.sh

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13776. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11615

[jira] [Commented] (SPARK-13461) Duplicated example code merge and cleanup

2016-03-19 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203076#comment-15203076 ] Xusen Yin commented on SPARK-13461: --- Yes, we'll delete it. > Duplicated example code merge and cleanup

[jira] [Commented] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197902#comment-15197902 ] Apache Spark commented on SPARK-13937: -- User 'BryanCutler' has created a pull request for this

[jira] [Closed] (SPARK-13821) TPC-DS Query 20 fails to compile

2016-03-19 Thread Roy Cecil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roy Cecil closed SPARK-13821. - Resolution: Not A Problem > TPC-DS Query 20 fails to compile > > >

[jira] [Commented] (SPARK-13761) Deprecate validateParams

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200097#comment-15200097 ] Apache Spark commented on SPARK-13761: -- User 'jkbradley' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-13935) Other clients' connection hang up when someone do huge load

2016-03-19 Thread Tao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197554#comment-15197554 ] Tao Wang edited comment on SPARK-13935 at 3/16/16 3:51 PM: --- [~marmbrus]

[jira] [Updated] (SPARK-13983) HiveThriftServer2 can not get "--hiveconf" or ''--hivevar" variables since 1.6 version (both multi-session and single session)

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13983: --- Assignee: Cheng Lian > HiveThriftServer2 can not get "--hiveconf" or ''--hivevar" variables since >

[jira] [Updated] (SPARK-12789) Support order by position in SQL

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12789: Summary: Support order by position in SQL (was: Support order by position) > Support order by

[jira] [Commented] (SPARK-13960) JAR/File HTTP Server doesn't respect "spark.driver.host" and there is no "spark.fileserver.host" option

2016-03-19 Thread Ilya Ostrovskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200690#comment-15200690 ] Ilya Ostrovskiy commented on SPARK-13960: - exporting the SPARK_LOCAL_IP environment variable

[jira] [Updated] (SPARK-12789) Support order by position in SQL

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12789: Description: This is to support order by position in SQL, e.g. {noformat} select c1, c2, c3 from

[jira] [Created] (SPARK-13961) spark.ml ChiSqSelector should support other numeric types for label

2016-03-19 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-13961: -- Summary: spark.ml ChiSqSelector should support other numeric types for label Key: SPARK-13961 URL: https://issues.apache.org/jira/browse/SPARK-13961 Project:

[jira] [Commented] (SPARK-13928) Move org.apache.spark.Logging into org.apache.spark.internal.Logging

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197584#comment-15197584 ] Apache Spark commented on SPARK-13928: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-13821) TPC-DS Query 20 fails to compile

2016-03-19 Thread Roy Cecil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201506#comment-15201506 ] Roy Cecil edited comment on SPARK-13821 at 3/18/16 2:09 PM: Dilip, Removed

[jira] [Assigned] (SPARK-13993) PySpark ml.feature.RFormula/RFormulaModel support export/import

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13993: Assignee: Apache Spark > PySpark ml.feature.RFormula/RFormulaModel support export/import

[jira] [Commented] (SPARK-13733) Support initial weight distribution in personalized PageRank

2016-03-19 Thread Gayathri Murali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198327#comment-15198327 ] Gayathri Murali commented on SPARK-13733: - [~mengxr] Should the rest of the vertices also be set

[jira] [Resolved] (SPARK-13034) PySpark ml.classification support export/import

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-13034. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11707

[jira] [Commented] (SPARK-14005) Make RDD more compatible with Scala's collection

2016-03-19 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203056#comment-15203056 ] zhengruifeng commented on SPARK-14005: -- OK, please close this JIRA. > Make RDD more compatible with

[jira] [Commented] (SPARK-13968) Use MurmurHash3 for hashing String features

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202003#comment-15202003 ] Joseph K. Bradley commented on SPARK-13968: --- I'm going to close this in favor of the older

[jira] [Commented] (SPARK-13629) Add binary toggle Param to CountVectorizer

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201993#comment-15201993 ] Joseph K. Bradley commented on SPARK-13629: --- [~mlnick] Thanks for handling these count/hashing

[jira] [Assigned] (SPARK-11319) PySpark silently accepts null values in non-nullable DataFrame fields.

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11319: Assignee: (was: Apache Spark) > PySpark silently accepts null values in non-nullable

[jira] [Updated] (SPARK-13948) MiMa Check should catch if the visibility change to `private`

2016-03-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-13948: --- Component/s: Project Infra > MiMa Check should catch if the visibility change to `private` >

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200516#comment-15200516 ] Xiao Li commented on SPARK-13865: - This is the same as the

[jira] [Updated] (SPARK-13972) hive tests should fail if SQL generation failed

2016-03-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-13972: --- Assignee: Wenchen Fan > hive tests should fail if SQL generation failed >

[jira] [Updated] (SPARK-13776) Web UI is not available after ./sbin/start-master.sh

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13776: -- Assignee: Shixiong Zhu > Web UI is not available after ./sbin/start-master.sh >

[jira] [Resolved] (SPARK-10788) Decision Tree duplicates bins for unordered categorical features

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-10788. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 9474

[jira] [Commented] (SPARK-12719) SQL generation support for generators (including UDTF)

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200419#comment-15200419 ] Apache Spark commented on SPARK-12719: -- User 'yy2016' has created a pull request for this issue:

[jira] [Commented] (SPARK-13461) Duplicated example code merge and cleanup

2016-03-19 Thread Gabor Liptak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203041#comment-15203041 ] Gabor Liptak commented on SPARK-13461: -- [~yinxusen]

[jira] [Commented] (SPARK-13969) Extend input format that feature hashing can handle

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202007#comment-15202007 ] Joseph K. Bradley commented on SPARK-13969: --- I think HashingTF could be extended to handle this

[jira] [Updated] (SPARK-13960) HTTP-based JAR Server doesn't respect spark.driver.host and there is no "spark.fileserver.host" option

2016-03-19 Thread Ilya Ostrovskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Ostrovskiy updated SPARK-13960: Description: There is no option to specify which hostname/IP address the jar/file server

[jira] [Commented] (SPARK-13968) Use MurmurHash3 for hashing String features

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200254#comment-15200254 ] Nick Pentreath commented on SPARK-13968: Sure, I will assign to you. But I'd like to get some

[jira] [Created] (SPARK-13938) word2phrase feature created in ML

2016-03-19 Thread Steve Weng (JIRA)
Steve Weng created SPARK-13938: -- Summary: word2phrase feature created in ML Key: SPARK-13938 URL: https://issues.apache.org/jira/browse/SPARK-13938 Project: Spark Issue Type: New Feature

[jira] [Assigned] (SPARK-13958) Executor OOM due to unbounded growth of pointer array in Sorter

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13958: Assignee: (was: Apache Spark) > Executor OOM due to unbounded growth of pointer array

[jira] [Updated] (SPARK-10574) HashingTF should use MurmurHash3

2016-03-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10574: -- Assignee: Yanbo Liang > HashingTF should use MurmurHash3 > >

[jira] [Created] (SPARK-13951) PySpark ml.pipeline support export/import - nested Piplines

2016-03-19 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-13951: - Summary: PySpark ml.pipeline support export/import - nested Piplines Key: SPARK-13951 URL: https://issues.apache.org/jira/browse/SPARK-13951 Project: Spark

[jira] [Created] (SPARK-13988) Large history files block new applications from showing up in History UI.

2016-03-19 Thread Parth Brahmbhatt (JIRA)
Parth Brahmbhatt created SPARK-13988: Summary: Large history files block new applications from showing up in History UI. Key: SPARK-13988 URL: https://issues.apache.org/jira/browse/SPARK-13988

[jira] [Updated] (SPARK-10574) HashingTF should use MurmurHash3

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10574: -- Issue Type: Sub-task (was: Improvement) Parent: SPARK-13964 > HashingTF

[jira] [Commented] (SPARK-13886) ArrayType of BinaryType not supported in Row.equals method

2016-03-19 Thread MahmoudHanafy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198763#comment-15198763 ] MahmoudHanafy commented on SPARK-13886: --- I think List extends Seq! In this case, how can you

[jira] [Commented] (SPARK-12177) Update KafkaDStreams to new Kafka 0.9 Consumer API

2016-03-19 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203027#comment-15203027 ] Cody Koeninger commented on SPARK-12177: Unless I'm misunderstanding your point, those changes

[jira] [Assigned] (SPARK-13997) Use Hadoop 2.0 default value for compression in data sources

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13997: Assignee: (was: Apache Spark) > Use Hadoop 2.0 default value for compression in data

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200637#comment-15200637 ] JESSE CHEN commented on SPARK-13865: This may be a TPC toolkit issue. Will be looking into this with

[jira] [Created] (SPARK-13995) Constraints should take care of Cast

2016-03-19 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-13995: --- Summary: Constraints should take care of Cast Key: SPARK-13995 URL: https://issues.apache.org/jira/browse/SPARK-13995 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-03-19 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-13967: -- Summary: Add binary toggle Param to PySpark CountVectorizer Key: SPARK-13967 URL: https://issues.apache.org/jira/browse/SPARK-13967 Project: Spark Issue

[jira] [Commented] (SPARK-12719) SQL generation support for generators (including UDTF)

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199852#comment-15199852 ] Apache Spark commented on SPARK-12719: -- User 'yy2016' has created a pull request for this issue:

[jira] [Updated] (SPARK-13960) HTTP-based JAR Server doesn't respect spark.driver.host and there is no "spark.fileserver.host" option

2016-03-19 Thread Ilya Ostrovskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Ostrovskiy updated SPARK-13960: Description: There is no option to specify which hostname/IP address the jar/file server

[jira] [Assigned] (SPARK-13992) Add support for off-heap caching

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13992: Assignee: Josh Rosen (was: Apache Spark) > Add support for off-heap caching >

[jira] [Commented] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201339#comment-15201339 ] Nick Pentreath commented on SPARK-13967: [~yuhaoyan] or [~bryanc] would you like to take this? >

[jira] [Resolved] (SPARK-13989) Remove non-vectorized/unsafe-row parquet record reader

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13989. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11799

[jira] [Created] (SPARK-14016) Support high-precision decimals in vectorized parquet reader

2016-03-19 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-14016: -- Summary: Support high-precision decimals in vectorized parquet reader Key: SPARK-14016 URL: https://issues.apache.org/jira/browse/SPARK-14016 Project: Spark

[jira] [Commented] (SPARK-13986) Make `DeveloperApi`-annotated things public

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200489#comment-15200489 ] Apache Spark commented on SPARK-13986: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-12177) Update KafkaDStreams to new Kafka 0.9 Consumer API

2016-03-19 Thread Eugene Miretsky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203018#comment-15203018 ] Eugene Miretsky commented on SPARK-12177: - The new Kafka Java Consumer is using Deserializer

[jira] [Assigned] (SPARK-13976) do not remove sub-queries added by user when generate SQL

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13976: Assignee: Apache Spark > do not remove sub-queries added by user when generate SQL >

[jira] [Created] (SPARK-13940) Predicate Transitive Closure Transformation

2016-03-19 Thread Alex Antonov (JIRA)
Alex Antonov created SPARK-13940: Summary: Predicate Transitive Closure Transformation Key: SPARK-13940 URL: https://issues.apache.org/jira/browse/SPARK-13940 Project: Spark Issue Type:

[jira] [Updated] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13937: -- Priority: Trivial (was: Minor) > PySpark ML JavaWrapper, variable _java_obj should

[jira] [Updated] (SPARK-13938) word2phrase feature created in ML

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13938: -- [~s4weng] "Critical" is inappropriate here. Please read

[jira] [Assigned] (SPARK-913) log the size of each shuffle block in block manager

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-913: -- Assignee: Apache Spark > log the size of each shuffle block in block manager >

[jira] [Created] (SPARK-13973) `ipython notebook` is going away...

2016-03-19 Thread Bogdan Pirvu (JIRA)
Bogdan Pirvu created SPARK-13973: Summary: `ipython notebook` is going away... Key: SPARK-13973 URL: https://issues.apache.org/jira/browse/SPARK-13973 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-13974) sub-query names do not need to be globally unique while generate SQL

2016-03-19 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-13974: --- Summary: sub-query names do not need to be globally unique while generate SQL Key: SPARK-13974 URL: https://issues.apache.org/jira/browse/SPARK-13974 Project: Spark

[jira] [Resolved] (SPARK-13360) pyspark related enviroment variable is not propagated to driver in yarn-cluster mode

2016-03-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-13360. Resolution: Fixed Assignee: Jeff Zhang Fix Version/s: 2.0.0 > pyspark

[jira] [Updated] (SPARK-14001) support multi-children Union in SQLBuilder

2016-03-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-14001: --- Assignee: Wenchen Fan > support multi-children Union in SQLBuilder >

[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200306#comment-15200306 ] Hari Shreedharan commented on SPARK-13877: -- You could have separate repos and separate releases,

[jira] [Created] (SPARK-13993) PySpark ml.feature.RFormula/RFormulaModel support export/import

2016-03-19 Thread Xusen Yin (JIRA)
Xusen Yin created SPARK-13993: - Summary: PySpark ml.feature.RFormula/RFormulaModel support export/import Key: SPARK-13993 URL: https://issues.apache.org/jira/browse/SPARK-13993 Project: Spark

[jira] [Assigned] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13937: Assignee: (was: Apache Spark) > PySpark ML JavaWrapper, variable _java_obj should not

[jira] [Commented] (SPARK-13955) Spark in yarn mode fails

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199198#comment-15199198 ] Sean Owen commented on SPARK-13955: --- Is this likely? the YARN tests succeed. There isn't detail here

[jira] [Assigned] (SPARK-13719) Bad JSON record raises java.lang.ClassCastException

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13719: Assignee: (was: Apache Spark) > Bad JSON record raises java.lang.ClassCastException

[jira] [Commented] (SPARK-13864) TPCDS query 74 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198602#comment-15198602 ] Xiao Li commented on SPARK-13864: - This is the same issue as SPARK-13862. I think we can close this.

[jira] [Updated] (SPARK-12719) SQL generation support for generators (including UDTF)

2016-03-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-12719: --- Assignee: Wenchen Fan > SQL generation support for generators (including UDTF) >

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200886#comment-15200886 ] JESSE CHEN commented on SPARK-13865: You rock! > TPCDS query 87 returns wrong results compared to

[jira] [Commented] (SPARK-13886) ArrayType of BinaryType not supported in Row.equals method

2016-03-19 Thread Rishabh Bhardwaj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198785#comment-15198785 ] Rishabh Bhardwaj commented on SPARK-13886: -- If we go through the implementation of `a.equals(b)`

[jira] [Created] (SPARK-13942) Remove Shark-related docs and visibility for 2.x

2016-03-19 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-13942: - Summary: Remove Shark-related docs and visibility for 2.x Key: SPARK-13942 URL: https://issues.apache.org/jira/browse/SPARK-13942 Project: Spark Issue

[jira] [Commented] (SPARK-14001) support multi-children Union in SQLBuilder

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201191#comment-15201191 ] Apache Spark commented on SPARK-14001: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-13664) Simplify and Speedup HadoopFSRelation

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13664: Assignee: Michael Armbrust (was: Apache Spark) > Simplify and Speedup HadoopFSRelation >
