[jira] [Commented] (SPARK-14006) Builds of 1.6 branch fail R style check

2016-03-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202560#comment-15202560 ] Yin Huai commented on SPARK-14006: -- OK. Cool. Then, let's fix branch 1.6. > Builds of 1.6 branch fail R

[jira] [Resolved] (SPARK-13970) Add Non-Negative Matrix Factorization to MLlib

2016-03-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13970. --- Resolution: Won't Fix Per the PR, we won't include this in Spark in the near future. > Add

[jira] [Assigned] (SPARK-13719) Bad JSON record raises java.lang.ClassCastException

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13719: Assignee: Apache Spark > Bad JSON record raises java.lang.ClassCastException >

[jira] [Commented] (SPARK-13859) TPCDS query 38 returns wrong results compared to TPC official result set

2016-03-18 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199885#comment-15199885 ] JESSE CHEN commented on SPARK-13859: Testing both Q87 and Q38. Back shortly with results. > TPCDS

[jira] [Commented] (SPARK-13981) Improve Filter generated code to defer variable evaluation within operator

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200263#comment-15200263 ] Apache Spark commented on SPARK-13981: -- User 'nongli' has created a pull request for this issue:

[jira] [Commented] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-03-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202240#comment-15202240 ] Xiangrui Meng commented on SPARK-13857: --- +1. We need to figure out the semantics in a pipeline

[jira] [Created] (SPARK-13997) Use Hadoop 2.0 default value for compression in data sources

2016-03-18 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-13997: Summary: Use Hadoop 2.0 default value for compression in data sources Key: SPARK-13997 URL: https://issues.apache.org/jira/browse/SPARK-13997 Project: Spark

[jira] [Resolved] (SPARK-13629) Add binary toggle Param to CountVectorizer

2016-03-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-13629. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11536

[jira] [Commented] (SPARK-13774) IllegalArgumentException: Can not create a Path from an empty string for incorrect file path

2016-03-18 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198187#comment-15198187 ] Sunitha Kambhampati commented on SPARK-13774: - From other JIRAs, I know there is

[jira] [Commented] (SPARK-14000) case class with a tuple field can't work in Dataset

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201118#comment-15201118 ] Apache Spark commented on SPARK-14000: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-13990) Automatically pick serializer when caching RDDs

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13990: Assignee: Josh Rosen (was: Apache Spark) > Automatically pick serializer when caching

[jira] [Assigned] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13937: Assignee: Apache Spark > PySpark ML JavaWrapper, variable _java_obj should not be static

[jira] [Created] (SPARK-13971) Implicit group by with distinct modifier on having raises an unexpected error

2016-03-18 Thread Javier Pérez (JIRA)
Javier Pérez created SPARK-13971: Summary: Implicit group by with distinct modifier on having raises an unexpected error Key: SPARK-13971 URL: https://issues.apache.org/jira/browse/SPARK-13971

[jira] [Commented] (SPARK-13456) Cannot create encoders for case classes defined in Spark shell after upgrading to Scala 2.11

2016-03-18 Thread Arjen P. de Vries (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199774#comment-15199774 ] Arjen P. de Vries commented on SPARK-13456: --- I benefited from learning about this workaround

[jira] [Updated] (SPARK-13982) SparkR - KMeans predict: Output column name of features is an unclear, automatically genetared text

2016-03-18 Thread Narine Kokhlikyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Narine Kokhlikyan updated SPARK-13982: -- Component/s: SparkR > SparkR - KMeans predict: Output column name of features is an

[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer

2016-03-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201998#comment-15201998 ] Joseph K. Bradley commented on SPARK-13998: --- I agree it'd require that. I don't think it's

[jira] [Created] (SPARK-13979) Killed executor is respawned without AWS keys in standalone spark cluster

2016-03-18 Thread Allen George (JIRA)
Allen George created SPARK-13979: Summary: Killed executor is respawned without AWS keys in standalone spark cluster Key: SPARK-13979 URL: https://issues.apache.org/jira/browse/SPARK-13979 Project:

[jira] [Commented] (SPARK-13859) TPCDS query 38 returns wrong results compared to TPC official result set

2016-03-18 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200503#comment-15200503 ] Xiao Li commented on SPARK-13859: - Yeah, your understanding is right. When rewriting Intersect by Join,

[jira] [Commented] (SPARK-12014) Spark SQL query containing semicolon is broken in Beeline (related to HIVE-11100)

2016-03-18 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199752#comment-15199752 ] Teng Qiu commented on SPARK-12014: -- any update on this? > Spark SQL query containing semicolon is

[jira] [Created] (SPARK-13955) Spark in yarn mode fails

2016-03-18 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-13955: -- Summary: Spark in yarn mode fails Key: SPARK-13955 URL: https://issues.apache.org/jira/browse/SPARK-13955 Project: Spark Issue Type: Bug Affects Versions:

[jira] [Commented] (SPARK-13980) Incrementally serialize blocks while unrolling them in MemoryStore

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200178#comment-15200178 ] Apache Spark commented on SPARK-13980: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Updated] (SPARK-13990) Automatically pick serializer when caching RDDs

2016-03-18 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-13990: --- Target Version/s: 2.0.0 > Automatically pick serializer when caching RDDs >

[jira] [Created] (SPARK-14018) BenchmarkWholeStageCodegen should accept 64-bit num records

2016-03-18 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-14018: --- Summary: BenchmarkWholeStageCodegen should accept 64-bit num records Key: SPARK-14018 URL: https://issues.apache.org/jira/browse/SPARK-14018 Project: Spark

[jira] [Assigned] (SPARK-13950) Generate code for sort merge left/right outer join

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13950: Assignee: Davies Liu (was: Apache Spark) > Generate code for sort merge left/right outer

[jira] [Assigned] (SPARK-13982) SparkR - KMeans predict: Output column name of features is an unclear, automatically genetared text

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13982: Assignee: (was: Apache Spark) > SparkR - KMeans predict: Output column name of

[jira] [Resolved] (SPARK-12718) SQL generation support for window functions

2016-03-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-12718. -- Resolution: Fixed Fix Version/s: 2.0.0 > SQL generation support for window functions >

[jira] [Commented] (SPARK-14018) BenchmarkWholeStageCodegen should accept 64-bit num records

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202573#comment-15202573 ] Apache Spark commented on SPARK-14018: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14018) BenchmarkWholeStageCodegen should accept 64-bit num records

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14018: Assignee: Apache Spark (was: Reynold Xin) > BenchmarkWholeStageCodegen should accept

[jira] [Assigned] (SPARK-13992) Add support for off-heap caching

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13992: Assignee: Apache Spark (was: Josh Rosen) > Add support for off-heap caching >

[jira] [Created] (SPARK-14008) Cleanup/Extend the Vectorized Parquet Reader

2016-03-18 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-14008: -- Summary: Cleanup/Extend the Vectorized Parquet Reader Key: SPARK-14008 URL: https://issues.apache.org/jira/browse/SPARK-14008 Project: Spark Issue Type:

[jira] [Created] (SPARK-13966) Regression using .withColumn() on a parquet

2016-03-18 Thread Federico Ponzi (JIRA)
Federico Ponzi created SPARK-13966: -- Summary: Regression using .withColumn() on a parquet Key: SPARK-13966 URL: https://issues.apache.org/jira/browse/SPARK-13966 Project: Spark Issue Type:

[jira] [Commented] (SPARK-13579) Stop building assemblies for Spark

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200437#comment-15200437 ] Apache Spark commented on SPARK-13579: -- User 'vanzin' has created a pull request for this issue:

[jira] [Updated] (SPARK-13613) Provide ignored tests to export test dataset into CSV format

2016-03-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-13613: -- Assignee: Yanbo Liang > Provide ignored tests to export test dataset into CSV format >

[jira] [Commented] (SPARK-13935) Other clients' connection hang up when someone do huge load

2016-03-18 Thread Tao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197554#comment-15197554 ] Tao Wang commented on SPARK-13935: -- [~marmbrus][~liancheng][~chenghao] > Other clients' connection

[jira] [Updated] (SPARK-13958) Executor OOM due to unbounded growth of pointer array in Sorter

2016-03-18 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sital Kedia updated SPARK-13958: Component/s: Spark Core Shuffle > Executor OOM due to unbounded growth of pointer

[jira] [Assigned] (SPARK-14004) AttributeReference and Alias should only use their first qualifier to build SQL representations

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14004: Assignee: Apache Spark (was: Cheng Lian) > AttributeReference and Alias should only use

[jira] [Commented] (SPARK-13815) Provide better Exception messages in Pipeline load methods

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198261#comment-15198261 ] Apache Spark commented on SPARK-13815: -- User 'joan38' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-13983) HiveThriftServer2 can not get "--hiveconf" or "--hivevar" variables since 1.6 version (both multi-session and single session)

2016-03-18 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200661#comment-15200661 ] Teng Qiu edited comment on SPARK-13983 at 3/17/16 11:58 PM: now i can confirm

[jira] [Resolved] (SPARK-13922) Filter rows with null attributes in parquet vectorized reader

2016-03-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-13922. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11749

[jira] [Updated] (SPARK-13761) Deprecate validateParams

2016-03-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13761: -- Assignee: yuhao yang > Deprecate validateParams > > >

[jira] [Assigned] (SPARK-13985) WAL for determistic batches with IDs

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13985: Assignee: Michael Armbrust (was: Apache Spark) > WAL for determistic batches with IDs >

[jira] [Resolved] (SPARK-14012) Extract VectorizedColumnReader from VectorizedParquetRecordReader

2016-03-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-14012. - Resolution: Fixed Assignee: Sameer Agarwal Fix Version/s: 2.0.0 > Extract

[jira] [Commented] (SPARK-13963) Add binary toggle Param to ml.HashingTF

2016-03-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1520#comment-1520 ] Bryan Cutler commented on SPARK-13963: -- Hi [~mlnick], mind if I work on this? > Add binary toggle

[jira] [Updated] (SPARK-13945) Enable native view flag by default

2016-03-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-13945: - Target Version/s: 2.0.0 > Enable native view flag by default >

[jira] [Commented] (SPARK-13874) Move docs of streaming-flume, streaming-mqtt, streaming-zeromq, streaming-akka, streaming-twitter to Spark packages

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201818#comment-15201818 ] Apache Spark commented on SPARK-13874: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Created] (SPARK-13945) Enable native view flag by default

2016-03-18 Thread Yin Huai (JIRA)
Yin Huai created SPARK-13945: Summary: Enable native view flag by default Key: SPARK-13945 URL: https://issues.apache.org/jira/browse/SPARK-13945 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-18 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197942#comment-15197942 ] Xiao Li commented on SPARK-13865: - I will take this. Thanks! > TPCDS query 87 returns wrong results

[jira] [Created] (SPARK-13992) Add support for off-heap caching

2016-03-18 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-13992: -- Summary: Add support for off-heap caching Key: SPARK-13992 URL: https://issues.apache.org/jira/browse/SPARK-13992 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-13900) Spark SQL queries with OR condition is not optimized properly

2016-03-18 Thread Ashok kumar Rajendran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198705#comment-15198705 ] Ashok kumar Rajendran commented on SPARK-13900: --- Yes, I observed that too. My question is

[jira] [Updated] (SPARK-13994) Investigate types that are not supported by vectorized parquet record reader

2016-03-18 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-13994: --- Issue Type: Sub-task (was: Improvement) Parent: SPARK-14008 > Investigate types

[jira] [Assigned] (SPARK-13934) SqlParser.parseTableIdentifier cannot recognize table name start with scientific notation

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13934: Assignee: (was: Apache Spark) > SqlParser.parseTableIdentifier cannot recognize table

[jira] [Commented] (SPARK-13289) Word2Vec generate infinite distances when numIterations>5

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200958#comment-15200958 ] Apache Spark commented on SPARK-13289: -- User 'flyjy' has created a pull request for this issue:

[jira] [Resolved] (SPARK-13862) TPCDS query 49 returns wrong results compared to TPC official result set

2016-03-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-13862. -- Resolution: Duplicate > TPCDS query 49 returns wrong results compared to TPC official result set >

[jira] [Commented] (SPARK-13954) spar-shell starts with exceptions

2016-03-18 Thread Pranas Baliuka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198689#comment-15198689 ] Pranas Baliuka commented on SPARK-13954: Exit from shell is also bugged: {code} scala> exit :26:

[jira] [Commented] (SPARK-13826) Revise ScalaDoc of the new Dataset API

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198010#comment-15198010 ] Apache Spark commented on SPARK-13826: -- User 'liancheng' has created a pull request for this issue:

[jira] [Resolved] (SPARK-13864) TPCDS query 74 returns wrong results compared to TPC official result set

2016-03-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-13864. -- Resolution: Duplicate > TPCDS query 74 returns wrong results compared to TPC official result set >

[jira] [Updated] (SPARK-13819) using a regexp_replace in a group by clause raises a nullpointerexception

2016-03-18 Thread Javier Pérez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Javier Pérez updated SPARK-13819: - Summary: using a regexp_replace in a group by clause raises a nullpointerexception (was: using

[jira] [Comment Edited] (SPARK-13832) TPC-DS Query 36 fails with Parser error

2016-03-18 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202412#comment-15202412 ] Xin Wu edited comment on SPARK-13832 at 3/19/16 12:29 AM: -- Jesse, you are

[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-18 Thread Mark Grover (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200079#comment-15200079 ] Mark Grover commented on SPARK-13877: - I am guessing this needs to be done before Spark 2.0 code

[jira] [Commented] (SPARK-13983) HiveThriftServer2 can not get "--hiveconf" or "--hivevar" variables since 1.6 version (both multi-session and single session)

2016-03-18 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200366#comment-15200366 ] Teng Qiu commented on SPARK-13983: -- also tried start the server in single session mode:

[jira] [Created] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-18 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-13937: Summary: PySpark ML JavaWrapper, variable _java_obj should not be static Key: SPARK-13937 URL: https://issues.apache.org/jira/browse/SPARK-13937 Project: Spark

[jira] [Updated] (SPARK-13968) Use MurmurHash3 for hashing String features

2016-03-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13968: --- Assignee: Yanbo Liang > Use MurmurHash3 for hashing String features >

[jira] [Created] (SPARK-13954) spar-shell starts with exceptions

2016-03-18 Thread Pranas Baliuka (JIRA)
Pranas Baliuka created SPARK-13954: -- Summary: spar-shell starts with exceptions Key: SPARK-13954 URL: https://issues.apache.org/jira/browse/SPARK-13954 Project: Spark Issue Type: Bug

[jira] [Closed] (SPARK-13968) Use MurmurHash3 for hashing String features

2016-03-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-13968. - Resolution: Duplicate > Use MurmurHash3 for hashing String features >

[jira] [Commented] (SPARK-13948) MiMa Check should catch if the visibility change to `private`

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198467#comment-15198467 ] Apache Spark commented on SPARK-13948: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Commented] (SPARK-13908) Limit not pushed down

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201190#comment-15201190 ] Apache Spark commented on SPARK-13908: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-13942) Remove Shark-related docs and visibility for 2.x

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13942: Assignee: (was: Apache Spark) > Remove Shark-related docs and visibility for 2.x >

[jira] [Resolved] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-13937. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11767

[jira] [Updated] (SPARK-13869) Remove redundant conditions while combining filters

2016-03-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-13869: - Assignee: Sameer Agarwal > Remove redundant conditions while combining filters >

[jira] [Updated] (SPARK-13960) HTTP-based JAR Server doesn't respect spark.driver.host and there is no "spark.fileserver.host" option

2016-03-18 Thread Ilya Ostrovskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Ostrovskiy updated SPARK-13960: Description: There is no option to specify which hostname/IP address the jar/file server

[jira] [Updated] (SPARK-13905) Change signature of as.data.frame() to be consistent with the R base package

2016-03-18 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sun Rui updated SPARK-13905: Description: change the signature of as.data.frame() to be consistent with that in the R base package to

[jira] [Created] (SPARK-13989) Remove non-vectorized/unsafe-row parquet record reader

2016-03-18 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-13989: -- Summary: Remove non-vectorized/unsafe-row parquet record reader Key: SPARK-13989 URL: https://issues.apache.org/jira/browse/SPARK-13989 Project: Spark

[jira] [Assigned] (SPARK-13976) do not remove sub-queries added by user when generate SQL

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13976: Assignee: (was: Apache Spark) > do not remove sub-queries added by user when generate

[jira] [Assigned] (SPARK-13981) Improve Filter generated code to defer variable evaluation within operator

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13981: Assignee: (was: Apache Spark) > Improve Filter generated code to defer variable

[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199685#comment-15199685 ] Sean Owen commented on SPARK-13975: --- I'm not suggesting the user should configure something, but that

[jira] [Assigned] (SPARK-13774) IllegalArgumentException: Can not create a Path from an empty string for incorrect file path

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13774: Assignee: (was: Apache Spark) > IllegalArgumentException: Can not create a Path from

[jira] [Commented] (SPARK-12072) python dataframe ._jdf.schema().json() breaks on large metadata dataframes

2016-03-18 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200758#comment-15200758 ] holdenk commented on SPARK-12072: - So coming back to this after chatting with [~mrares] I've got

[jira] [Commented] (SPARK-14006) Builds of 1.6 branch fail R style check

2016-03-18 Thread Rekha Joshi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202004#comment-15202004 ] Rekha Joshi commented on SPARK-14006: - pushing a pull request in few mins.thanks! > Builds of 1.6

[jira] [Resolved] (SPARK-13826) Revise ScalaDoc of the new Dataset API

2016-03-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13826. - Resolution: Fixed Fix Version/s: 2.0.0 > Revise ScalaDoc of the new Dataset API >

[jira] [Resolved] (SPARK-13976) do not remove sub-queries added by user when generate SQL

2016-03-18 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-13976. - Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11786

[jira] [Commented] (SPARK-10680) Flaky test: network.RequestTimeoutIntegrationSuite.timeoutInactiveRequests

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202279#comment-15202279 ] Apache Spark commented on SPARK-10680: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-10680) Flaky test: network.RequestTimeoutIntegrationSuite.timeoutInactiveRequests

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10680: Assignee: Apache Spark (was: Josh Rosen) > Flaky test:

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-03-18 Thread Mike Sukmanowsky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201763#comment-15201763 ] Mike Sukmanowsky commented on SPARK-13587: -- Sorry to bug [~juliet] - any thoughts? We're

[jira] [Comment Edited] (SPARK-12148) SparkR: rename DataFrame to SparkDataFrame

2016-03-18 Thread Anupama Joshi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202376#comment-15202376 ] Anupama Joshi edited comment on SPARK-12148 at 3/18/16 11:48 PM: - Run the

[jira] [Updated] (SPARK-13041) Add a driver ui link and a mesos sandbox link on the dispatcher's ui page for each driver

2016-03-18 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-13041: Description: It would be convenient to have the uri of the driver's ui and the

[jira] [Commented] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-03-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201803#comment-15201803 ] Bryan Cutler commented on SPARK-13967: -- Sure, I'd like to do this - thanks! > Add binary toggle

[jira] [Assigned] (SPARK-14001) support multi-children Union in SQLBuilder

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14001: Assignee: (was: Apache Spark) > support multi-children Union in SQLBuilder >

[jira] [Assigned] (SPARK-13963) Add binary toggle Param to ml.HashingTF

2016-03-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13963: Assignee: Apache Spark (was: Bryan Cutler) > Add binary toggle Param to ml.HashingTF >

[jira] [Commented] (SPARK-13832) TPC-DS Query 36 fails with Parser error

2016-03-18 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202264#comment-15202264 ] Xin Wu commented on SPARK-13832: what i meant is that in Spark 2.0, it seems that "grouping__id" is

[jira] [Comment Edited] (SPARK-13955) Spark in yarn mode fails

2016-03-18 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202411#comment-15202411 ] Jeff Zhang edited comment on SPARK-13955 at 3/19/16 12:29 AM: -- I can

[jira] [Commented] (SPARK-14006) Builds of 1.6 branch fail R style check

2016-03-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201906#comment-15201906 ] Yin Huai commented on SPARK-14006: -- Thanks! I just came across those builds. This is not blocking

[jira] [Closed] (SPARK-13863) TPCDS query 66 returns wrong results compared to TPC official result set

2016-03-18 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JESSE CHEN closed SPARK-13863. -- Resolution: Workaround fixed schema. > TPCDS query 66 returns wrong results compared to TPC official

[jira] [Commented] (SPARK-11319) PySpark silently accepts null values in non-nullable DataFrame fields.

2016-03-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199967#comment-15199967 ] Yin Huai commented on SPARK-11319: -- Thank you for the doc update. We recently added a check to make sure

[jira] [Created] (SPARK-14012) Extract VectorizedColumnReader from VectorizedParquetRecordReader

2016-03-18 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-14012: -- Summary: Extract VectorizedColumnReader from VectorizedParquetRecordReader Key: SPARK-14012 URL: https://issues.apache.org/jira/browse/SPARK-14012 Project: Spark

[jira] [Commented] (SPARK-13951) PySpark ml.pipeline support export/import - nested Piplines

2016-03-18 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198642#comment-15198642 ] Xusen Yin commented on SPARK-13951: --- I start work on it now. > PySpark ml.pipeline support

[jira] [Updated] (SPARK-13960) JAR/File HTTP Server doesn't respect spark.driver.host and there is no "spark.fileserver.host" option

2016-03-18 Thread Ilya Ostrovskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Ostrovskiy updated SPARK-13960: Summary: JAR/File HTTP Server doesn't respect spark.driver.host and there is no

[jira] [Resolved] (SPARK-13761) Deprecate validateParams

2016-03-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-13761. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11620

[jira] [Updated] (SPARK-8971) Support balanced class labels when splitting train/cross validation sets

2016-03-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-8971: -- Shepherd: Nick Pentreath Target Version/s: (was: ) > Support balanced class

[jira] [Updated] (SPARK-7425) spark.ml Predictor should support other numeric types for label

2016-03-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-7425: -- Shepherd: Nick Pentreath > spark.ml Predictor should support other numeric types for label >

[jira] [Updated] (SPARK-13986) Make `DeveloperApi`-annotated things public

2016-03-18 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-13986: -- Summary: Make `DeveloperApi`-annotated things public (was: Make `DeveloperApi`-annotated
