[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200850#comment-15200850 ] JESSE CHEN commented on SPARK-13865: Hive, big sql, db2 queries are all generated off corresponding

[jira] [Commented] (SPARK-11293) Spillable collections leak shuffle memory

2016-03-19 Thread Nezih Yigitbasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200543#comment-15200543 ] Nezih Yigitbasi commented on SPARK-11293: - [~joshrosen] any plans to fix this? I believe we are

[jira] [Created] (SPARK-13983) HiveThriftServer2 can not get "--hiveconf" or "--hivevar" variables since 1.6 version (both multi-session and single session)

2016-03-19 Thread Teng Qiu (JIRA)
Teng Qiu created SPARK-13983: Summary: HiveThriftServer2 can not get "--hiveconf" or "--hivevar" variables since 1.6 version (both multi-session and single session) Key: SPARK-13983 URL:

[jira] [Commented] (SPARK-13987) Build fails due to scala version mismatch between

2016-03-19 Thread Jean-Baptiste Onofré (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200505#comment-15200505 ] Jean-Baptiste Onofré commented on SPARK-13987: -- I'm trying to use build/mvn now. > Build

[jira] [Created] (SPARK-13953) Support for specifying the field name for corrupted record at JSON datasource.

2016-03-19 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-13953: Summary: Support for specifying the field name for corrupted record at JSON datasource. Key: SPARK-13953 URL: https://issues.apache.org/jira/browse/SPARK-13953

[jira] [Commented] (SPARK-13955) Spark in yarn mode fails

2016-03-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201840#comment-15201840 ] Marcelo Vanzin commented on SPARK-13955: Actually, this was an oversight. I need to change the

[jira] [Updated] (SPARK-13983) HiveThriftServer2 can not get "--hiveconf" or "--hivevar" variables since 1.6 version (both multi-session and single session)

2016-03-19 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teng Qiu updated SPARK-13983: - Environment: ubuntu, scala 2.10.5, hadoop 2.6 spark 1.6.0 standalone, spark 1.6.1 standalone (tried

[jira] [Updated] (SPARK-13953) Support for specifying the field name for corrupted record at JSON datasource.

2016-03-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-13953: - Description: It would be great if we maybe set {{spark.sql.columnNameOfCorruptRecord}} via

[jira] [Resolved] (SPARK-13927) Add row/column iterator to local matrices

2016-03-19 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai resolved SPARK-13927. - Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11757

[jira] [Created] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-03-19 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-13944: - Summary: Separate out local linear algebra as a standalone module without Spark dependency Key: SPARK-13944 URL: https://issues.apache.org/jira/browse/SPARK-13944

[jira] [Assigned] (SPARK-13908) Limit not pushed down

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13908: Assignee: (was: Apache Spark) > Limit not pushed down > - > >

[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200249#comment-15200249 ] Reynold Xin commented on SPARK-13877: - Seems really high overhead. Might as well just keep it in that

[jira] [Updated] (SPARK-12789) Support order by ordinal in SQL

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12789: Summary: Support order by ordinal in SQL (was: Support order by position in SQL) > Support order

[jira] [Commented] (SPARK-13363) Aggregator not working with DataFrame

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200094#comment-15200094 ] Reynold Xin commented on SPARK-13363: - [~maropu] if you have time to bring your patch up to date,

[jira] [Commented] (SPARK-13977) Bring back ShuffledHashJoin

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1529#comment-1529 ] Apache Spark commented on SPARK-13977: -- User 'davies' has created a pull request for this issue:

[jira] [Created] (SPARK-13996) Add more not null attributes for Filter codegen

2016-03-19 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-13996: --- Summary: Add more not null attributes for Filter codegen Key: SPARK-13996 URL: https://issues.apache.org/jira/browse/SPARK-13996 Project: Spark Issue

[jira] [Commented] (SPARK-13886) ArrayType of BinaryType not supported in Row.equals method

2016-03-19 Thread Rishabh Bhardwaj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198755#comment-15198755 ] Rishabh Bhardwaj commented on SPARK-13886: -- For q2: [~mahmoud.hanafy] You can use List instead

[jira] [Commented] (SPARK-13862) TPCDS query 49 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198590#comment-15198590 ] Xiao Li commented on SPARK-13862: - This one is simple. : ) Spark SQL 1.6 does not support "order by the

[jira] [Commented] (SPARK-13934) SqlParser.parseTableIdentifier cannot recognize table name start with scientific notation

2016-03-19 Thread Yang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197570#comment-15197570 ] Yang Wang commented on SPARK-13934: --- I have checked, backticked identifiers don't have this problem.

[jira] [Updated] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13975: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) It's probably best to lay down

[jira] [Comment Edited] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200850#comment-15200850 ] JESSE CHEN edited comment on SPARK-13865 at 3/18/16 2:24 AM: - Hive, big sql,

[jira] [Comment Edited] (SPARK-13886) ArrayType of BinaryType not supported in Row.equals method

2016-03-19 Thread MahmoudHanafy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198763#comment-15198763 ] MahmoudHanafy edited comment on SPARK-13886 at 3/17/16 5:24 AM: I think

[jira] [Updated] (SPARK-14004) AttributeReference and Alias should only use their first qualifier to build SQL representations

2016-03-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-14004: --- Priority: Minor (was: Major) > AttributeReference and Alias should only use their first qualifier

[jira] [Commented] (SPARK-13982) SparkR - KMeans predict: Output column name of features is an unclear, automatically generated text

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200293#comment-15200293 ] Apache Spark commented on SPARK-13982: -- User 'NarineK' has created a pull request for this issue:

[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200253#comment-15200253 ] Reynold Xin commented on SPARK-13877: - "Overhead". Tools are much better outside the ASF for

[jira] [Commented] (SPARK-7481) Add Hadoop 2.6+ profile to pull in object store FS accessors

2016-03-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197451#comment-15197451 ] Nicholas Chammas commented on SPARK-7481: - (Sorry Steve; can't comment on your proposal since I

[jira] [Assigned] (SPARK-13958) Executor OOM due to unbounded growth of pointer array in Sorter

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13958: Assignee: Apache Spark > Executor OOM due to unbounded growth of pointer array in Sorter

[jira] [Created] (SPARK-13985) WAL for deterministic batches with IDs

2016-03-19 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-13985: Summary: WAL for deterministic batches with IDs Key: SPARK-13985 URL: https://issues.apache.org/jira/browse/SPARK-13985 Project: Spark Issue Type:

[jira] [Commented] (SPARK-13860) TPCDS query 39 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198670#comment-15198670 ] Xiao Li commented on SPARK-13860: - All the missing rows have the same pattern. NaN. > TPCDS query 39

[jira] [Resolved] (SPARK-13871) Add support for inferring filters from data constraints

2016-03-19 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-13871. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11665

[jira] [Updated] (SPARK-13982) SparkR - KMeans predict: Output column name of features is an unclear, automatically generated text

2016-03-19 Thread Narine Kokhlikyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Narine Kokhlikyan updated SPARK-13982: -- Summary: SparkR - KMeans predict: Output column name of features is an unclear,

[jira] [Commented] (SPARK-13859) TPCDS query 38 returns wrong results compared to TPC official result set

2016-03-19 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200177#comment-15200177 ] JESSE CHEN commented on SPARK-13859: Tested both q87 and q38 on the lab's cluster. With this

[jira] [Commented] (SPARK-13823) Always specify Charset in String <-> byte[] conversions (and remaining Coverity items)

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197350#comment-15197350 ] Apache Spark commented on SPARK-13823: -- User 'srowen' has created a pull request for this issue:

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200885#comment-15200885 ] Xiao Li commented on SPARK-13865: - Yeah, I will deliver a PR for improving Intersect in the next few

[jira] [Assigned] (SPARK-13986) Make `DeveloperApi`-annotated things public

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13986: Assignee: Apache Spark > Make `DeveloperApi`-annotated things public >

[jira] [Commented] (SPARK-13821) TPC-DS Query 20 fails to compile

2016-03-19 Thread Roy Cecil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201707#comment-15201707 ] Roy Cecil commented on SPARK-13821: --- Dilip, we can close this. Likely a defect in the kit . > TPC-DS

[jira] [Assigned] (SPARK-13948) MiMa Check should catch if the visibility change to `private`

2016-03-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-13948: -- Assignee: Josh Rosen > MiMa Check should catch if the visibility change to `private` >

[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Leonid Poliakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199723#comment-15199723 ] Leonid Poliakov commented on SPARK-13975: - Do you know any other clean way to load bundled libs

[jira] [Updated] (SPARK-7425) spark.ml Predictor should support other numeric types for label

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-7425: -- Assignee: Benjamin Fradet > spark.ml Predictor should support other numeric types for label >

[jira] [Commented] (SPARK-5594) SparkException: Failed to get broadcast (TorrentBroadcast)

2016-03-19 Thread Hiten Patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198556#comment-15198556 ] Hiten Patel commented on SPARK-5594: Yes, this was indeed the problem in my case too. I had a custom

[jira] [Assigned] (SPARK-13948) MiMa Check should catch if the visibility change to `private`

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13948: Assignee: Apache Spark > MiMa Check should catch if the visibility change to

[jira] [Commented] (SPARK-13933) hadoop-2.7 profile's curator version should be 2.7.1

2016-03-19 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197206#comment-15197206 ] Steve Loughran commented on SPARK-13933: No custom guava built. Hadoop dealt with the problem by

[jira] [Assigned] (SPARK-13948) MiMa Check should catch if the visibility change to `private`

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13948: Assignee: (was: Apache Spark) > MiMa Check should catch if the visibility

[jira] [Updated] (SPARK-13963) Add binary toggle Param to ml.HashingTF

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13963: --- Description: It would be handy to add a binary toggle Param to {{HashingTF}}, as in the

[jira] [Commented] (SPARK-13957) Support group by ordinal in SQL

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198839#comment-15198839 ] Xiao Li commented on SPARK-13957: - Sure, will do it. Thanks! > Support group by ordinal in SQL >

[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200088#comment-15200088 ] Reynold Xin commented on SPARK-13877: - I'm not 100% sure it is a good idea to move it out (it might

[jira] [Created] (SPARK-14000) case class with a tuple field can't work in Dataset

2016-03-19 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-14000: --- Summary: case class with a tuple field can't work in Dataset Key: SPARK-14000 URL: https://issues.apache.org/jira/browse/SPARK-14000 Project: Spark Issue

[jira] [Updated] (SPARK-13935) Other clients' connection hang up when someone do huge load

2016-03-19 Thread Tao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Wang updated SPARK-13935: - Description: We run a sql like "insert overwrite table store_returns partition (sr_returned_date)

[jira] [Commented] (SPARK-13866) Handle decimal type in CSV inference

2016-03-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197595#comment-15197595 ] Ruslan Dautkhanov commented on SPARK-13866: --- Would be great to have this fix in. It makes

[jira] [Created] (SPARK-13952) spark.ml GBT algs need to use random seed

2016-03-19 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-13952: - Summary: spark.ml GBT algs need to use random seed Key: SPARK-13952 URL: https://issues.apache.org/jira/browse/SPARK-13952 Project: Spark Issue

[jira] [Updated] (SPARK-13986) Make `DeveloperApi`-annotated class/object public

2016-03-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-13986: -- Description: Spark uses `@DeveloperApi` annotation, but sometimes it seems to conflict with

[jira] [Commented] (SPARK-799) Windows versions of the deploy scripts

2016-03-19 Thread Joan Goyeau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197313#comment-15197313 ] Joan Goyeau commented on SPARK-799: --- Make sense > Windows versions of the deploy scripts >

[jira] [Updated] (SPARK-13041) Add a driver history ui link and a mesos sandbox link on the dispatcher's ui page for each driver

2016-03-19 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-13041: Summary: Add a driver history ui link and a mesos sandbox link on the dispatcher's

[jira] [Commented] (SPARK-13995) Constraints should take care of Cast

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200903#comment-15200903 ] Apache Spark commented on SPARK-13995: -- User 'viirya' has created a pull request for this issue:

[jira] [Updated] (SPARK-13961) spark.ml ChiSqSelector and RFormula should support other numeric types for label

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13961: --- Summary: spark.ml ChiSqSelector and RFormula should support other numeric types for label

[jira] [Assigned] (SPARK-13994) Investigate types that are not supported by vectorized parquet record reader

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13994: Assignee: Apache Spark > Investigate types that are not supported by vectorized parquet

[jira] [Commented] (SPARK-13831) TPC-DS Query 35 fails with the following compile error

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198827#comment-15198827 ] Davies Liu commented on SPARK-13831: Yes, Exists is not supported in Spark SQL yet. > TPC-DS Query

[jira] [Created] (SPARK-13959) Audit MiMa excludes added in SPARK-13948 to make sure none are unintended incompatibilities

2016-03-19 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-13959: -- Summary: Audit MiMa excludes added in SPARK-13948 to make sure none are unintended incompatibilities Key: SPARK-13959 URL: https://issues.apache.org/jira/browse/SPARK-13959

[jira] [Commented] (SPARK-13289) Word2Vec generate infinite distances when numIterations>5

2016-03-19 Thread Junyang Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200962#comment-15200962 ] Junyang Shen commented on SPARK-13289: -- This PR gives the distance values between 0 and 1. scala>

[jira] [Commented] (SPARK-13860) TPCDS query 39 returns wrong results compared to TPC official result set

2016-03-19 Thread Suresh Thalamati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200330#comment-15200330 ] Suresh Thalamati commented on SPARK-13860: -- I looked into this issue and found out NULL values

[jira] [Created] (SPARK-14014) Replace existing analysis.Catalog with SessionCatalog

2016-03-19 Thread Andrew Or (JIRA)
Andrew Or created SPARK-14014: - Summary: Replace existing analysis.Catalog with SessionCatalog Key: SPARK-14014 URL: https://issues.apache.org/jira/browse/SPARK-14014 Project: Spark Issue Type:

[jira] [Created] (SPARK-13960) HTTP-based JAR Server doesn't respect spark.driver.host and there is no "spark.fileserver.host" option

2016-03-19 Thread Ilya Ostrovskiy (JIRA)
Ilya Ostrovskiy created SPARK-13960: --- Summary: HTTP-based JAR Server doesn't respect spark.driver.host and there is no "spark.fileserver.host" option Key: SPARK-13960 URL:

[jira] [Comment Edited] (SPARK-12148) SparkR: rename DataFrame to SparkDataFrame

2016-03-19 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200940#comment-15200940 ] Sun Rui edited comment on SPARK-12148 at 3/18/16 3:33 AM: -- An issue reported in

[jira] [Commented] (SPARK-13934) SqlParser.parseTableIdentifier cannot recognize table name start with scientific notation

2016-03-19 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197558#comment-15197558 ] Herman van Hovell commented on SPARK-13934: --- Could you check if backticked identifiers have

[jira] [Commented] (SPARK-13932) CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisException

2016-03-19 Thread Tien-Dung LE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199683#comment-15199683 ] Tien-Dung LE commented on SPARK-13932: -- Hi Xiao, not yet! I only tried with Spark version 1.6.1 and

[jira] [Commented] (SPARK-12789) Support order by ordinal in SQL

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201071#comment-15201071 ] Apache Spark commented on SPARK-12789: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Assigned] (SPARK-13579) Stop building assemblies for Spark

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13579: Assignee: (was: Apache Spark) > Stop building assemblies for Spark >

[jira] [Assigned] (SPARK-913) log the size of each shuffle block in block manager

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-913: -- Assignee: (was: Apache Spark) > log the size of each shuffle block in block manager >

[jira] [Commented] (SPARK-13972) hive tests should fail if SQL generation failed

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199511#comment-15199511 ] Apache Spark commented on SPARK-13972: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Updated] (SPARK-13978) [GSoC 2016] Monitoring UI and infrastructure for Spark SQL and structured streaming

2016-03-19 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-13978: - Labels: GSOC2016 (was: ) > [GSoC 2016] Monitoring UI and infrastructure for Spark SQL and structured >

[jira] [Assigned] (SPARK-14001) support multi-children Union in SQLBuilder

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14001: Assignee: Apache Spark > support multi-children Union in SQLBuilder >

[jira] [Assigned] (SPARK-13980) Incrementally serialize blocks while unrolling them in MemoryStore

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13980: Assignee: Josh Rosen (was: Apache Spark) > Incrementally serialize blocks while

[jira] [Commented] (SPARK-12719) SQL generation support for generators (including UDTF)

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200512#comment-15200512 ] Apache Spark commented on SPARK-12719: -- User 'yy2016' has created a pull request for this issue:

[jira] [Updated] (SPARK-13838) Clear variable code to prevent it to be re-evaluated in BoundAttribute

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13838: -- Assignee: Liang-Chi Hsieh > Clear variable code to prevent it to be re-evaluated in BoundAttribute >

[jira] [Updated] (SPARK-13976) do not remove sub-queries added by user when generate SQL

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13976: -- Assignee: Wenchen Fan > do not remove sub-queries added by user when generate SQL >

[jira] [Updated] (SPARK-13958) Executor OOM due to unbounded growth of pointer array in Sorter

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13958: -- Assignee: Sital Kedia > Executor OOM due to unbounded growth of pointer array in Sorter >

[jira] [Updated] (SPARK-13989) Remove non-vectorized/unsafe-row parquet record reader

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13989: -- Assignee: Sameer Agarwal > Remove non-vectorized/unsafe-row parquet record reader >

[jira] [Updated] (SPARK-13930) Apply fast serialization on collect limit

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13930: -- Assignee: Liang-Chi Hsieh > Apply fast serialization on collect limit >

[jira] [Updated] (SPARK-13427) Support USING clause in JOIN

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13427: -- Assignee: Dilip Biswal > Support USING clause in JOIN > > >

[jira] [Updated] (SPARK-13939) Kafka createDirectStream not parallelizing properly

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13939: -- Component/s: Streaming > Kafka createDirectStream not parallelizing properly >

[jira] [Updated] (SPARK-13992) Add support for off-heap caching

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13992: -- Component/s: Spark Core > Add support for off-heap caching > > >

[jira] [Updated] (SPARK-14000) case class with a tuple field can't work in Dataset

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-14000: -- Component/s: Spark Core > case class with a tuple field can't work in Dataset >

[jira] [Updated] (SPARK-13955) Spark in yarn mode fails

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13955: -- Component/s: YARN > Spark in yarn mode fails > > > Key:

[jira] [Updated] (SPARK-14013) Properly implement temporary functions in SessionCatalog

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-14013: -- Component/s: SQL > Properly implement temporary functions in SessionCatalog >

[jira] [Updated] (SPARK-13981) Improve Filter generated code to defer variable evaluation within operator

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13981: -- Component/s: SQL > Improve Filter generated code to defer variable evaluation within operator >

[jira] [Updated] (SPARK-13999) Run 'group by' before building cube

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13999: -- Component/s: SQL > Run 'group by' before building cube > > >

[jira] [Updated] (SPARK-13966) Regression using .withColumn() on a parquet

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13966: -- Component/s: SQL > Regression using .withColumn() on a parquet >

[jira] [Updated] (SPARK-14014) Replace existing analysis.Catalog with SessionCatalog

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-14014: -- Component/s: SQL > Replace existing analysis.Catalog with SessionCatalog >

[jira] [Commented] (SPARK-13983) HiveThriftServer2 can not get "--hiveconf" or ''--hivevar" variables since 1.6 version (both multi-session and single session)

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200328#comment-15200328 ] Davies Liu commented on SPARK-13983: [~lian cheng] Could you help to look at this one? >

[jira] [Commented] (SPARK-13970) Add Non-Negative Matrix Factorization to MLlib

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199262#comment-15199262 ] Apache Spark commented on SPARK-13970: -- User 'zhengruifeng' has created a pull request for this

[jira] [Resolved] (SPARK-7143) Add BM25 Estimator

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-7143. -- Resolution: Won't Fix > Add BM25 Estimator > -- > > Key: SPARK-7143 >

[jira] [Resolved] (SPARK-13994) Investigate types that are not supported by vectorized parquet record reader

2016-03-19 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal resolved SPARK-13994. Resolution: Done > Investigate types that are not supported by vectorized parquet record

[jira] [Commented] (SPARK-13934) SqlParser.parseTableIdentifier cannot recognize table name start with scientific notation

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197266#comment-15197266 ] Apache Spark commented on SPARK-13934: -- User 'wangyang1992' has created a pull request for this

[jira] [Commented] (SPARK-13968) Use MurmurHash3 for hashing String features

2016-03-19 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199639#comment-15199639 ] Yanbo Liang commented on SPARK-13968: - [~mlnick] Can I work on this? > Use MurmurHash3 for hashing

[jira] [Updated] (SPARK-13968) User MurmurHash3 for hashing String features

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13968: --- Summary: User MurmurHash3 for hashing String features (was: User MurmurHash for feature

[jira] [Created] (SPARK-13965) Driver should kill the other running task attempts if any one task attempt succeeds for the same task

2016-03-19 Thread Devaraj K (JIRA)
Devaraj K created SPARK-13965: - Summary: Driver should kill the other running task attempts if any one task attempt succeeds for the same task Key: SPARK-13965 URL: https://issues.apache.org/jira/browse/SPARK-13965

[jira] [Resolved] (SPARK-13938) word2phrase feature created in ML

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13938. --- Resolution: Won't Fix > word2phrase feature created in ML > - > >

[jira] [Commented] (SPARK-13905) Change implementation of as.data.frame() to avoid conflict with the ones in the R base package

2016-03-19 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200938#comment-15200938 ] Sun Rui commented on SPARK-13905: - This issue is not related to as.data.frame() in SparkR, but seems due

[jira] [Created] (SPARK-13994) Investigate types that are not supported by vectorized parquet record reader

2016-03-19 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-13994: -- Summary: Investigate types that are not supported by vectorized parquet record reader Key: SPARK-13994 URL: https://issues.apache.org/jira/browse/SPARK-13994

[jira] [Commented] (SPARK-13860) TPCDS query 39 returns wrong results compared to TPC official result set

2016-03-19 Thread Suresh Thalamati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200805#comment-15200805 ] Suresh Thalamati commented on SPARK-13860: -- [~yhuai] [~mengxr] I noticed there is discussion on
