[jira] [Commented] (SPARK-8128) Schema Merging Broken: Dataframe Fails to Recognize Column in Schema

2016-10-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558088#comment-15558088 ] Hyukjin Kwon commented on SPARK-8128: - I am not 100% sure but I recall I saw similar issue was

[jira] [Closed] (SPARK-17540) SparkR array serde cannot work correctly when array length == 0

2016-10-08 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu closed SPARK-17540. -- Resolution: Won't Fix > SparkR array serde cannot work correctly when array length == 0 >

[jira] [Closed] (SPARK-10427) Spark-sql -f or -e will output some

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10427. --- Resolution: Not A Problem > Spark-sql -f or -e will output some > --- > >

[jira] [Resolved] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-10-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-16903. -- Resolution: Duplicate [~falaki] I am going to mark this as a duplicate because the PR was

[jira] [Closed] (SPARK-7012) Add support for NOT NULL modifier for column definitions on DDLParser

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-7012. -- Resolution: Not A Problem > Add support for NOT NULL modifier for column definitions on DDLParser >

[jira] [Commented] (SPARK-9442) java.lang.ArithmeticException: / by zero when reading Parquet

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558512#comment-15558512 ] Xiao Li commented on SPARK-9442: Is it still a problem in the latest branch? >

[jira] [Commented] (SPARK-17344) Kafka 0.8 support for Structured Streaming

2016-10-08 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558474#comment-15558474 ] Cody Koeninger commented on SPARK-17344: I think this is premature until you have a fully

[jira] [Commented] (SPARK-10703) Physical filter operators should replace the general AND/OR/equality/etc with a special version that treats null as false

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558509#comment-15558509 ] Xiao Li commented on SPARK-10703: - The problem has been resolved, I think. Try the latest branch by

[jira] [Closed] (SPARK-10703) Physical filter operators should replace the general AND/OR/equality/etc with a special version that treats null as false

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10703. --- Resolution: Not A Problem > Physical filter operators should replace the general AND/OR/equality/etc with >

[jira] [Commented] (SPARK-17812) More granular control of starting offsets

2016-10-08 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558506#comment-15558506 ] Cody Koeninger commented on SPARK-17812: So I'm willing to do this work, mostly because I've

[jira] [Commented] (SPARK-11428) Schema Merging Broken for Some Queries

2016-10-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558095#comment-15558095 ] Hyukjin Kwon commented on SPARK-11428: -- How about https://issues.apache.org/jira/browse/SPARK-8128 ?

[jira] [Commented] (SPARK-10044) AnalysisException in resolving reference for sorting with aggregation

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558470#comment-15558470 ] Xiao Li commented on SPARK-10044: - This has been resolved at least in Spark 2.0. Thus, it should not

[jira] [Closed] (SPARK-10044) AnalysisException in resolving reference for sorting with aggregation

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10044. --- Resolution: Not A Problem > AnalysisException in resolving reference for sorting with aggregation >

[jira] [Comment Edited] (SPARK-5511) [SQL] Possible optimisations for predicate pushdowns from Spark SQL to Parquet

2016-10-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557667#comment-15557667 ] Hyukjin Kwon edited comment on SPARK-5511 at 10/8/16 4:52 PM: -- 1. I agree it

[jira] [Commented] (SPARK-10545) HiveMetastoreTypes.toMetastoreType should handle interval type

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558489#comment-15558489 ] Xiao Li commented on SPARK-10545: - Both Hive and Spark do not support INTERVAL as a column data type.

[jira] [Commented] (SPARK-7012) Add support for NOT NULL modifier for column definitions on DDLParser

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558466#comment-15558466 ] Xiao Li commented on SPARK-7012: Since 2.0, we have a native Parser. Thus, this has been resolved. Thanks!

[jira] [Commented] (SPARK-10972) UDFs in SQL joins

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558577#comment-15558577 ] Xiao Li commented on SPARK-10972: - There is a workaround to fix it. You can specify the filter above

[jira] [Closed] (SPARK-7097) Partitioned tables should only consider referred partitions in query during size estimation for checking against autoBroadcastJoinThreshold

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-7097. -- Resolution: Won't Fix > Partitioned tables should only consider referred partitions in query during > size

[jira] [Commented] (SPARK-7097) Partitioned tables should only consider referred partitions in query during size estimation for checking against autoBroadcastJoinThreshold

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558718#comment-15558718 ] Xiao Li commented on SPARK-7097: This will be resolved in the ongoing CBO work. Thus, close it now.

[jira] [Commented] (SPARK-17626) TPC-DS performance improvements using star-schema heuristics

2016-10-08 Thread Ioana Delaney (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558762#comment-15558762 ] Ioana Delaney commented on SPARK-17626: --- [~mikewzh] Thank you. Yes, having informational RI

[jira] [Commented] (SPARK-10101) Spark JDBC writer mapping String to TEXT or VARCHAR

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558901#comment-15558901 ] Xiao Li commented on SPARK-10101: - This has been resolved in the master. If you still hit any bug, please

[jira] [Closed] (SPARK-10101) Spark JDBC writer mapping String to TEXT or VARCHAR

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10101. --- Resolution: Not A Problem > Spark JDBC writer mapping String to TEXT or VARCHAR >

[jira] [Updated] (SPARK-17837) Disaster recovery of offsets from WAL

2016-10-08 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-17837: --- Summary: Disaster recovery of offsets from WAL (was: Disaster recover of offsets from WAL)

[jira] [Commented] (SPARK-17815) Report committed offsets

2016-10-08 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558528#comment-15558528 ] Cody Koeninger commented on SPARK-17815: So if you start committing offsets to kafka, there are

[jira] [Created] (SPARK-17837) Disaster recover of offsets from WAL

2016-10-08 Thread Cody Koeninger (JIRA)
Cody Koeninger created SPARK-17837: -- Summary: Disaster recover of offsets from WAL Key: SPARK-17837 URL: https://issues.apache.org/jira/browse/SPARK-17837 Project: Spark Issue Type:

[jira] [Updated] (SPARK-14212) Add configuration element for --packages option

2016-10-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-14212: Component/s: (was: Spark Shell) (was: Spark Core) Documentation

[jira] [Commented] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558532#comment-15558532 ] Xiao Li commented on SPARK-10804: - In Spark 2.0, we rewrote the whole part, especially the load command

[jira] [Closed] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10804. --- Resolution: Not A Problem > "LOCAL" in LOAD DATA LOCAL INPATH means "remote" >

[jira] [Commented] (SPARK-14212) Add configuration element for --packages option

2016-10-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558540#comment-15558540 ] holdenk commented on SPARK-14212: - So I think this would be a good option to document for Python users,

[jira] [Updated] (SPARK-14212) Add configuration element for --packages option

2016-10-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-14212: Labels: config starter (was: config fun happy pants spark-shell) > Add configuration element for

[jira] [Updated] (SPARK-14212) Add configuration element for --packages option

2016-10-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-14212: Priority: Trivial (was: Major) > Add configuration element for --packages option >

[jira] [Commented] (SPARK-4960) Interceptor pattern in receivers

2016-10-08 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558557#comment-15558557 ] Cody Koeninger commented on SPARK-4960: --- Is this idea pretty much dead at this point? It seems like

[jira] [Updated] (SPARK-10860) Bivariate Statistics: Chi-Squared independence test

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-10860: Assignee: (was: Jihong MA) > Bivariate Statistics: Chi-Squared independence test >

[jira] [Updated] (SPARK-10860) Bivariate Statistics: Chi-Squared independence test

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-10860: Component/s: (was: SQL) > Bivariate Statistics: Chi-Squared independence test >

[jira] [Updated] (SPARK-10646) Bivariate Statistics: Pearson's Chi-Squared goodness of fit test

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-10646: Assignee: (was: Jihong MA) > Bivariate Statistics: Pearson's Chi-Squared goodness of fit test >

[jira] [Comment Edited] (SPARK-10972) UDFs in SQL joins

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558581#comment-15558581 ] Xiao Li edited comment on SPARK-10972 at 10/8/16 7:34 PM: -- Also try to use the

[jira] [Commented] (SPARK-10972) UDFs in SQL joins

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558581#comment-15558581 ] Xiao Li commented on SPARK-10972: - Also try to use the SQL interface? > UDFs in SQL joins >

[jira] [Closed] (SPARK-10933) Spark SQL Joins should have option to fail query when row multiplication is encountered

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10933. --- Resolution: Won't Fix > Spark SQL Joins should have option to fail query when row multiplication is >

[jira] [Commented] (SPARK-10933) Spark SQL Joins should have option to fail query when row multiplication is encountered

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558625#comment-15558625 ] Xiao Li commented on SPARK-10933: - Now, we have a conf `spark.sql.crossJoin.enabled`. Let me close it

[jira] [Commented] (SPARK-10427) Spark-sql -f or -e will output some

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558525#comment-15558525 ] Xiao Li commented on SPARK-10427: - This is not an issue since 2.0. Thanks! > Spark-sql -f or -e will

[jira] [Resolved] (SPARK-6649) DataFrame created through SQLContext.jdbc() failed if columns table must be quoted

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-6649. Resolution: Fixed > DataFrame created through SQLContext.jdbc() failed if columns table must be > quoted >

[jira] [Commented] (SPARK-17147) Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets

2016-10-08 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558572#comment-15558572 ] Cody Koeninger commented on SPARK-17147: I talked with Sean in person about this, and think

[jira] [Resolved] (SPARK-10805) JSON Data Frame does not return correct string lengths

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-10805. - Resolution: Won't Fix > JSON Data Frame does not return correct string lengths >

[jira] [Commented] (SPARK-10805) JSON Data Frame does not return correct string lengths

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558595#comment-15558595 ] Xiao Li commented on SPARK-10805: - It is pretty expensive to find the max length for each field. That

[jira] [Closed] (SPARK-11055) Use mixing hash-based and sort-based aggregation in TungstenAggregationIterator

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-11055. --- Resolution: Duplicate > Use mixing hash-based and sort-based aggregation in > TungstenAggregationIterator >

[jira] [Commented] (SPARK-11055) Use mixing hash-based and sort-based aggregation in TungstenAggregationIterator

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558706#comment-15558706 ] Xiao Li commented on SPARK-11055: - Based on the PR, Davies did similar work. [SPARK-11425]

[jira] [Closed] (SPARK-5818) unable to use "add jar" in hql

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-5818. -- Resolution: Not A Problem This has been supported. Please try the latest branch. Thanks! > unable to use "add

[jira] [Commented] (SPARK-10496) Efficient DataFrame cumulative sum

2016-10-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558724#comment-15558724 ] Reynold Xin commented on SPARK-10496: - I think there are two separate issues here: 1. The API to run

[jira] [Closed] (SPARK-9265) Dataframe.limit joined with another dataframe can be non-deterministic

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-9265. -- Resolution: Not A Problem > Dataframe.limit joined with another dataframe can be non-deterministic >

[jira] [Commented] (SPARK-9265) Dataframe.limit joined with another dataframe can be non-deterministic

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558875#comment-15558875 ] Xiao Li commented on SPARK-9265: This has been resolved since our Optimizer pushes down `Limit` below

[jira] [Closed] (SPARK-14017) dataframe.dtypes -> pyspark.sql.types aliases

2016-10-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-14017. --- Resolution: Won't Fix Thanks for bringing this issue up - I don't think we necessarily want to add these

[jira] [Resolved] (SPARK-3146) Improve the flexibility of Spark Streaming Kafka API to offer user the ability to process message before storing into BM

2016-10-08 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-3146. --- Resolution: Fixed Fix Version/s: 1.3.0 > Improve the flexibility of Spark Streaming

[jira] [Commented] (SPARK-3146) Improve the flexibility of Spark Streaming Kafka API to offer user the ability to process message before storing into BM

2016-10-08 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558550#comment-15558550 ] Cody Koeninger commented on SPARK-3146: --- SPARK-4964 / the direct stream added a messageHandler. >

[jira] [Commented] (SPARK-10794) Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558631#comment-15558631 ] Xiao Li commented on SPARK-10794: - The related parts are changed a lot. Could you retry it? Thanks! >

[jira] [Resolved] (SPARK-11062) Thrift server does not support operationLog

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-11062. - Resolution: Duplicate > Thrift server does not support operationLog >

[jira] [Commented] (SPARK-9205) org.apache.spark.sql.hive.HiveSparkSubmitSuite failing for Scala 2.11

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559052#comment-15559052 ] Xiao Li commented on SPARK-9205: This is not an issue, right? Since this JIRA is stale, let us close it

[jira] [Closed] (SPARK-9205) org.apache.spark.sql.hive.HiveSparkSubmitSuite failing for Scala 2.11

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-9205. -- Resolution: Cannot Reproduce > org.apache.spark.sql.hive.HiveSparkSubmitSuite failing for Scala 2.11 >

[jira] [Resolved] (SPARK-9359) Support IntervalType for Parquet

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-9359. Resolution: Duplicate > Support IntervalType for Parquet > > >

[jira] [Closed] (SPARK-9359) Support IntervalType for Parquet

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-9359. -- Assignee: (was: Liang-Chi Hsieh) > Support IntervalType for Parquet > > >

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559068#comment-15559068 ] Xiao Li commented on SPARK-11087: - Can you retry it using the latest master/2.0.1 branch? Thanks! >

[jira] [Commented] (SPARK-11758) Missing Index column while creating a DataFrame from Pandas

2016-10-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559240#comment-15559240 ] holdenk commented on SPARK-11758: - I believe dropping the index field is intentional (but we should

[jira] [Commented] (SPARK-10501) support UUID as an atomic type

2016-10-08 Thread Russell Spitzer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559271#comment-15559271 ] Russell Spitzer commented on SPARK-10501: - It's not that we need it as a unique identifier. It's

[jira] [Closed] (SPARK-11523) spark_partition_id() considered invalid function

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-11523. --- Resolution: Not A Problem > spark_partition_id() considered invalid function >

[jira] [Commented] (SPARK-11523) spark_partition_id() considered invalid function

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559072#comment-15559072 ] Xiao Li commented on SPARK-11523: - Native views are supported since 2.0. Thus, this JIRA is not needed.

[jira] [Commented] (SPARK-6413) For data source tables, we should provide better output for DESCRIBE FORMATTED

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559074#comment-15559074 ] Xiao Li commented on SPARK-6413: This has been well supported since Spark 2.0. Thus, close it now. Thanks!

[jira] [Commented] (SPARK-10318) Getting issue in spark connectivity with cassandra

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559103#comment-15559103 ] Xiao Li commented on SPARK-10318: - Yeah. Will follow your guideline in the future. Thanks! > Getting

[jira] [Closed] (SPARK-11479) add kmeans example for Dataset

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-11479. --- Resolution: Won't Fix > add kmeans example for Dataset > -- > >

[jira] [Commented] (SPARK-11479) add kmeans example for Dataset

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559114#comment-15559114 ] Xiao Li commented on SPARK-11479: - Based on the PR, we should close it now. Please reopen it if you still

[jira] [Closed] (SPARK-14420) keepLastCheckpoint Param for Python LDA with EM

2016-10-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-14420. --- Resolution: Duplicate > keepLastCheckpoint Param for Python LDA with EM >

[jira] [Updated] (SPARK-14420) keepLastCheckpoint Param for Python LDA with EM

2016-10-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-14420: Fix Version/s: 2.0.0 > keepLastCheckpoint Param for Python LDA with EM >

[jira] [Commented] (SPARK-17626) TPC-DS performance improvements using star-schema heuristics

2016-10-08 Thread Ron Hu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559044#comment-15559044 ] Ron Hu commented on SPARK-17626: In the CBO design spec we posted in

[jira] [Closed] (SPARK-6413) For data source tables, we should provide better output for DESCRIBE FORMATTED

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-6413. -- Resolution: Not A Problem > For data source tables, we should provide better output for DESCRIBE FORMATTED >

[jira] [Commented] (SPARK-7659) Sort by attributes that are not present in the SELECT clause when there is windowfunction analysis error

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559117#comment-15559117 ] Xiao Li commented on SPARK-7659: This should have been fixed in 2.0. Please reopen it if you still hit it.

[jira] [Closed] (SPARK-7659) Sort by attributes that are not present in the SELECT clause when there is windowfunction analysis error

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-7659. -- Resolution: Not A Problem > Sort by attributes that are not present in the SELECT clause when there is >

[jira] [Resolved] (SPARK-8115) Remove TestData

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-8115. Resolution: Later It sounds like this is not being fixed in the short term. Please reopen it if it is

[jira] [Closed] (SPARK-10502) tidy up the exception message text to be less verbose/"User friendly"

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10502. --- Resolution: Won't Fix > tidy up the exception message text to be less verbose/"User friendly" >

[jira] [Closed] (SPARK-10318) Getting issue in spark connectivity with cassandra

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10318. --- Resolution: Fixed > Getting issue in spark connectivity with cassandra >

[jira] [Updated] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-08 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17835: Description: SPARK-14077 copied the {{NaiveBayes}} implementation from mllib to ml and left mllib

[jira] [Commented] (SPARK-17820) Spark sqlContext.sql() performs only first insert for HiveQL "FROM target INSERT INTO dest" command to insert into multiple target tables from same source

2016-10-08 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557433#comment-15557433 ] Jiang Xingbo commented on SPARK-17820: -- Looks like we could support this by expanding

[jira] [Commented] (SPARK-17820) Spark sqlContext.sql() performs only first insert for HiveQL "FROM target INSERT INTO dest" command to insert into multiple target tables from same source

2016-10-08 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557458#comment-15557458 ] Herman van Hovell commented on SPARK-17820: --- Yeah, sure we should take a look at this. Just

[jira] [Reopened] (SPARK-10318) Getting issue in spark connectivity with cassandra

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-10318: --- [~smilegator] shouldn't we resolve this as a duplicate of the main, fixed issue rather than the other

[jira] [Resolved] (SPARK-10318) Getting issue in spark connectivity with cassandra

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10318. --- Resolution: Duplicate > Getting issue in spark connectivity with cassandra >

[jira] [Commented] (SPARK-8377) Identifiers caseness information should be available at any time

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557542#comment-15557542 ] Sean Owen commented on SPARK-8377: -- This sounds more like "Not a Problem" if the resolution wasn't a

[jira] [Updated] (SPARK-1792) Missing Spark-Shell Configure Options

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-1792: - Fix Version/s: 1.1.0 > Missing Spark-Shell Configure Options > - > >

[jira] [Updated] (SPARK-17836) Use cross validation to determine the number of clusters for EM or KMeans algorithms

2016-10-08 Thread Lei Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei Wang updated SPARK-17836: - Issue Type: New Feature (was: Bug) > Use cross validation to determine the number of clusters for EM or

[jira] [Resolved] (SPARK-9685) "Unsupported dataType: char(X)" in Hive

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-9685. -- Resolution: Duplicate > "Unsupported dataType: char(X)" in Hive >

[jira] [Commented] (SPARK-8842) Spark SQL - Insert into table Issue

2016-10-08 Thread James Greenwood (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557584#comment-15557584 ] James Greenwood commented on SPARK-8842: No, does not work > Spark SQL - Insert into table Issue >

[jira] [Reopened] (SPARK-9685) "Unsupported dataType: char(X)" in Hive

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-9685: -- > "Unsupported dataType: char(X)" in Hive > --- > >

[jira] [Updated] (SPARK-15989) PySpark SQL python-only UDTs don't support nested types

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15989: -- Assignee: Liang-Chi Hsieh > PySpark SQL python-only UDTs don't support nested types >

[jira] [Updated] (SPARK-16186) Support partition batch pruning with `IN` predicate in InMemoryTableScanExec

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16186: -- Assignee: Dongjoon Hyun > Support partition batch pruning with `IN` predicate in InMemoryTableScanExec

[jira] [Updated] (SPARK-15487) Spark Master UI to reverse proxy Application and Workers UI

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15487: -- Assignee: Gurvinder > Spark Master UI to reverse proxy Application and Workers UI >

[jira] [Updated] (SPARK-16804) Correlated subqueries containing non-deterministic operators return incorrect results

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16804: -- Assignee: Nattavut Sutyanyong > Correlated subqueries containing non-deterministic operators return

[jira] [Updated] (SPARK-16596) Refactor DataSourceScanExec to do partition discovery at execution instead of planning time

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16596: -- Assignee: Eric Liang > Refactor DataSourceScanExec to do partition discovery at execution instead of

[jira] [Updated] (SPARK-16525) Enable Row Based HashMap in HashAggregateExec

2016-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16525: -- Assignee: Qifan Pu > Enable Row Based HashMap in HashAggregateExec >

[jira] [Comment Edited] (SPARK-10221) RowReaderFactory does not work with blobs

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557326#comment-15557326 ] Xiao Li edited comment on SPARK-10221 at 10/8/16 6:09 AM: -- This should be a bug

[jira] [Closed] (SPARK-10221) RowReaderFactory does not work with blobs

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10221. --- Resolution: Won't Fix > RowReaderFactory does not work with blobs >

[jira] [Commented] (SPARK-10502) tidy up the exception message text to be less verbose/"User friendly"

2016-10-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557353#comment-15557353 ] Xiao Li commented on SPARK-10502: - In 2.0, we introduced a new Parser. Thus, this becomes invalid. >

[jira] [Updated] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-08 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17835: Description: SPARK-14077 copied the {{NaiveBayes}} implementation from mllib to ml and left mllib

[jira] [Updated] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-08 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17835: Description: SPARK-14077 copied the {{NaiveBayes}} implementation from mllib to ml and left ml as

[jira] [Updated] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-08 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17835: Issue Type: Improvement (was: Bug) > Optimize NaiveBayes mllib wrapper to eliminate extra pass on
