[jira] [Commented] (SPARK-12107) Update spark-ec2 versions

2015-12-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038561#comment-15038561 ] Michael Armbrust commented on SPARK-12107: -- Yeah, I was planning to do a bulk update if/when

[jira] [Commented] (SPARK-12083) java.lang.IllegalArgumentException: requirement failed: Overflowed precision (q98)

2015-12-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038829#comment-15038829 ] Michael Armbrust commented on SPARK-12083: -- I mean the first release candidate (RC1): http

Re: SparkSQL API to insert DataFrame into a static partition?

2015-12-02 Thread Michael Armbrust
You might also coalesce to 1 (or some small number) before writing to avoid creating a lot of files in that partition if you know that there is not a ton of data. On Wed, Dec 2, 2015 at 12:59 AM, Rishi Mishra wrote: > As long as all your data is being inserted by Spark,
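
A minimal sketch of the coalesce-before-write suggestion above, assuming the Spark 1.x DataFrame API and a Parquet-backed table. The helper name `writeSmallPartition`, the `partitionDir` argument (for example a directory such as `.../date=2015-12-02`), and the choice of Parquet and append mode are illustrative assumptions, not details taken from the thread.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper: write a small DataFrame into a single partition
// directory while keeping the number of output files low.
def writeSmallPartition(df: DataFrame, partitionDir: String, numFiles: Int = 1): Unit = {
  df.coalesce(numFiles)      // reduce the partition count without a full shuffle
    .write
    .mode(SaveMode.Append)   // assumption: add files to the directory rather than replace it
    .parquet(partitionDir)   // assumption: the table is stored as Parquet
}
```

Each task writes its own file, so the coalesced partition count bounds the number of files created. Coalescing to 1 also means a single task does the write, which is only reasonable when the data is small, as the message notes.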

[VOTE] Release Apache Spark 1.6.0 (RC1)

2015-12-02 Thread Michael Armbrust
Please vote on releasing the following candidate as Apache Spark version 1.6.0! The vote is open until Saturday, December 5, 2015 at 21:00 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.6.0 [ ] -1 Do not release this package

[jira] [Updated] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12000: - Target Version/s: 1.7.0 (was: 1.6.0) > `sbt publishLocal` hits a Scala compiler

Re: When to cut RCs

2015-12-02 Thread Michael Armbrust
> > Sorry for a second email so soon. I meant to also ask, what keeps the cost > of making an RC high? Can we bring it down with better tooling? > There is a lot of tooling: https://amplab.cs.berkeley.edu/jenkins/view/Spark-Packaging/ Still, you have to check JIRA, sync with people who have been

[jira] [Updated] (SPARK-12107) Update spark-ec2 versions

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12107: - Target Version/s: 1.6.0 > Update spark-ec2 versi

[jira] [Commented] (SPARK-12066) spark sql throw java.lang.ArrayIndexOutOfBoundsException when use table.* with join

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036683#comment-15036683 ] Michael Armbrust commented on SPARK-12066: -- Can you reproduce this on 1.6-rc1? > spark

[jira] [Updated] (SPARK-12089) java.lang.NegativeArraySizeException when growing BufferHolder

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12089: - Target Version/s: 1.6.0 > java.lang.NegativeArraySizeException when growing BufferHol

[jira] [Updated] (SPARK-12089) java.lang.NegativeArraySizeException when growing BufferHolder

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12089: - Priority: Blocker (was: Critical) > java.lang.NegativeArraySizeException when grow

[jira] [Updated] (SPARK-12108) Event logs are much bigger in 1.6 than in 1.5

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12108: - Target Version/s: 1.6.0 (was: 1.6.1) > Event logs are much bigger in 1.6 than in

[jira] [Updated] (SPARK-7264) SparkR API for parallel functions

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7264: Target Version/s: (was: 1.6.0) > SparkR API for parallel functi

[jira] [Updated] (SPARK-9697) Project Tungsten (Spark 1.6)

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9697: Target Version/s: (was: 1.6.0) > Project Tungsten (Spark

Re: [VOTE] Release Apache Spark 1.6.0 (RC1)

2015-12-02 Thread Michael Armbrust
I'm going to kick the voting off with a +1 (binding). We ran TPC-DS and most queries are faster than 1.5. We've also ported several production pipelines to 1.6.

[jira] [Commented] (SPARK-12083) java.lang.IllegalArgumentException: requirement failed: Overflowed precision (q98)

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036954#comment-15036954 ] Michael Armbrust commented on SPARK-12083: -- Can you test with 1.6-RC1

[jira] [Updated] (SPARK-12063) Group by Column Number identifier is not successfully parsed

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12063: - Issue Type: New Feature (was: Bug) > Group by Column Number identif

[jira] [Updated] (SPARK-12088) check connection.isClose before connection.getAutoCommit in JDBCRDD.close

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12088: - Target Version/s: 1.6.0 > check connection.isClose before connection.getAutoCom

[jira] [Updated] (SPARK-11868) wrong results returned from dataframe create from Rows without consistent schma on pyspark

2015-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11868: - Target Version/s: (was: 1.6.0) > wrong results returned from dataframe create f

Re: When to cut RCs

2015-12-02 Thread Michael Armbrust
Thanks for bringing this up Sean. I think we are all happy to adopt concrete suggestions to make the release process more transparent, including pinging the list before kicking off the release build. Technically there's still a Blocker bug: > https://issues.apache.org/jira/browse/SPARK-12000

[jira] [Commented] (SPARK-11873) Regression for TPC-DS query 63 when used with decimal datatype and windows function

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034908#comment-15034908 ] Michael Armbrust commented on SPARK-11873: -- We did a lot of performance work in Spark 1.6 (e.g

[jira] [Updated] (SPARK-12061) Persist for Map/filter with Lambda Functions don't always read from Cache

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12061: - Target Version/s: 1.7.0 > Persist for Map/filter with Lambda Functions don't always r

[jira] [Updated] (SPARK-12061) [SQL] Dataset API: Adding Persist for Map/filter with Lambda Functions

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12061: - Issue Type: Bug (was: Improvement) > [SQL] Dataset API: Adding Persist for Map/fil

[jira] [Updated] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12000: - Priority: Blocker (was: Major) > `sbt publishLocal` hits a Scala compiler bug cau

[jira] [Updated] (SPARK-11932) trackStateByKey throws java.lang.IllegalArgumentException: requirement failed on restarting from checkpoint

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11932: - Priority: Critical (was: Blocker) > trackStateByKey thr

[jira] [Resolved] (SPARK-11503) SQL API audit for Spark 1.6

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11503. -- Resolution: Fixed Fix Version/s: 1.6.0 > SQL API audit for Spark

[jira] [Updated] (SPARK-11780) Provide type aliases in org.apache.spark.sql.types for backwards compatibility

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11780: - Assignee: Santiago M. Mola > Provide type aliases in org.apache.spark.sql.ty

[jira] [Updated] (SPARK-11596) SQL execution very slow for nested query plans because of DataFrame.withNewExecutionId

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11596: - Assignee: Yin Huai > SQL execution very slow for nested query plans beca

[jira] [Updated] (SPARK-12061) Persist for Map/filter with Lambda Functions don't always read from Cache

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12061: - Summary: Persist for Map/filter with Lambda Functions don't always read from Cache

[jira] [Updated] (SPARK-11352) codegen.GeneratePredicate fails due to unquoted comment

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11352: - Assignee: Yin Huai > codegen.GeneratePredicate fails due to unquoted comm

[jira] [Updated] (SPARK-12061) Persist for Map/filter with Lambda Functions

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12061: - Summary: Persist for Map/filter with Lambda Functions (was: [SQL] Dataset API: Adding

[jira] [Resolved] (SPARK-11596) SQL execution very slow for nested query plans because of DataFrame.withNewExecutionId

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11596. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 10079

[jira] [Resolved] (SPARK-12046) Visibility and format issues in ScalaDoc/JavaDoc for branch-1.6

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-12046. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 10063

[jira] [Resolved] (SPARK-12068) use a single column in Dataset.groupBy and count will fail

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-12068. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 10059

[jira] [Updated] (SPARK-11856) add type cast if the real type is different but compatible with encoder schema

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11856: - Assignee: Wenchen Fan > add type cast if the real type is different but compati

[jira] [Resolved] (SPARK-11856) add type cast if the real type is different but compatible with encoder schema

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11856. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9840

[jira] [Updated] (SPARK-11780) Provide type aliases in org.apache.spark.sql.types for backwards compatibility

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11780: - Target Version/s: 1.6.0 > Provide type aliases in org.apache.spark.sql.ty

[jira] [Resolved] (SPARK-11954) Encoder for JavaBeans / POJOs

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11954. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9937

[jira] [Resolved] (SPARK-11905) [SQL] Support Persist/Cache and Unpersist in Dataset APIs

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11905. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9889

Re: Getting all files of a table

2015-12-01 Thread Michael Armbrust
sqlContext.table("...").inputFiles (this is best effort, but should work for hive tables). Michael On Tue, Dec 1, 2015 at 10:55 AM, Krzysztof Zarzycki wrote: > Hi there, > Do you know how easily I can get a list of all files of a Hive table? > > What I want to achieve is
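
The one-liner above can be wrapped as follows; this is a sketch assuming the Spark 1.x `SQLContext` API (a `HiveContext` for Hive tables). The helper name `tableFiles` and the `tableName` parameter are hypothetical.

```scala
import org.apache.spark.sql.SQLContext

// Hypothetical helper: list the files backing a table.
// inputFiles is best effort, so the result may be incomplete for some sources.
def tableFiles(sqlContext: SQLContext, tableName: String): Array[String] =
  sqlContext.table(tableName).inputFiles
```

In a Spark shell with Hive support, `tableFiles(sqlContext, "my_table").foreach(println)` would print one path per line, where `my_table` is a placeholder table name.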

[jira] [Updated] (SPARK-8414) Ensure ContextCleaner actually triggers clean ups

2015-12-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8414: Target Version/s: (was: 1.6.0) > Ensure ContextCleaner actually triggers clean

[jira] [Commented] (SPARK-12032) Filter can't be pushed down to correct Join because of bad order of Join

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032403#comment-15032403 ] Michael Armbrust commented on SPARK-12032: -- The standard algorithm for join reordering should

[jira] [Updated] (SPARK-11553) row.getInt(i) if row[i]=null returns 0

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11553: - Labels: releasenotes (was: ) > row.getInt(i) if row[i]=null return

[jira] [Commented] (SPARK-11966) Spark API for UDTFs

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032877#comment-15032877 ] Michael Armbrust commented on SPARK-11966: -- Have you seen [explode|https://github.com/apache

[jira] [Updated] (SPARK-11941) JSON representation of nested StructTypes could be more uniform

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11941: - Issue Type: Improvement (was: Bug) > JSON representation of nested StructTypes co

[jira] [Commented] (SPARK-11941) JSON representation of nested StructTypes could be more uniform

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032899#comment-15032899 ] Michael Armbrust commented on SPARK-11941: -- /cc [~lian cheng] > JSON representation of nes

[jira] [Updated] (SPARK-12032) Filter can't be pushed down to correct Join because of bad order of Join

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12032: - Issue Type: Improvement (was: Bug) > Filter can't be pushed down to correct J

[jira] [Commented] (SPARK-11941) JSON representation of nested StructTypes is incorrect

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032889#comment-15032889 ] Michael Armbrust commented on SPARK-11941: -- While I can appreciate that this might be nicer

[jira] [Updated] (SPARK-11941) JSON representation of nested StructTypes could be more uniform

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11941: - Summary: JSON representation of nested StructTypes could be more uniform (was: JSON

[jira] [Commented] (SPARK-11966) Spark API for UDTFs

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032925#comment-15032925 ] Michael Armbrust commented on SPARK-11966: -- Ah, I was proposing the DataFrame function explode

[jira] [Commented] (SPARK-11941) JSON representation of nested StructTypes could be more uniform

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032946#comment-15032946 ] Michael Armbrust commented on SPARK-11941: -- Sorry, maybe I'm misunderstanding. Can you

[jira] [Commented] (SPARK-11873) Regression for TPC-DS query 63 when used with decimal datatype and windows function

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032921#comment-15032921 ] Michael Armbrust commented on SPARK-11873: -- What about with Spark 1.6? > Regression for TPC

[jira] [Updated] (SPARK-12030) Incorrect results when aggregate joined data

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12030: - Target Version/s: 1.6.0 Priority: Blocker (was: Critical) > Incorr

[jira] [Resolved] (SPARK-12018) Refactor common subexpression elimination code

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-12018. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 10009

[jira] [Updated] (SPARK-11315) Add YARN extension service to publish Spark events to YARN timeline service

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11315: - Target Version/s: (was: 1.6.0) > Add YARN extension service to publish Spark eve

[jira] [Updated] (SPARK-11796) Docker JDBC integration tests fail in Maven build due to dependency issue

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11796: - Component/s: Tests > Docker JDBC integration tests fail in Maven build due to depende

[jira] [Updated] (SPARK-11601) ML 1.6 QA: API: Binary incompatible changes

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11601: - Component/s: Documentation > ML 1.6 QA: API: Binary incompatible chan

[jira] [Updated] (SPARK-11954) Encoder for JavaBeans / POJOs

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11954: - Assignee: Wenchen Fan > Encoder for JavaBeans / PO

[jira] [Updated] (SPARK-12031) Integer overflow when do sampling.

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12031: - Description: In my case, some partitions contain too many items. When do range partition

[jira] [Updated] (SPARK-8966) Design a mechanism to ensure that temporary files created in tasks are cleaned up after failures

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8966: Target Version/s: (was: 1.6.0) > Design a mechanism to ensure that temporary fi

[jira] [Updated] (SPARK-11600) Spark MLlib 1.6 QA umbrella

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11600: - Component/s: Documentation > Spark MLlib 1.6 QA umbre

[jira] [Commented] (SPARK-8414) Ensure ContextCleaner actually triggers clean ups

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033103#comment-15033103 ] Michael Armbrust commented on SPARK-8414: - Still planning to do this for 1.6? > Ens

[jira] [Created] (SPARK-12069) Documentation update for Datasets

2015-11-30 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-12069: Summary: Documentation update for Datasets Key: SPARK-12069 URL: https://issues.apache.org/jira/browse/SPARK-12069 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-11966) Spark API for UDTFs

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11966: - Target Version/s: 1.7.0 > Spark API for UD

[jira] [Updated] (SPARK-7348) DAG visualization: add links to RDD page

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7348: Target Version/s: (was: 1.6.0) > DAG visualization: add links to RDD p

[jira] [Updated] (SPARK-12060) Avoid memory copy in JavaSerializerInstance.serialize

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12060: - Priority: Critical (was: Major) > Avoid memory copy in JavaSerializerInstance.serial

[jira] [Updated] (SPARK-11985) Update Spark Streaming - Kinesis Library Documentation regarding data de-aggregation and message handler

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11985: - Component/s: Documentation > Update Spark Streaming - Kinesis Library Documentat

[jira] [Updated] (SPARK-6518) Add example code and user guide for bisecting k-means

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6518: Component/s: Documentation > Add example code and user guide for bisecting k-me

[jira] [Updated] (SPARK-11603) ML 1.6 QA: API: Experimental, DeveloperApi, final, sealed audit

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11603: - Component/s: Documentation > ML 1.6 QA: API: Experimental, DeveloperApi, final, sea

[jira] [Updated] (SPARK-11607) Update MLlib website for 1.6

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11607: - Component/s: Documentation > Update MLlib website for

[jira] [Updated] (SPARK-6280) Remove Akka systemName from Spark

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6280: Target Version/s: (was: 1.6.0) > Remove Akka systemName from Sp

[jira] [Updated] (SPARK-12031) Integer overflow when do sampling.

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12031: - Priority: Critical (was: Major) > Integer overflow when do sampl

[jira] [Commented] (SPARK-12017) Java Doc Publishing Broken

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033119#comment-15033119 ] Michael Armbrust commented on SPARK-12017: -- Fixed in https://github.com/apache/spark/pull/10049

[jira] [Resolved] (SPARK-12017) Java Doc Publishing Broken

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-12017. -- Resolution: Fixed Assignee: Josh Rosen Fix Version/s: 1.6.0 > Java

[jira] [Updated] (SPARK-12010) Spark JDBC requires support for column-name-free INSERT syntax

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12010: - Target Version/s: (was: 1.6.0) > Spark JDBC requires support for column-name-f

[jira] [Commented] (SPARK-12010) Spark JDBC requires support for column-name-free INSERT syntax

2015-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033124#comment-15033124 ] Michael Armbrust commented on SPARK-12010: -- Thanks for working on this, but we've already hit

[jira] [Resolved] (SPARK-11990) DataFrame recompute UDF in some situation.

2015-11-26 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11990. -- Resolution: Duplicate Fix Version/s: 1.6.0 This is already fixed in Spark 1.6

[jira] [Created] (SPARK-12017) Java Doc Publishing Broken

2015-11-26 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-12017: Summary: Java Doc Publishing Broken Key: SPARK-12017 URL: https://issues.apache.org/jira/browse/SPARK-12017 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-12017) Java Doc Publishing Broken

2015-11-26 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029153#comment-15029153 ] Michael Armbrust commented on SPARK-12017: -- /cc [~joshrosen] > Java Doc Publishing Bro

[jira] [Resolved] (SPARK-11863) Unable to resolve order by if it contains mixture of aliases and real columns.

2015-11-26 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11863. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9961

[jira] [Updated] (SPARK-11942) fix encoder life cycle for CoGroup

2015-11-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11942: - Assignee: Wenchen Fan > fix encoder life cycle for CoGr

[jira] [Resolved] (SPARK-11942) fix encoder life cycle for CoGroup

2015-11-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11942. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9928

[jira] [Commented] (SPARK-9141) DataFrame recomputed instead of using cached parent.

2015-11-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024927#comment-15024927 ] Michael Armbrust commented on SPARK-9141: - [~tianyi] please provide a reproduction of the issue

[jira] [Commented] (SPARK-9328) Netty IO layer should implement read timeouts

2015-11-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025107#comment-15025107 ] Michael Armbrust commented on SPARK-9328: - [~joshrosen] is this actually a 1.6 blocker? > Ne

[jira] [Resolved] (SPARK-11926) unify GetStructField and GetInternalRowField

2015-11-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11926. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9909

[jira] [Resolved] (SPARK-11913) support typed aggregate for complex buffer schema

2015-11-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11913. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9898

[jira] [Resolved] (SPARK-11894) Incorrect results are returned when using null

2015-11-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11894. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9904

[jira] [Resolved] (SPARK-11921) fix `nullable` of encoder schema

2015-11-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11921. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9906

Re: Relation between RDDs, DataFrames and Project Tungsten

2015-11-23 Thread Michael Armbrust
Here is how I view the relationship between the various components of Spark: - *RDDs - *a low level API for expressing DAGs that will be executed in parallel by Spark workers - *Catalyst -* an internal library for expressing trees that we use to build relational algebra and expression

[ANNOUNCE] Spark 1.6.0 Release Preview

2015-11-22 Thread Michael Armbrust
In order to facilitate community testing of Spark 1.6.0, I'm excited to announce the availability of an early preview of the release. This is not a release candidate, so there is no voting involved. However, it'd be awesome if community members can start testing with this preview package and

[jira] [Updated] (SPARK-7539) Perf tests for Python MLlib

2015-11-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7539: Component/s: Tests > Perf tests for Python ML

[jira] [Resolved] (SPARK-11819) nice error message for missing encoder

2015-11-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11819. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9810

[jira] [Resolved] (SPARK-11876) [SQL] Support PrintSchema in DataSet APIs

2015-11-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11876. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9855

[jira] [Updated] (SPARK-11873) Regression for TPC-DS query 63 when used with decimal datatype and windows function

2015-11-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11873: - Issue Type: Improvement (was: Bug) > Regression for TPC-DS query 63 when u

[jira] [Updated] (SPARK-11819) nice error message for missing encoder

2015-11-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11819: - Assignee: Wenchen Fan > nice error message for missing enco

[jira] [Updated] (SPARK-11876) [SQL] Support PrintSchema in DataSet APIs

2015-11-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11876: - Assignee: Xiao Li > [SQL] Support PrintSchema in DataSet A

[jira] [Created] (SPARK-11889) Type inference in REPL broken for GroupedDataset.agg

2015-11-20 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-11889: Summary: Type inference in REPL broken for GroupedDataset.agg Key: SPARK-11889 URL: https://issues.apache.org/jira/browse/SPARK-11889 Project: Spark

[jira] [Updated] (SPARK-11889) Type inference in REPL broken for GroupedDataset.agg

2015-11-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11889: - Target Version/s: 1.6.0 > Type inference in REPL broken for GroupedDataset.

[jira] [Updated] (SPARK-11889) Type inference in REPL broken for GroupedDataset.agg

2015-11-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11889: - Component/s: SQL > Type inference in REPL broken for GroupedDataset.

[jira] [Created] (SPARK-11890) Encoder errors logic breaks on Scala 2.11

2015-11-20 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-11890: Summary: Encoder errors logic breaks on Scala 2.11 Key: SPARK-11890 URL: https://issues.apache.org/jira/browse/SPARK-11890 Project: Spark Issue Type

[jira] [Updated] (SPARK-11890) Encoder errors logic breaks on Scala 2.11

2015-11-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11890: - Description: {code} [error] /home/jenkins/workspace/Spark-Master-Scala211-Compile/sql
