[jira] [Created] (SPARK-3349) [Spark SQL] Incorrect partitioning after LIMIT operator

2014-09-02 Thread Eric Liang (JIRA)
Eric Liang created SPARK-3349: - Summary: [Spark SQL] Incorrect partitioning after LIMIT operator Key: SPARK-3349 URL: https://issues.apache.org/jira/browse/SPARK-3349 Project: Spark Issue Type:

[jira] [Created] (SPARK-3394) TakeOrdered crashes when limit is 0

2014-09-03 Thread Eric Liang (JIRA)
Eric Liang created SPARK-3394: - Summary: TakeOrdered crashes when limit is 0 Key: SPARK-3394 URL: https://issues.apache.org/jira/browse/SPARK-3394 Project: Spark Issue Type: Bug

[jira] [Closed] (SPARK-3349) Incorrect partitioning after LIMIT operator

2014-09-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang closed SPARK-3349. - Incorrect partitioning after LIMIT operator ---

[jira] [Closed] (SPARK-3394) TakeOrdered crashes when limit is 0

2014-09-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang closed SPARK-3394. - TakeOrdered crashes when limit is 0 --- Key: SPARK-3394

[jira] [Commented] (SPARK-9895) User Guide for RFormula Feature Transformer

2015-08-12 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694371#comment-14694371 ] Eric Liang commented on SPARK-9895: --- Sure, I can take this task. User Guide for

[jira] [Created] (SPARK-9463) Expose model coefficients with names in SparkR RFormula

2015-07-29 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9463: - Summary: Expose model coefficients with names in SparkR RFormula Key: SPARK-9463 URL: https://issues.apache.org/jira/browse/SPARK-9463 Project: Spark Issue Type:

[jira] [Created] (SPARK-9492) LogisticRegression should provide model statistics

2015-07-30 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9492: - Summary: LogisticRegression should provide model statistics Key: SPARK-9492 URL: https://issues.apache.org/jira/browse/SPARK-9492 Project: Spark Issue Type:

[jira] [Created] (SPARK-9681) Support R feature interactions in RFormula

2015-08-06 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9681: - Summary: Support R feature interactions in RFormula Key: SPARK-9681 URL: https://issues.apache.org/jira/browse/SPARK-9681 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-9713) Document SparkR MLlib glm() integration in Spark 1.5

2015-08-06 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9713: - Summary: Document SparkR MLlib glm() integration in Spark 1.5 Key: SPARK-9713 URL: https://issues.apache.org/jira/browse/SPARK-9713 Project: Spark Issue Type:

[jira] [Created] (SPARK-9391) Support minus, dot, and intercept operators in SparkR RFormula

2015-07-27 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9391: - Summary: Support minus, dot, and intercept operators in SparkR RFormula Key: SPARK-9391 URL: https://issues.apache.org/jira/browse/SPARK-9391 Project: Spark

[jira] [Created] (SPARK-9230) SparkR RFormula should support StringType features

2015-07-21 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9230: - Summary: SparkR RFormula should support StringType features Key: SPARK-9230 URL: https://issues.apache.org/jira/browse/SPARK-9230 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-9230) SparkR RFormula should support StringType features

2015-07-21 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635790#comment-14635790 ] Eric Liang commented on SPARK-9230: --- Hmm, I think it would be hard to support that in a

[jira] [Created] (SPARK-9201) Integrate MLlib with SparkR using RFormula

2015-07-20 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9201: - Summary: Integrate MLlib with SparkR using RFormula Key: SPARK-9201 URL: https://issues.apache.org/jira/browse/SPARK-9201 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-10523) SparkR formula syntax to turn strings/factors into numerics

2015-09-09 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737960#comment-14737960 ] Eric Liang commented on SPARK-10523: We can convert to boolean easily enough, but supporting >2

[jira] [Commented] (SPARK-11965) Update user guide for RFormula feature interactions

2015-12-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047525#comment-15047525 ] Eric Liang commented on SPARK-11965: Will do On Tue, Dec 8, 2015, 1:11 PM Joseph K. Bradley (JIRA)

[jira] [Issue Comment Deleted] (SPARK-11965) Update user guide for RFormula feature interactions

2015-12-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-11965: --- Comment: was deleted (was: Will do On Tue, Dec 8, 2015, 1:11 PM Joseph K. Bradley (JIRA)

[jira] [Created] (SPARK-12346) GLM summary crashes with NoSuchElementException if attributes are missing names

2015-12-15 Thread Eric Liang (JIRA)
Eric Liang created SPARK-12346: -- Summary: GLM summary crashes with NoSuchElementException if attributes are missing names Key: SPARK-12346 URL: https://issues.apache.org/jira/browse/SPARK-12346 Project:

[jira] [Created] (SPARK-15735) Allow specifying min time to run in microbenchmarks

2016-06-02 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15735: -- Summary: Allow specifying min time to run in microbenchmarks Key: SPARK-15735 URL: https://issues.apache.org/jira/browse/SPARK-15735 Project: Spark Issue Type:

[jira] [Created] (SPARK-15794) Should truncate toString() of very wide schemas

2016-06-06 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15794: -- Summary: Should truncate toString() of very wide schemas Key: SPARK-15794 URL: https://issues.apache.org/jira/browse/SPARK-15794 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-15881) Update microbenchmark results

2016-06-10 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15881: -- Summary: Update microbenchmark results Key: SPARK-15881 URL: https://issues.apache.org/jira/browse/SPARK-15881 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-15860) Metrics for codegen size and perf

2016-06-09 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15860: -- Summary: Metrics for codegen size and perf Key: SPARK-15860 URL: https://issues.apache.org/jira/browse/SPARK-15860 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-15520) SparkSession builder in python should also allow overriding confs of existing sessions

2016-05-24 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15520: -- Summary: SparkSession builder in python should also allow overriding confs of existing sessions Key: SPARK-15520 URL: https://issues.apache.org/jira/browse/SPARK-15520

[jira] [Updated] (SPARK-15520) SparkSession builder in python should also allow overriding confs of existing sessions

2016-05-24 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15520: --- Component/s: SQL > SparkSession builder in python should also allow overriding confs of existing >

[jira] [Comment Edited] (SPARK-15634) SQL repl is bricked if a function is registered with a non-existent jar

2016-05-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304872#comment-15304872 ] Eric Liang edited comment on SPARK-15634 at 5/27/16 9:57 PM: - Note that

[jira] [Commented] (SPARK-15634) SQL repl is bricked if a function is registered with a non-existent jar

2016-05-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304872#comment-15304872 ] Eric Liang commented on SPARK-15634: Note that adding jars in the repl also doesn't work currently,

[jira] [Updated] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15724: --- Component/s: SQL > Add benchmarks for performance over wide schemas >

[jira] [Updated] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15724: --- Description: There are some reported degradations in 2.0 when querying over very wide / deeply

[jira] [Created] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15724: -- Summary: Add benchmarks for performance over wide schemas Key: SPARK-15724 URL: https://issues.apache.org/jira/browse/SPARK-15724 Project: Spark Issue Type:

[jira] [Updated] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15724: --- Affects Version/s: 2.0.0 > Add benchmarks for performance over wide schemas >

[jira] [Updated] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15724: --- Description: There are some reported degradations in 2.0 when querying over very wide/nested

[jira] [Created] (SPARK-15634) SQL repl is bricked if a function is registered with a non-existent jar

2016-05-27 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15634: -- Summary: SQL repl is bricked if a function is registered with a non-existent jar Key: SPARK-15634 URL: https://issues.apache.org/jira/browse/SPARK-15634 Project: Spark

[jira] [Created] (SPARK-16021) Zero out freed memory in test to help catch correctness bugs

2016-06-17 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16021: -- Summary: Zero out freed memory in test to help catch correctness bugs Key: SPARK-16021 URL: https://issues.apache.org/jira/browse/SPARK-16021 Project: Spark

[jira] [Created] (SPARK-16025) Document OFF_HEAP storage level in 2.0

2016-06-17 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16025: -- Summary: Document OFF_HEAP storage level in 2.0 Key: SPARK-16025 URL: https://issues.apache.org/jira/browse/SPARK-16025 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-16238) Metrics for generated method bytecode size

2016-06-27 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16238: -- Summary: Metrics for generated method bytecode size Key: SPARK-16238 URL: https://issues.apache.org/jira/browse/SPARK-16238 Project: Spark Issue Type:

[jira] [Created] (SPARK-14475) Propagate user-defined context from driver to executors

2016-04-07 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14475: -- Summary: Propagate user-defined context from driver to executors Key: SPARK-14475 URL: https://issues.apache.org/jira/browse/SPARK-14475 Project: Spark Issue

[jira] [Commented] (SPARK-14475) Propagate user-defined context from driver to executors

2016-04-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232455#comment-15232455 ] Eric Liang commented on SPARK-14475: I think the main difference is that this is transparent to the

[jira] [Commented] (SPARK-14252) Executors do not try to download remote cached blocks

2016-04-05 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227493#comment-15227493 ] Eric Liang commented on SPARK-14252: I'm going to take a look at fixing this > Executors do not try

[jira] [Created] (SPARK-14227) [SQL] Add method for printing out generated code for debugging

2016-03-28 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14227: -- Summary: [SQL] Add method for printing out generated code for debugging Key: SPARK-14227 URL: https://issues.apache.org/jira/browse/SPARK-14227 Project: Spark

[jira] [Commented] (SPARK-14359) Improve user experience for typed aggregate functions in Java

2016-04-04 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224476#comment-15224476 ] Eric Liang commented on SPARK-14359: Sure > Improve user experience for typed aggregate functions in

[jira] [Created] (SPARK-14851) Support radix sort with nullable longs

2016-04-22 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14851: -- Summary: Support radix sort with nullable longs Key: SPARK-14851 URL: https://issues.apache.org/jira/browse/SPARK-14851 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-15496) Spill metrics not updated when off-heap memory is enabled

2016-05-23 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang resolved SPARK-15496. Resolution: Fixed Fix Version/s: 2.0.0 Ah, actually the reproduction was incorrect. This

[jira] [Created] (SPARK-15496) Spill metrics not updated when off-heap memory is enabled

2016-05-23 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15496: -- Summary: Spill metrics not updated when off-heap memory is enabled Key: SPARK-15496 URL: https://issues.apache.org/jira/browse/SPARK-15496 Project: Spark Issue

[jira] [Created] (SPARK-15259) Sort time metric

2016-05-10 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15259: -- Summary: Sort time metric Key: SPARK-15259 URL: https://issues.apache.org/jira/browse/SPARK-15259 Project: Spark Issue Type: Bug Components: SQL

[jira] [Updated] (SPARK-15259) Sort time metric should not include spill and record insertion time

2016-05-10 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15259: --- Summary: Sort time metric should not include spill and record insertion time (was: Sort time

[jira] [Created] (SPARK-14733) Allow custom timing control in microbenchmarks

2016-04-19 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14733: -- Summary: Allow custom timing control in microbenchmarks Key: SPARK-14733 URL: https://issues.apache.org/jira/browse/SPARK-14733 Project: Spark Issue Type:

[jira] [Commented] (SPARK-14790) Scalastyle should run on compile in sbt

2016-04-20 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251171#comment-15251171 ] Eric Liang commented on SPARK-14790: You can cache the style results so it's not that different >

[jira] [Created] (SPARK-14790) Scalastyle should run on compile in sbt

2016-04-20 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14790: -- Summary: Scalastyle should run on compile in sbt Key: SPARK-14790 URL: https://issues.apache.org/jira/browse/SPARK-14790 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-14724) Improve performance of sorting by using radix sort when possible

2016-04-18 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14724: -- Summary: Improve performance of sorting by using radix sort when possible Key: SPARK-14724 URL: https://issues.apache.org/jira/browse/SPARK-14724 Project: Spark

[jira] [Updated] (SPARK-14724) Improve performance of sorting by using radix sort when possible

2016-04-18 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-14724: --- Component/s: Spark Core > Improve performance of sorting by using radix sort when possible >

[jira] [Created] (SPARK-16818) Exchange reuse incorrectly reuses scans over different sets of partitions

2016-07-30 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16818: -- Summary: Exchange reuse incorrectly reuses scans over different sets of partitions Key: SPARK-16818 URL: https://issues.apache.org/jira/browse/SPARK-16818 Project: Spark

[jira] [Created] (SPARK-17042) Repl-defined classes cannot be replicated

2016-08-12 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17042: -- Summary: Repl-defined classes cannot be replicated Key: SPARK-17042 URL: https://issues.apache.org/jira/browse/SPARK-17042 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-17042) Repl-defined classes cannot be replicated

2016-08-12 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-17042: --- Description: A simple fix is to erase the classTag when using the default serializer, since it's

[jira] [Created] (SPARK-16514) RegexExtract and RegexReplace crash on non-nullable input

2016-07-12 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16514: -- Summary: RegexExtract and RegexReplace crash on non-nullable input Key: SPARK-16514 URL: https://issues.apache.org/jira/browse/SPARK-16514 Project: Spark Issue

[jira] [Created] (SPARK-16596) Refactor DataSourceScanExec to do partition discovery at execution instead of planning time

2016-07-17 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16596: -- Summary: Refactor DataSourceScanExec to do partition discovery at execution instead of planning time Key: SPARK-16596 URL: https://issues.apache.org/jira/browse/SPARK-16596

[jira] [Created] (SPARK-16432) Empty blocks fail to serialize due to assert in ChunkedByteBuffer

2016-07-07 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16432: -- Summary: Empty blocks fail to serialize due to assert in ChunkedByteBuffer Key: SPARK-16432 URL: https://issues.apache.org/jira/browse/SPARK-16432 Project: Spark

[jira] [Updated] (SPARK-16432) Empty blocks fail to serialize due to assert in ChunkedByteBuffer

2016-07-07 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-16432: --- Component/s: Spark Core > Empty blocks fail to serialize due to assert in ChunkedByteBuffer >

[jira] [Updated] (SPARK-16884) Move DataSourceScanExec out of ExistingRDD.scala file

2016-08-03 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-16884: --- Issue Type: Improvement (was: Bug) > Move DataSourceScanExec out of ExistingRDD.scala file >

[jira] [Created] (SPARK-16884) Move DataSourceScanExec out of ExistingRDD.scala file

2016-08-03 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16884: -- Summary: Move DataSourceScanExec out of ExistingRDD.scala file Key: SPARK-16884 URL: https://issues.apache.org/jira/browse/SPARK-16884 Project: Spark Issue

[jira] [Created] (SPARK-17069) Expose spark.range() as table-valued function in SQL

2016-08-15 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17069: -- Summary: Expose spark.range() as table-valued function in SQL Key: SPARK-17069 URL: https://issues.apache.org/jira/browse/SPARK-17069 Project: Spark Issue Type:

[jira] [Created] (SPARK-17162) Range does not support SQL generation

2016-08-19 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17162: -- Summary: Range does not support SQL generation Key: SPARK-17162 URL: https://issues.apache.org/jira/browse/SPARK-17162 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-17370) Shuffle service files not invalidated when a slave is lost

2016-09-01 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17370: -- Summary: Shuffle service files not invalidated when a slave is lost Key: SPARK-17370 URL: https://issues.apache.org/jira/browse/SPARK-17370 Project: Spark Issue

[jira] [Created] (SPARK-17371) Resubmitted stage outputs deleted by zombie map tasks on stop()

2016-09-01 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17371: -- Summary: Resubmitted stage outputs deleted by zombie map tasks on stop() Key: SPARK-17371 URL: https://issues.apache.org/jira/browse/SPARK-17371 Project: Spark

[jira] [Updated] (SPARK-17370) Shuffle service files not invalidated when a slave is lost

2016-09-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-17370: --- Component/s: Spark Core > Shuffle service files not invalidated when a slave is lost >

[jira] [Created] (SPARK-17472) Better error message for serialization failures of large objects in Python

2016-09-09 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17472: -- Summary: Better error message for serialization failures of large objects in Python Key: SPARK-17472 URL: https://issues.apache.org/jira/browse/SPARK-17472 Project:

[jira] [Commented] (SPARK-17042) Repl-defined classes cannot be replicated

2016-08-22 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431616#comment-15431616 ] Eric Liang commented on SPARK-17042: Yeah, my bad. I was trying to split this up but it turns out to

[jira] [Created] (SPARK-17713) Move row-datasource related tests out of JDBCSuite

2016-09-28 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17713: -- Summary: Move row-datasource related tests out of JDBCSuite Key: SPARK-17713 URL: https://issues.apache.org/jira/browse/SPARK-17713 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17673) Reused Exchange Aggregations Produce Incorrect Results

2016-09-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15528145#comment-15528145 ] Eric Liang commented on SPARK-17673: Russell, could you try applying this patch (wip) to see if it

[jira] [Created] (SPARK-17701) Refactor DataSourceScanExec so its sameResult call does not compare strings

2016-09-27 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17701: -- Summary: Refactor DataSourceScanExec so its sameResult call does not compare strings Key: SPARK-17701 URL: https://issues.apache.org/jira/browse/SPARK-17701 Project:

[jira] [Updated] (SPARK-17701) Refactor DataSourceScanExec so its sameResult call does not compare strings

2016-09-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-17701: --- Component/s: SQL > Refactor DataSourceScanExec so its sameResult call does not compare strings >

[jira] [Commented] (SPARK-17673) Reused Exchange Aggregations Produce Incorrect Results

2016-09-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15527339#comment-15527339 ] Eric Liang commented on SPARK-17673: I'm looking at this now. > Reused Exchange Aggregations Produce

[jira] [Created] (SPARK-17740) Spark tests should mock / interpose HDFS to ensure that streams are closed

2016-09-29 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17740: -- Summary: Spark tests should mock / interpose HDFS to ensure that streams are closed Key: SPARK-17740 URL: https://issues.apache.org/jira/browse/SPARK-17740 Project:

[jira] [Created] (SPARK-18101) ExternalCatalogSuite should test with mixed case fields

2016-10-25 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18101: -- Summary: ExternalCatalogSuite should test with mixed case fields Key: SPARK-18101 URL: https://issues.apache.org/jira/browse/SPARK-18101 Project: Spark Issue

[jira] [Updated] (SPARK-18101) ExternalCatalogSuite should test with mixed case fields

2016-10-25 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18101: --- Issue Type: Sub-task (was: Test) Parent: SPARK-17861 > ExternalCatalogSuite should test

[jira] [Created] (SPARK-18103) Rename *FileCatalog to *FileProvider

2016-10-25 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18103: -- Summary: Rename *FileCatalog to *FileProvider Key: SPARK-18103 URL: https://issues.apache.org/jira/browse/SPARK-18103 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-18145) Update documentation for hive partition management in 2.1

2016-10-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18145: --- Summary: Update documentation for hive partition management in 2.1 (was: Update documentation) >

[jira] [Updated] (SPARK-18145) Update documentation

2016-10-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18145: --- Issue Type: Sub-task (was: Documentation) Parent: SPARK-17861 > Update documentation >

[jira] [Updated] (SPARK-18145) Update documentation for hive partition management in 2.1

2016-10-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18145: --- Component/s: SQL > Update documentation for hive partition management in 2.1 >

[jira] [Created] (SPARK-18145) Update documentation

2016-10-27 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18145: -- Summary: Update documentation Key: SPARK-18145 URL: https://issues.apache.org/jira/browse/SPARK-18145 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-18146) Avoid using Union to chain together create table and repair partition commands

2016-10-27 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18146: -- Summary: Avoid using Union to chain together create table and repair partition commands Key: SPARK-18146 URL: https://issues.apache.org/jira/browse/SPARK-18146 Project:

[jira] [Commented] (SPARK-17916) CSV data source treats empty string as null no matter what nullValue option is

2016-11-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649039#comment-15649039 ] Eric Liang commented on SPARK-17916: We're hitting this as a regression from 2.0 as well. Ideally,

[jira] [Commented] (SPARK-17916) CSV data source treats empty string as null no matter what nullValue option is

2016-11-09 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15652414#comment-15652414 ] Eric Liang commented on SPARK-17916: In our case, a user wants the empty string (whether actually

[jira] [Created] (SPARK-18393) DataFrame pivot output column names should respect aliases

2016-11-09 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18393: -- Summary: DataFrame pivot output column names should respect aliases Key: SPARK-18393 URL: https://issues.apache.org/jira/browse/SPARK-18393 Project: Spark Issue

[jira] [Updated] (SPARK-18185) Should fix INSERT OVERWRITE TABLE of Datasource tables with dynamic partitions

2016-11-07 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18185: --- Issue Type: Sub-task (was: Bug) Parent: SPARK-17861 > Should fix INSERT OVERWRITE TABLE of

[jira] [Updated] (SPARK-18185) Should fix INSERT OVERWRITE TABLE of Datasource tables with dynamic partitions

2016-11-07 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18185: --- Target Version/s: 2.1.0 > Should fix INSERT OVERWRITE TABLE of Datasource tables with dynamic

[jira] [Updated] (SPARK-17990) ALTER TABLE ... ADD PARTITION does not play nice with mixed-case partition column names

2016-11-07 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-17990: --- Target Version/s: 2.1.0 > ALTER TABLE ... ADD PARTITION does not play nice with mixed-case partition

[jira] [Updated] (SPARK-18333) Revert hacks in parquet and orc reader to support case insensitive resolution

2016-11-07 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18333: --- Target Version/s: 2.1.0 > Revert hacks in parquet and orc reader to support case insensitive

[jira] [Updated] (SPARK-18145) Update documentation for hive partition management in 2.1

2016-11-07 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18145: --- Target Version/s: 2.1.0 > Update documentation for hive partition management in 2.1 >

[jira] [Commented] (SPARK-18185) Should fix INSERT OVERWRITE TABLE of Datasource tables with dynamic partitions

2016-11-07 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645238#comment-15645238 ] Eric Liang commented on SPARK-18185: I'm currently working on this. > Should fix INSERT OVERWRITE

[jira] [Created] (SPARK-18333) Revert hacks in parquet and orc reader to support case insensitive resolution

2016-11-07 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18333: -- Summary: Revert hacks in parquet and orc reader to support case insensitive resolution Key: SPARK-18333 URL: https://issues.apache.org/jira/browse/SPARK-18333 Project:

[jira] [Updated] (SPARK-17983) Can't filter over mixed case parquet columns of converted Hive tables

2016-10-17 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-17983: --- Description: We should probably revive https://github.com/apache/spark/pull/14750 in order to fix

[jira] [Commented] (SPARK-17862) Feature flag SPARK-16980

2016-10-18 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586210#comment-15586210 ] Eric Liang commented on SPARK-17862: Yes, this is the flag: {code} val

[jira] [Updated] (SPARK-17980) Fix refreshByPath for converted Hive tables

2016-10-18 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-17980: --- Issue Type: Sub-task (was: Bug) Parent: SPARK-17861 > Fix refreshByPath for converted Hive

[jira] [Created] (SPARK-17991) Enable metastore partition pruning for unconverted hive tables by default

2016-10-18 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17991: -- Summary: Enable metastore partition pruning for unconverted hive tables by default Key: SPARK-17991 URL: https://issues.apache.org/jira/browse/SPARK-17991 Project: Spark

[jira] [Updated] (SPARK-17991) Enable metastore partition pruning for unconverted hive tables by default

2016-10-18 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-17991: --- Issue Type: Sub-task (was: Improvement) Parent: SPARK-17861 > Enable metastore partition

[jira] [Updated] (SPARK-17994) Add back a file status cache for catalog tables

2016-10-18 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-17994: --- Issue Type: Sub-task (was: Improvement) Parent: SPARK-17861 > Add back a file status cache

[jira] [Updated] (SPARK-17994) Add back a file status cache for catalog tables

2016-10-18 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-17994: --- Description: In SPARK-16980, we removed the full in-memory cache of table partitions in favor of

[jira] [Created] (SPARK-17994) Add back a file status cache for catalog tables

2016-10-18 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17994: -- Summary: Add back a file status cache for catalog tables Key: SPARK-17994 URL: https://issues.apache.org/jira/browse/SPARK-17994 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17983) Can't filter over mixed case parquet columns of converted Hive tables

2016-10-18 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586476#comment-15586476 ] Eric Liang commented on SPARK-17983: Since we already store the original (case-sensitive) schema of

[jira] [Created] (SPARK-18087) Optimize insert to not require REPAIR TABLE

2016-10-24 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18087: -- Summary: Optimize insert to not require REPAIR TABLE Key: SPARK-18087 URL: https://issues.apache.org/jira/browse/SPARK-18087 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-18026) should not always lowercase partition columns of partition spec in parser

2016-10-24 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18026: --- Issue Type: Sub-task (was: Improvement) Parent: SPARK-17861 > should not always lowercase
