[jira] [Commented] (SPARK-19162) UserDefinedFunction constructor should verify that func is callable

2017-01-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832127#comment-15832127 ] Ryan Blue commented on SPARK-19162: --- [~rxin], I think this one is ready for a final review and commit,

[jira] [Comment Edited] (SPARK-19160) Decorator for UDF creation.

2017-01-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832124#comment-15832124 ] Ryan Blue edited comment on SPARK-19160 at 1/20/17 5:14 PM: [~rxin], I think

[jira] [Commented] (SPARK-19160) Decorator for UDF creation.

2017-01-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832124#comment-15832124 ] Ryan Blue commented on SPARK-19160: --- @rxin, I think this one is ready to be merged. Who is a good

[jira] [Commented] (SPARK-19159) PySpark UDF API improvements

2017-01-10 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15816530#comment-15816530 ] Ryan Blue commented on SPARK-19159: --- [~zero323], is there an order to the pull requests? I'll start

[jira] [Resolved] (SPARK-19138) Python: new HiveContext will use a stopped SparkContext

2017-01-09 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved SPARK-19138. --- Resolution: Duplicate > Python: new HiveContext will use a stopped SparkContext >

[jira] [Created] (SPARK-19138) Python: new HiveContext will use a stopped SparkContext

2017-01-09 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-19138: - Summary: Python: new HiveContext will use a stopped SparkContext Key: SPARK-19138 URL: https://issues.apache.org/jira/browse/SPARK-19138 Project: Spark Issue

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-12-16 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755221#comment-15755221 ] Ryan Blue commented on SPARK-16032: --- +1 > Audit semantics of various insertion operations related to

[jira] [Commented] (SPARK-16178) SQL - Hive writer should not require partition names to match table partitions

2016-12-15 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752632#comment-15752632 ] Ryan Blue commented on SPARK-16178: --- Sure. I think the result was Won't Fix. > SQL - Hive writer

[jira] [Created] (SPARK-18677) Json path implementation fails to parse ['key']

2016-12-01 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-18677: - Summary: Json path implementation fails to parse ['key'] Key: SPARK-18677 URL: https://issues.apache.org/jira/browse/SPARK-18677 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18387) Test that expressions can be serialized

2016-11-10 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655454#comment-15655454 ] Ryan Blue commented on SPARK-18387: --- Yeah, I'm working on it. Thanks! On Thu, Nov 10, 2016 at 1:37 PM,

[jira] [Updated] (SPARK-18387) Test that expressions can be serialized

2016-11-09 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-18387: -- Affects Version/s: 2.1.0 2.0.1 > Test that expressions can be serialized >

[jira] [Created] (SPARK-18387) Test that expressions can be serialized

2016-11-09 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-18387: - Summary: Test that expressions can be serialized Key: SPARK-18387 URL: https://issues.apache.org/jira/browse/SPARK-18387 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-18368) Regular expression replace throws NullPointerException when serialized

2016-11-08 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-18368: -- Description: This query fails with a [NullPointerException on line

[jira] [Created] (SPARK-18368) Regular expression replace throws NullPointerException when serialized

2016-11-08 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-18368: - Summary: Regular expression replace throws NullPointerException when serialized Key: SPARK-18368 URL: https://issues.apache.org/jira/browse/SPARK-18368 Project: Spark

[jira] [Commented] (SPARK-18086) Regression: Hive variables no longer work in Spark 2.0

2016-11-03 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634605#comment-15634605 ] Ryan Blue commented on SPARK-18086: --- Yeah, I'll update the PR. > Regression: Hive variables no longer

[jira] [Commented] (SPARK-18086) Regression: Hive variables no longer work in Spark 2.0

2016-11-02 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15630849#comment-15630849 ] Ryan Blue commented on SPARK-18086: --- What is the rationale for propagating configuration but not

[jira] [Commented] (SPARK-18086) Regression: Hive variables no longer work in Spark 2.0

2016-11-02 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15630132#comment-15630132 ] Ryan Blue commented on SPARK-18086: --- Hive variables are set on the Hive SessionState and I think it is

[jira] [Commented] (SPARK-18086) Regression: Hive variables no longer work in Spark 2.0

2016-11-02 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15630045#comment-15630045 ] Ryan Blue commented on SPARK-18086: --- [~rxin], I think the fix for this should go into 2.1.0. The linked

[jira] [Created] (SPARK-18086) Regression: Hive variables no longer work in Spark 2.0

2016-10-24 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-18086: - Summary: Regression: Hive variables no longer work in Spark 2.0 Key: SPARK-18086 URL: https://issues.apache.org/jira/browse/SPARK-18086 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17995) Use new attributes for columns from outer joins

2016-10-19 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589692#comment-15589692 ] Ryan Blue commented on SPARK-17995: --- I'm not sure how that would work. Here's an example. Say I have

[jira] [Commented] (SPARK-17995) Use new attributes for columns from outer joins

2016-10-18 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586972#comment-15586972 ] Ryan Blue commented on SPARK-17995: --- [~cloud_fan] and [~yhuai], I'd like to help fix this, but I'm not

[jira] [Created] (SPARK-17995) Use new attributes for columns from outer joins

2016-10-18 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-17995: - Summary: Use new attributes for columns from outer joins Key: SPARK-17995 URL: https://issues.apache.org/jira/browse/SPARK-17995 Project: Spark Issue Type:

[jira] [Created] (SPARK-17532) Add thread lock information from JMX to thread dump UI

2016-09-13 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-17532: - Summary: Add thread lock information from JMX to thread dump UI Key: SPARK-17532 URL: https://issues.apache.org/jira/browse/SPARK-17532 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

2016-09-12 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15484718#comment-15484718 ] Ryan Blue commented on SPARK-17424: --- I'm adding the above fix in a PR. This fix works for us (the job

[jira] [Commented] (SPARK-17302) Cannot set non-Spark SQL session variables in hive-site.xml, spark-defaults.conf, or using --conf

2016-09-08 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475156#comment-15475156 ] Ryan Blue commented on SPARK-17302: --- In 1.6.x, Spark pulled session config for Hive from a

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468932#comment-15468932 ] Ryan Blue commented on SPARK-17396: --- I opened a PR with a fix. It still uses a ForkJoinPool because the

[jira] [Updated] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-17424: -- Description: I have a job that uses datasets in 1.6.1 and is failing with this error: {code} 16/09/02

[jira] [Created] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

2016-09-06 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-17424: - Summary: Dataset job fails from unsound substitution in ScalaReflect Key: SPARK-17424 URL: https://issues.apache.org/jira/browse/SPARK-17424 Project: Spark Issue

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467956#comment-15467956 ] Ryan Blue commented on SPARK-17396: --- I'll put together a patch for this with a shared executor service.

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-05 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15465375#comment-15465375 ] Ryan Blue commented on SPARK-17396: --- I'm not sure that the ForkJoinPool is to blame. Each partition in

[jira] [Updated] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-05 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-17396: -- Description: 1. Create an external partitioned table row format CSV 2. Add 16 partitions to the table

[jira] [Created] (SPARK-17302) Cannot set non-Spark SQL session variables in hive-site.xml, spark-defaults.conf, or using --conf

2016-08-29 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-17302: - Summary: Cannot set non-Spark SQL session variables in hive-site.xml, spark-defaults.conf, or using --conf Key: SPARK-17302 URL: https://issues.apache.org/jira/browse/SPARK-17302

[jira] [Created] (SPARK-17300) ClosedChannelException caused by missing block manager when speculative tasks are killed

2016-08-29 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-17300: - Summary: ClosedChannelException caused by missing block manager when speculative tasks are killed Key: SPARK-17300 URL: https://issues.apache.org/jira/browse/SPARK-17300

[jira] [Commented] (SPARK-16344) Array of struct with a single field name "element" can't be decoded from Parquet files written by Spark 1.6+

2016-07-11 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371013#comment-15371013 ] Ryan Blue commented on SPARK-16344: --- Sounds good to me! I like the idea of converting back to the

[jira] [Commented] (SPARK-16435) Behavior changes if initialExecutor is less than minExecutor for dynamic allocation

2016-07-08 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367883#comment-15367883 ] Ryan Blue commented on SPARK-16435: --- I think a warning is appropriate, but there's no need to throw an

[jira] [Created] (SPARK-16420) UnsafeShuffleWriter leaks compression streams with off-heap memory.

2016-07-07 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-16420: - Summary: UnsafeShuffleWriter leaks compression streams with off-heap memory. Key: SPARK-16420 URL: https://issues.apache.org/jira/browse/SPARK-16420 Project: Spark

[jira] [Commented] (SPARK-16344) Array of struct with a single field name "element" can't be decoded from Parquet files written by Spark 1.6+

2016-07-07 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366296#comment-15366296 ] Ryan Blue commented on SPARK-16344: --- I agree, using list/element is reasonable. I just want to note

[jira] [Resolved] (SPARK-16382) YARN - Dynamic allocation with spark.executor.instances should increase max executors.

2016-07-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved SPARK-16382. --- Resolution: Won't Fix > YARN - Dynamic allocation with spark.executor.instances should increase max

[jira] [Commented] (SPARK-16382) YARN - Dynamic allocation with spark.executor.instances should increase max executors.

2016-07-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365255#comment-15365255 ] Ryan Blue commented on SPARK-16382: --- [~jerryshao], [~tgraves], I think you're both right that this is

[jira] [Commented] (SPARK-16344) Array of struct with a single field name "element" can't be decoded from Parquet files written by Spark 1.6+

2016-07-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365062#comment-15365062 ] Ryan Blue commented on SPARK-16344: --- It looks like the main change is to specifically catch the 3-level

[jira] [Commented] (SPARK-16344) Array of struct with a single field name "element" can't be decoded from Parquet files written by Spark 1.6+

2016-07-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364706#comment-15364706 ] Ryan Blue commented on SPARK-16344: --- [~lian cheng], I'm looking at this today. > Array of struct with

[jira] [Commented] (SPARK-16382) YARN - Dynamic allocation with spark.executor.instances should increase max executors.

2016-07-05 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362798#comment-15362798 ] Ryan Blue commented on SPARK-16382: --- [~tgraves], what do you think the right behavior is here? > YARN

[jira] [Created] (SPARK-16382) YARN - Dynamic allocation with spark.executor.instances should increase max executors.

2016-07-05 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-16382: - Summary: YARN - Dynamic allocation with spark.executor.instances should increase max executors. Key: SPARK-16382 URL: https://issues.apache.org/jira/browse/SPARK-16382

[jira] [Created] (SPARK-16178) SQL - Hive writer should not require partition names to match table partitions

2016-06-23 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-16178: - Summary: SQL - Hive writer should not require partition names to match table partitions Key: SPARK-16178 URL: https://issues.apache.org/jira/browse/SPARK-16178 Project:

[jira] [Comment Edited] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-23 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347172#comment-15347172 ] Ryan Blue edited comment on SPARK-16032 at 6/23/16 10:59 PM: - bq. I am not

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-23 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347172#comment-15347172 ] Ryan Blue commented on SPARK-16032: --- bq. I am not sure apply by-name resolution just to partition

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343243#comment-15343243 ] Ryan Blue commented on SPARK-16032: --- [~cloud_fan], while I think by-name insertion is important in the

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342963#comment-15342963 ] Ryan Blue commented on SPARK-16032: --- I'm referring to disabling the use of {{partitionBy}} with

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342224#comment-15342224 ] Ryan Blue commented on SPARK-16032: --- bq. the most important issue we would like to address here is

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342216#comment-15342216 ] Ryan Blue commented on SPARK-16032: --- bq. I don't think the package matters, the pre-insert is still an

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342135#comment-15342135 ] Ryan Blue commented on SPARK-16032: --- I agree with the push to unify the Hive and DataSource

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340929#comment-15340929 ] Ryan Blue commented on SPARK-16032: --- Overall, I'm *-1* on these changes going into Spark 2.0. Looking

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340797#comment-15340797 ] Ryan Blue commented on SPARK-16032: --- I'm going to put review comments here because the PRs are already

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340409#comment-15340409 ] Ryan Blue commented on SPARK-16032: --- The changes here don't look like the rule that was added in

[jira] [Commented] (SPARK-16033) DataFrameWriter.partitionBy() can't be used together with DataFrameWriter.insertInto()

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340386#comment-15340386 ] Ryan Blue commented on SPARK-16033: --- The previous behavior was to re-order the data frame's columns,

[jira] [Commented] (SPARK-16037) use by-position resolution when insert into hive table

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340305#comment-15340305 ] Ryan Blue commented on SPARK-16037: --- I agree, but I don't think this addresses the problem where the

[jira] [Commented] (SPARK-16037) use by-position resolution when insert into hive table

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340304#comment-15340304 ] Ryan Blue commented on SPARK-16037: --- I agree, but I don't think this addresses the problem where the

[jira] [Issue Comment Deleted] (SPARK-16037) use by-position resolution when insert into hive table

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-16037: -- Comment: was deleted (was: I agree, but I don't think this addresses the problem where the user

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340294#comment-15340294 ] Ryan Blue commented on SPARK-16032: --- Sounds good, I'm glad to see that at least the cast changes were

[jira] [Commented] (SPARK-16037) use by-position resolution when insert into hive table

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340257#comment-15340257 ] Ryan Blue commented on SPARK-16037: --- I agree that this behavior is correct according to SQL, but I

[jira] [Updated] (SPARK-16037) use by-position resolution when insert into hive table

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-16037: -- Description: INSERT INTO TABLE src SELECT 1, 2 AS c, 3 AS b; The result is 1, 3, 2 for hive table,

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340185#comment-15340185 ] Ryan Blue commented on SPARK-16032: --- Why does the DDL for the data source table differ from the Hive

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340138#comment-15340138 ] Ryan Blue commented on SPARK-16032: --- [~yhuai], thanks for pinging me. I'll take a look at this today.

[jira] [Updated] (SPARK-15725) Dynamic allocation hangs YARN app when executors time out

2016-06-02 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-15725: -- Description: We've had a problem with dynamic allocation and YARN (since 1.6) where a large stage

[jira] [Updated] (SPARK-15725) Dynamic allocation hangs YARN app when executors time out

2016-06-01 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-15725: -- Description: We've had a problem with dynamic allocation and YARN (since 1.6) where a large stage

[jira] [Commented] (SPARK-15725) Dynamic allocation hangs YARN app when executors time out

2016-06-01 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311567#comment-15311567 ] Ryan Blue commented on SPARK-15725: --- I'm linking to a work-around that ensures the AM thread that

[jira] [Created] (SPARK-15725) Dynamic allocation hangs YARN app when executors time out

2016-06-01 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-15725: - Summary: Dynamic allocation hangs YARN app when executors time out Key: SPARK-15725 URL: https://issues.apache.org/jira/browse/SPARK-15725 Project: Spark Issue

[jira] [Commented] (SPARK-13723) YARN - Change behavior of --num-executors when spark.dynamicAllocation.enabled true

2016-05-26 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303103#comment-15303103 ] Ryan Blue commented on SPARK-13723: --- I'm porting our changes forward to the 2.0.0 preview so I opened a

[jira] [Commented] (SPARK-15455) For IsolatedClientLoader, we need to provide a conf to disable sharing Hadoop classes

2016-05-23 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296640#comment-15296640 ] Ryan Blue commented on SPARK-15455: --- Why can't Hive use a newer version of the Hadoop classes, if those

[jira] [Created] (SPARK-15420) Repartition and sort before Parquet writes

2016-05-19 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-15420: - Summary: Repartition and sort before Parquet writes Key: SPARK-15420 URL: https://issues.apache.org/jira/browse/SPARK-15420 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-14459) SQL partitioning must match existing tables, but is not checked.

2016-05-09 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276493#comment-15276493 ] Ryan Blue commented on SPARK-14459: --- Thank you [~lian cheng]! > SQL partitioning must match existing

[jira] [Issue Comment Deleted] (SPARK-14797) Spark SQL should not hardcode dependency on spark-sketch_2.11

2016-04-22 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-14797: -- Comment: was deleted (was: [~joshrosen], I can no longer build using maven after this commit. I'm

[jira] [Commented] (SPARK-14797) Spark SQL should not hardcode dependency on spark-sketch_2.11

2016-04-22 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254746#comment-15254746 ] Ryan Blue commented on SPARK-14797: --- [~joshrosen], I can no longer build using maven after this commit.

[jira] [Created] (SPARK-14679) UI DAG visualization causes OOM generating data

2016-04-15 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-14679: - Summary: UI DAG visualization causes OOM generating data Key: SPARK-14679 URL: https://issues.apache.org/jira/browse/SPARK-14679 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-14543) SQL/Hive insertInto has unexpected results

2016-04-11 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-14543: - Summary: SQL/Hive insertInto has unexpected results Key: SPARK-14543 URL: https://issues.apache.org/jira/browse/SPARK-14543 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-14459) SQL partitioning must match existing tables, but is not checked.

2016-04-07 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-14459: -- Description: Writing into partitioned Hive tables has unexpected results because the table's

[jira] [Created] (SPARK-14459) SQL partitioning must match existing tables, but is not checked.

2016-04-07 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-14459: - Summary: SQL partitioning must match existing tables, but is not checked. Key: SPARK-14459 URL: https://issues.apache.org/jira/browse/SPARK-14459 Project: Spark

[jira] [Commented] (SPARK-13723) YARN - Change behavior of --num-executors when spark.dynamicAllocation.enabled true

2016-03-30 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218800#comment-15218800 ] Ryan Blue commented on SPARK-13723: --- +1 > YARN - Change behavior of --num-executors when >

[jira] [Commented] (SPARK-13723) YARN - Change behavior of --num-executors when spark.dynamicAllocation.enabled true

2016-03-30 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218348#comment-15218348 ] Ryan Blue commented on SPARK-13723: --- I agree with this suggestion. The problem with leaving the current

[jira] [Updated] (SPARK-13779) YarnAllocator cancels and resubmits container requests with no locality preference

2016-03-09 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-13779: -- Description: SPARK-9817 attempts to improve locality by considering the set of pending container

[jira] [Updated] (SPARK-13779) YarnAllocator cancels and resubmits container requests with no locality preference

2016-03-09 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-13779: -- Description: SPARK-9817 attempts to improve locality by considering the set of pending container

[jira] [Created] (SPARK-13779) YarnAllocator cancels and resubmits container requests with no locality preference

2016-03-09 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-13779: - Summary: YarnAllocator cancels and resubmits container requests with no locality preference Key: SPARK-13779 URL: https://issues.apache.org/jira/browse/SPARK-13779

[jira] [Commented] (SPARK-13496) Optimizing count distinct changes the resulting column name

2016-03-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182332#comment-15182332 ] Ryan Blue commented on SPARK-13496: --- I wouldn't say this is a duplicate, though I'm fine with

[jira] [Updated] (SPARK-13688) Add option to use dynamic allocation even if spark.executor.instances is set.

2016-03-04 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-13688: -- Affects Version/s: 1.6.0 > Add option to use dynamic allocation even if spark.executor.instances is

[jira] [Updated] (SPARK-13688) Add option to use dynamic allocation even if spark.executor.instances is set.

2016-03-04 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-13688: -- Component/s: YARN > Add option to use dynamic allocation even if spark.executor.instances is set. >

[jira] [Created] (SPARK-13688) Add option to use dynamic allocation even if spark.executor.instances is set.

2016-03-04 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-13688: - Summary: Add option to use dynamic allocation even if spark.executor.instances is set. Key: SPARK-13688 URL: https://issues.apache.org/jira/browse/SPARK-13688 Project:

[jira] [Commented] (SPARK-13496) Optimizing count distinct changes the resulting column name

2016-02-29 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172311#comment-15172311 ] Ryan Blue commented on SPARK-13496: --- [~smilegator], do you know what commit fixed this problem? If so,

[jira] [Created] (SPARK-13496) Optimizing count distinct changes the resulting column name

2016-02-25 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-13496: - Summary: Optimizing count distinct changes the resulting column name Key: SPARK-13496 URL: https://issues.apache.org/jira/browse/SPARK-13496 Project: Spark Issue

[jira] [Commented] (SPARK-10001) Allow Ctrl-C in spark-shell to kill running job

2016-02-22 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157316#comment-15157316 ] Ryan Blue commented on SPARK-10001: --- Sorry about suggesting to fork the discussion. I just want to

[jira] [Commented] (SPARK-10001) Allow Ctrl-C in spark-shell to kill running job

2016-02-19 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154885#comment-15154885 ] Ryan Blue commented on SPARK-10001: --- bq. I . . . am uneasy about adopting unusual semantics for a

[jira] [Created] (SPARK-13403) HiveConf used for SparkSQL is not based on the Hadoop configuration

2016-02-19 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-13403: - Summary: HiveConf used for SparkSQL is not based on the Hadoop configuration Key: SPARK-13403 URL: https://issues.apache.org/jira/browse/SPARK-13403 Project: Spark

[jira] [Commented] (SPARK-9926) Parallelize file listing for partitioned Hive table

2016-02-17 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151211#comment-15151211 ] Ryan Blue commented on SPARK-9926: -- I've just posted [PR

[jira] [Commented] (SPARK-10340) Use S3 bulk listing for S3-backed Hive tables

2016-02-17 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150799#comment-15150799 ] Ryan Blue commented on SPARK-10340: --- From discussion on the pull request, it looks like the solution

[jira] [Created] (SPARK-12297) Add work-around for Parquet/Hive int96 timestamp bug.

2015-12-11 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-12297: - Summary: Add work-around for Parquet/Hive int96 timestamp bug. Key: SPARK-12297 URL: https://issues.apache.org/jira/browse/SPARK-12297 Project: Spark Issue Type:

[jira] [Commented] (SPARK-10143) Parquet changed the behavior of calculating splits

2015-08-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706875#comment-14706875 ] Ryan Blue commented on SPARK-10143: --- [~yhuai], you're right that the input format now

[jira] [Commented] (SPARK-10143) Parquet changed the behavior of calculating splits

2015-08-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707414#comment-14707414 ] Ryan Blue commented on SPARK-10143: --- [~yhuai], yes, you'd want to determine the number

[jira] [Commented] (SPARK-10143) Parquet changed the behavior of calculating splits

2015-08-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707148#comment-14707148 ] Ryan Blue commented on SPARK-10143: --- [~yhuai] if you do that, you will get the current

[jira] [Commented] (SPARK-10143) Parquet changed the behavior of calculating splits

2015-08-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707222#comment-14707222 ] Ryan Blue commented on SPARK-10143: --- I think you're going to end up assuming every row

[jira] [Commented] (SPARK-9340) ParquetTypeConverter incorrectly handling of repeated types results in schema mismatch

2015-08-10 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680355#comment-14680355 ] Ryan Blue commented on SPARK-9340: -- Sorry to jump in late on this issue... I think you're

[jira] [Commented] (SPARK-9340) CatalystSchemaConverter and CatalystRowConverter don't handle unannotated repeated fields correctly

2015-08-10 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680978#comment-14680978 ] Ryan Blue commented on SPARK-9340: -- I just took a look at PR #8070 and it looks good to
