[jira] [Updated] (SPARK-22232) Row objects in pyspark created using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Summary: Row objects in pyspark created using the `Row(**kwars)` syntax do not get serializ

[jira] [Updated] (SPARK-22233) filter out empty InputSplit in HadoopRDD

2017-10-09 Thread Lijia Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijia Liu updated SPARK-22233: -- Description: Sometimes, Hive will create an empty table with many empty files, Spark use the InputForm

[jira] [Updated] (SPARK-22233) filter out empty InputSplit in HadoopRDD

2017-10-09 Thread Lijia Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijia Liu updated SPARK-22233: -- Description: Sometimes, Hive will create an empty table with many empty files, Spark use the InputForm

[jira] [Commented] (SPARK-22199) Spark Job on YARN fails with executors "Slave registration failed"

2017-10-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198258#comment-16198258 ] Saisai Shao commented on SPARK-22199: - Can you please list the steps to reproduce thi

[jira] [Commented] (SPARK-21737) Create communication channel between arbitrary clients and the Spark AM in YARN mode

2017-10-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198204#comment-16198204 ] Saisai Shao commented on SPARK-21737: - Hi [~tgraves], I'm trying to understand the de

[jira] [Comment Edited] (SPARK-22192) An RDD of nested POJO objects cannot be converted into a DataFrame using SQLContext.createDataFrame API

2017-10-09 Thread Asif Hussain Shahid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191526#comment-16191526 ] Asif Hussain Shahid edited comment on SPARK-22192 at 10/10/17 5:46 AM:

[jira] [Created] (SPARK-22233) filter out empty InputSplit in HadoopRDD

2017-10-09 Thread Lijia Liu (JIRA)
Lijia Liu created SPARK-22233: - Summary: filter out empty InputSplit in HadoopRDD Key: SPARK-22233 URL: https://issues.apache.org/jira/browse/SPARK-22233 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-18492) GeneratedIterator grows beyond 64 KB

2017-10-09 Thread Muhammad Ramzan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198192#comment-16198192 ] Muhammad Ramzan commented on SPARK-18492: - I am using spark 2.1.1 on a production

[jira] [Resolved] (SPARK-22225) wholeTextFilesIterators

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-22225. -- Resolution: Won't Fix I think we are able to do it and don't think it's worth. Let me resolve i

[jira] [Resolved] (SPARK-22222) Fix the ARRAY_MAX in BufferHolder and add a test

2017-10-09 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-22222. - Resolution: Fixed Assignee: Feng Liu Fix Version/s: 2.3.0 > Fix the ARRAY_MAX in BufferHo

[jira] [Commented] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198146#comment-16198146 ] Liang-Chi Hsieh commented on SPARK-22231: - Btw, the capacity to work on nested da

[jira] [Updated] (SPARK-8515) Improve ML attribute API

2017-10-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-8515: --- Attachment: (was: SPARK-8515.pdf) > Improve ML attribute API > >

[jira] [Updated] (SPARK-8515) Improve ML attribute API

2017-10-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-8515: --- Attachment: SPARK-8515.pdf > Improve ML attribute API > > >

[jira] [Comment Edited] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198096#comment-16198096 ] Liang-Chi Hsieh edited comment on SPARK-22231 at 10/10/17 3:28 AM:

[jira] [Commented] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198096#comment-16198096 ] Liang-Chi Hsieh commented on SPARK-22231: - Looks like `mapItems` is an API can wo

[jira] [Comment Edited] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198082#comment-16198082 ] Liang-Chi Hsieh edited comment on SPARK-22231 at 10/10/17 3:18 AM:

[jira] [Commented] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198082#comment-16198082 ] Liang-Chi Hsieh commented on SPARK-22231: - I think there is a typo in the second

[jira] [Commented] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198068#comment-16198068 ] Bago Amirbekian commented on SPARK-22232: - Full trace: {code:none} [Row(a=u'a',

[jira] [Updated] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Description: The fields in a Row object created from a dict (ie {{Row(**kwargs)}}) should b

[jira] [Updated] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Description: The fields in a Row object created from a dict (ie {{Row(**kwargs)}}) should b

[jira] [Updated] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Description: The fields in a Row object created from a dict (ie {{Row(**kwargs)}}) should b

[jira] [Commented] (SPARK-2243) Support multiple SparkContexts in the same JVM

2017-10-09 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198062#comment-16198062 ] ASF GitHub Bot commented on SPARK-2243: --- GitHub user 561152 opened a pull request:

[jira] [Updated] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Description: The fields in a Row object created from a dict (ie {{Row(**kwargs)}}) should b

[jira] [Updated] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Description: The fields in a Row object created from a dict (ie {{Row(**kwargs)}}) should b

[jira] [Updated] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Description: The fields in a Row object created from a dict (ie `Row(**kwargs)`) should be

[jira] [Updated] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Description: bq. The fields in a Row object created from a dict (ie `Row(**kwargs)`) should

[jira] [Updated] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Description: The fields in a Row object created from a dict (ie `Row(**kwargs)`) should be

[jira] [Created] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
Bago Amirbekian created SPARK-22232: --- Summary: Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly Key: SPARK-22232 URL: https://issues.apache.org/jira/browse/SPARK

[jira] [Updated] (SPARK-22232) Row objects in pyspark using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2017-10-09 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-22232: Description: The fields in a Row object created from a dict (ie `Row(**kwargs)`) should be

[jira] [Commented] (SPARK-22159) spark.sql.execution.arrow.enable and spark.sql.codegen.aggregate.map.twolevel.enable -> enabled

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198052#comment-16198052 ] Apache Spark commented on SPARK-22159: -- User 'ueshin' has created a pull request for

[jira] [Updated] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-09 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-22231: Description: At Netflix's algorithm team, we work on ranking problems to find the great content to fulfill

[jira] [Updated] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-09 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-22231: Description: At Netflix's algorithm team, we work on ranking problems to find the great content to fulfill

[jira] [Assigned] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-22230: Assignee: Jose Torres > agg(last('attr)) gives weird results for streaming > -

[jira] [Resolved] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22230. -- Resolution: Fixed > agg(last('attr)) gives weird results for streaming > --

[jira] [Comment Edited] (SPARK-22220) Spark SQL: LATERAL VIEW OUTER null pointer exception with GROUP BY

2017-10-09 Thread Dian Fay (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197899#comment-16197899 ] Dian Fay edited comment on SPARK-22220 at 10/9/17 11:55 PM: I

[jira] [Commented] (SPARK-22220) Spark SQL: LATERAL VIEW OUTER null pointer exception with GROUP BY

2017-10-09 Thread Dian Fay (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197899#comment-16197899 ] Dian Fay commented on SPARK-22220: -- I did find a slightly more detailed mention in the Z

[jira] [Commented] (SPARK-22220) Spark SQL: LATERAL VIEW OUTER null pointer exception with GROUP BY

2017-10-09 Thread Dian Fay (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197883#comment-16197883 ] Dian Fay commented on SPARK-22220: -- Your example does work and mine still fails, but I'm

[jira] [Updated] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22230: - Fix Version/s: 2.3.0 > agg(last('attr)) gives weird results for streaming > -

[jira] [Created] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-09 Thread DB Tsai (JIRA)
DB Tsai created SPARK-22231: --- Summary: Support of map, filter, withColumn, dropColumn in nested list of structures Key: SPARK-22231 URL: https://issues.apache.org/jira/browse/SPARK-22231 Project: Spark

[jira] [Resolved] (SPARK-22170) Broadcast join holds an extra copy of rows in driver memory

2017-10-09 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-22170. - Resolution: Fixed Assignee: Ryan Blue Fix Version/s: 2.3.0 > Broadcast join holds an extr

[jira] [Updated] (SPARK-22170) Broadcast join holds an extra copy of rows in driver memory

2017-10-09 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-22170: Issue Type: Improvement (was: Bug) > Broadcast join holds an extra copy of rows in driver memory > ---

[jira] [Issue Comment Deleted] (SPARK-20589) Allow limiting task concurrency per stage

2017-10-09 Thread Mani Vijayakumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mani Vijayakumar updated SPARK-20589: - Comment: was deleted (was: A comment with security level 'jira-users' was removed.) > Al

[jira] [Assigned] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22230: Assignee: Apache Spark > agg(last('attr)) gives weird results for streaming >

[jira] [Assigned] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22230: Assignee: (was: Apache Spark) > agg(last('attr)) gives weird results for streaming > -

[jira] [Commented] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197685#comment-16197685 ] Apache Spark commented on SPARK-22230: -- User 'joseph-torres' has created a pull requ

[jira] [Created] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Jose Torres (JIRA)
Jose Torres created SPARK-22230: --- Summary: agg(last('attr)) gives weird results for streaming Key: SPARK-22230 URL: https://issues.apache.org/jira/browse/SPARK-22230 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-19430) Cannot read external tables with VARCHAR columns if they're backed by ORC files written by Hive 1.2.1

2017-10-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-19430. --- Resolution: Duplicate This is resolved via SPARK-19459 in Spark 2.2. > Cannot read external

[jira] [Assigned] (SPARK-22222) Fix the ARRAY_MAX in BufferHolder and add a test

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22222: Assignee: Apache Spark > Fix the ARRAY_MAX in BufferHolder and add a test > --

[jira] [Commented] (SPARK-22222) Fix the ARRAY_MAX in BufferHolder and add a test

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197620#comment-16197620 ] Apache Spark commented on SPARK-22222: -- User 'liufengdb' has created a pull request

[jira] [Assigned] (SPARK-22222) Fix the ARRAY_MAX in BufferHolder and add a test

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22222: Assignee: (was: Apache Spark) > Fix the ARRAY_MAX in BufferHolder and add a test > ---

[jira] [Updated] (SPARK-22218) spark shuffle services fails to update secret on application re-attempts

2017-10-09 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-22218: --- Affects Version/s: (was: 2.2.0) 2.2.1 > spark shuffle services fai

[jira] [Resolved] (SPARK-22218) spark shuffle services fails to update secret on application re-attempts

2017-10-09 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-22218. Resolution: Fixed Assignee: Thomas Graves Fix Version/s: 2.3.0

[jira] [Resolved] (SPARK-21568) ConsoleProgressBar should only be enabled in shells

2017-10-09 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-21568. Resolution: Fixed Assignee: Dongjoon Hyun Fix Version/s: 2.3.0 > ConsolePro

[jira] [Assigned] (SPARK-20791) Use Apache Arrow to Improve Spark createDataFrame from Pandas.DataFrame

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20791: Assignee: Apache Spark > Use Apache Arrow to Improve Spark createDataFrame from Pandas.Dat

[jira] [Commented] (SPARK-20791) Use Apache Arrow to Improve Spark createDataFrame from Pandas.DataFrame

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197581#comment-16197581 ] Apache Spark commented on SPARK-20791: -- User 'BryanCutler' has created a pull reques

[jira] [Assigned] (SPARK-20791) Use Apache Arrow to Improve Spark createDataFrame from Pandas.DataFrame

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20791: Assignee: (was: Apache Spark) > Use Apache Arrow to Improve Spark createDataFrame from

[jira] [Updated] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine

2017-10-09 Thread Yuval Degani (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Degani updated SPARK-22229: - Attachment: SPARK-22229_SPIP_RDMA_Accelerated_Shuffle_Engine_Rev_1.0.pdf > SPIP: RDMA Accelerated

[jira] [Updated] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine

2017-10-09 Thread Yuval Degani (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Degani updated SPARK-22229: - Description: An RDMA-accelerated shuffle engine can provide enormous performance benefits to shu

[jira] [Created] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine

2017-10-09 Thread Yuval Degani (JIRA)
Yuval Degani created SPARK-22229: Summary: SPIP: RDMA Accelerated Shuffle Engine Key: SPARK-22229 URL: https://issues.apache.org/jira/browse/SPARK-22229 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-22228) Add support for Array so from_json can parse

2017-10-09 Thread kant kodali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kant kodali updated SPARK-22228: Description: {code:java} val inputDS = Seq("""["foo", "bar"]""").toDF {code} {code:java} inputDS.

[jira] [Updated] (SPARK-22228) Add support for Array so from_json can parse

2017-10-09 Thread kant kodali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kant kodali updated SPARK-22228: Description: {code:java} val inputDS = Seq("""["foo", "bar"]""").toDF {code} {code:java} inputDS.

[jira] [Updated] (SPARK-22228) Add support for Array so from_json can parse

2017-10-09 Thread kant kodali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kant kodali updated SPARK-22228: Description: {code:java} val inputDS = Seq("""["foo", "bar"]""").toDF {code} inputDS.printSchema()

[jira] [Created] (SPARK-22228) Add support for Array so from_json can parse

2017-10-09 Thread kant kodali (JIRA)
kant kodali created SPARK-22228: --- Summary: Add support for Array so from_json can parse Key: SPARK-22228 URL: https://issues.apache.org/jira/browse/SPARK-22228 Project: Spark Issue Type: Impro

[jira] [Updated] (SPARK-22228) Add support for Array so from_json can parse

2017-10-09 Thread kant kodali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kant kodali updated SPARK-22228: Description: `val inputDS = Seq("""["foo", "bar"]""").toDF` inputDS.printSchema() root |-- value

[jira] [Commented] (SPARK-1529) Support DFS based shuffle in addition to Netty shuffle

2017-10-09 Thread Karthik Natarajan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197449#comment-16197449 ] Karthik Natarajan commented on SPARK-1529: -- Hello [~rkannan82] Are there any upd

[jira] [Comment Edited] (SPARK-20589) Allow limiting task concurrency per stage

2017-10-09 Thread Michael Park (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197402#comment-16197402 ] Michael Park edited comment on SPARK-20589 at 10/9/17 5:56 PM:

[jira] [Commented] (SPARK-20589) Allow limiting task concurrency per stage

2017-10-09 Thread Michael Park (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197402#comment-16197402 ] Michael Park commented on SPARK-20589: -- Pardon my ignorance of the inner workings of

[jira] [Commented] (SPARK-22227) DiskBlockManager.getAllBlocks could fail if called during shuffle

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197303#comment-16197303 ] Apache Spark commented on SPARK-22227: -- User 'superbobry' has created a pull request

[jira] [Assigned] (SPARK-22227) DiskBlockManager.getAllBlocks could fail if called during shuffle

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22227: Assignee: (was: Apache Spark) > DiskBlockManager.getAllBlocks could fail if called dur

[jira] [Assigned] (SPARK-22227) DiskBlockManager.getAllBlocks could fail if called during shuffle

2017-10-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22227: Assignee: Apache Spark > DiskBlockManager.getAllBlocks could fail if called during shuffle

[jira] [Commented] (SPARK-22226) splitExpression can create too many method calls (generating a Constant Pool limit error)

2017-10-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197286#comment-16197286 ] Marco Gaido commented on SPARK-22226: - Exactly [~kiszk], sorry for the bad initial ti

[jira] [Commented] (SPARK-22227) DiskBlockManager.getAllBlocks could fail if called during shuffle

2017-10-09 Thread Sergei Lebedev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197275#comment-16197275 ] Sergei Lebedev commented on SPARK-22227: Sidenote: the trace above is caused by t

[jira] [Commented] (SPARK-22226) splitExpression can create too many method calls (generating a Constant Pool limit error)

2017-10-09 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197272#comment-16197272 ] Kazuaki Ishizaki commented on SPARK-22226: -- You are right. [This PR|https://gith

[jira] [Created] (SPARK-22227) DiskBlockManager.getAllBlocks could fail if called during shuffle

2017-10-09 Thread Sergei Lebedev (JIRA)
Sergei Lebedev created SPARK-22227: -- Summary: DiskBlockManager.getAllBlocks could fail if called during shuffle Key: SPARK-22227 URL: https://issues.apache.org/jira/browse/SPARK-22227 Project: Spark

[jira] [Updated] (SPARK-22226) splitExpression can create too many method calls (generating a Constant Pool limit error)

2017-10-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-22226: Summary: splitExpression can create too many method calls (generating a Constant Pool limit error)

[jira] [Commented] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197254#comment-16197254 ] Marco Gaido commented on SPARK-22226: - [~kiszk] I am not sure that the PR you mention

[jira] [Commented] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197164#comment-16197164 ] Kazuaki Ishizaki commented on SPARK-22226: -- [This PR|https://github.com/apache/s

[jira] [Commented] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197121#comment-16197121 ] Sean Owen commented on SPARK-22226: --- If it's truly different I'd try to edit the title/

[jira] [Commented] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197111#comment-16197111 ] Marco Gaido commented on SPARK-22226: - I am not sure about what the current open PR i

[jira] [Issue Comment Deleted] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-22226: -- Comment: was deleted (was: Would the resolution to the linked issue not resolve this? because it's alr

[jira] [Commented] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197082#comment-16197082 ] Sean Owen commented on SPARK-22226: --- Would the resolution to the linked issue not resol

[jira] [Commented] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197081#comment-16197081 ] Sean Owen commented on SPARK-22226: --- Would the resolution to the linked issue not resol

[jira] [Commented] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197036#comment-16197036 ] Marco Gaido commented on SPARK-22226: - [~srowen] I know that there are many ticket fo

[jira] [Resolved] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-22226. --- Resolution: Duplicate Duplicate of several > Code generation fails for dataframes with 10000 columns

[jira] [Created] (SPARK-22226) Code generation fails for dataframes with 10000 columns

2017-10-09 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-22226: --- Summary: Code generation fails for dataframes with 10000 columns Key: SPARK-22226 URL: https://issues.apache.org/jira/browse/SPARK-22226 Project: Spark Issue T

[jira] [Commented] (SPARK-22225) wholeTextFilesIterators

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196952#comment-16196952 ] Hyukjin Kwon commented on SPARK-22225: -- Couldn't we do this via {{sc.binaryFiles}} o

[jira] [Commented] (SPARK-18965) wholeTextFiles() is not able to read large files

2017-10-09 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196872#comment-16196872 ] sam commented on SPARK-18965: - [~pradeep_misra] [~srowen]. Yes it's a new feature. What we

[jira] [Created] (SPARK-22225) wholeTextFilesIterators

2017-10-09 Thread sam (JIRA)
sam created SPARK-22225: --- Summary: wholeTextFilesIterators Key: SPARK-22225 URL: https://issues.apache.org/jira/browse/SPARK-22225 Project: Spark Issue Type: New Feature Components: Spark Cor

[jira] [Commented] (SPARK-18170) Confusing error message when using rangeBetween without specifying an "orderBy"

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196839#comment-16196839 ] Hyukjin Kwon commented on SPARK-18170: -- {code} org.apache.spark.sql.AnalysisExceptio

[jira] [Commented] (SPARK-18233) Failed to deserialize the task

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196833#comment-16196833 ] Hyukjin Kwon commented on SPARK-18233: -- Hi [~davies], do you maybe remember how to r

[jira] [Updated] (SPARK-17952) SparkSession createDataFrame method throws exception for nested JavaBeans

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-17952: - Affects Version/s: 2.3.0 > SparkSession createDataFrame method throws exception for nested JavaBe

[jira] [Commented] (SPARK-17952) SparkSession createDataFrame method throws exception for nested JavaBeans

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196814#comment-16196814 ] Hyukjin Kwon commented on SPARK-17952: -- This still happens in the master. > SparkSe

[jira] [Resolved] (SPARK-17275) Flaky test: org.apache.spark.deploy.RPackageUtilsSuite.jars that don't exist are skipped and print warning

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-17275. -- Resolution: Not A Problem Let me resolve this for now but I will keep my eyes on builds and wil

[jira] [Resolved] (SPARK-17890) scala.ScalaReflectionException

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-17890. -- Resolution: Cannot Reproduce I can't reproduce this by both {{spark-submit}} and {{spark-shell}

[jira] [Updated] (SPARK-22192) An RDD of nested POJO objects cannot be converted into a DataFrame using SQLContext.createDataFrame API

2017-10-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-22192: -- Target Version/s: (was: 2.2.0) > An RDD of nested POJO objects cannot be converted into a DataFrame u

[jira] [Updated] (SPARK-22222) Fix the ARRAY_MAX in BufferHolder and add a test

2017-10-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-22222: -- Affects Version/s: (was: 2.2.1) 2.3.0 > Fix the ARRAY_MAX in BufferHolder an

[jira] [Updated] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-17877: - Affects Version/s: 2.3.0 > Can not checkpoint connectedComponents resulting graph > -

[jira] [Commented] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196748#comment-16196748 ] Hyukjin Kwon commented on SPARK-17877: -- I tested this and the last line returned {{f

[jira] [Commented] (SPARK-17820) Spark sqlContext.sql() performs only first insert for HiveQL "FROM target INSERT INTO dest" command to insert into multiple target tables from same source

2017-10-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196735#comment-16196735 ] Hyukjin Kwon commented on SPARK-17820: -- Hi [~kmbeyond], would you maybe be able to t

[jira] [Commented] (SPARK-13030) Change OneHotEncoder to Estimator

2017-10-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196717#comment-16196717 ] Weichen Xu commented on SPARK-13030: [~bago.amirbekian] Multi-column means generate s
