[jira] [Updated] (SPARK-26565) modify dev/create-release/release-build.sh to let jenkins build packages w/o publishing

2019-01-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp updated SPARK-26565: Description: about a year+ ago, we stopped publishing releases directly from jenkins... this

[jira] [Commented] (SPARK-26565) modify dev/create-release/release-build.sh to let jenkins build packages w/o publishing

2019-01-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737520#comment-16737520 ] shane knapp commented on SPARK-26565: - test build passed! > modify

[jira] [Created] (SPARK-26573) Python worker not reused with mapPartitions if not consuming iterator

2019-01-08 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-26573: Summary: Python worker not reused with mapPartitions if not consuming iterator Key: SPARK-26573 URL: https://issues.apache.org/jira/browse/SPARK-26573 Project: Spark

[jira] [Commented] (SPARK-26565) modify dev/create-release/release-build.sh to let jenkins build packages w/o publishing

2019-01-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737463#comment-16737463 ] shane knapp commented on SPARK-26565: - test build running:

[jira] [Commented] (SPARK-26565) modify dev/create-release/release-build.sh to let jenkins build packages w/o publishing

2019-01-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737420#comment-16737420 ] shane knapp commented on SPARK-26565: - ok, thanks for all of the clarification on this. i think i

[jira] [Assigned] (SPARK-24920) Spark should allow sharing netty's memory pools across all uses

2019-01-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-24920: - Assignee: Attila Zsolt Piros > Spark should allow sharing netty's memory pools across all uses

[jira] [Resolved] (SPARK-24920) Spark should allow sharing netty's memory pools across all uses

2019-01-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-24920. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23278

[jira] [Commented] (SPARK-26565) modify dev/create-release/release-build.sh to let jenkins build packages w/o publishing

2019-01-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737453#comment-16737453 ] shane knapp commented on SPARK-26565: - see: https://github.com/apache/spark/pull/23492 i also have

[jira] [Resolved] (SPARK-26563) Quick Start documentation provides example that doesn't work (Java)

2019-01-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26563. -- Resolution: Won't Fix > Quick Start documentation provides example that doesn't work (Java) >

[jira] [Assigned] (SPARK-26349) Pyspark should not accept insecure p4yj gateways

2019-01-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-26349: Assignee: Imran Rashid > Pyspark should not accept insecure p4yj gateways >

[jira] [Resolved] (SPARK-26349) Pyspark should not accept insecure p4yj gateways

2019-01-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-26349. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23441

[jira] [Resolved] (SPARK-26529) Add debug logs for confArchive when preparing local resource

2019-01-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26529. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23444

[jira] [Created] (SPARK-26574) Cloud sql stronge

2019-01-08 Thread Roufique Hossain (JIRA)
Roufique Hossain created SPARK-26574: Summary: Cloud sql stronge Key: SPARK-26574 URL: https://issues.apache.org/jira/browse/SPARK-26574 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-26529) Add debug logs for confArchive when preparing local resource

2019-01-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-26529: Assignee: liupengcheng > Add debug logs for confArchive when preparing local resource >

[jira] [Updated] (SPARK-26576) Broadcast hint not applied to partitioned Parquet table

2019-01-08 Thread John Zhuge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated SPARK-26576: --- Affects Version/s: 2.3.2 > Broadcast hint not applied to partitioned Parquet table >

[jira] [Assigned] (SPARK-26577) Add input optimizer when reading Hive table by SparkSQL

2019-01-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26577: Assignee: Apache Spark > Add input optimizer when reading Hive table by SparkSQL >

[jira] [Assigned] (SPARK-26577) Add input optimizer when reading Hive table by SparkSQL

2019-01-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26577: Assignee: (was: Apache Spark) > Add input optimizer when reading Hive table by

[jira] [Issue Comment Deleted] (SPARK-20901) Feature parity for ORC with Parquet

2019-01-08 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-20901: -- Comment: was deleted (was: about SPARK-19019,I resolved it I meet this problem and resolved

[jira] [Assigned] (SPARK-26578) Synchronize putBytes's memory allocation and putting block on memoryManager

2019-01-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26578: Assignee: (was: Apache Spark) > Synchronize putBytes's memory allocation and putting

[jira] [Created] (SPARK-26579) SparkML DecisionTree, how does the algorithm identify categorical features?

2019-01-08 Thread Xufeng Wang (JIRA)
Xufeng Wang created SPARK-26579: --- Summary: SparkML DecisionTree, how does the algorithm identify categorical features? Key: SPARK-26579 URL: https://issues.apache.org/jira/browse/SPARK-26579 Project:

[jira] [Created] (SPARK-26576) Broadcast hint not applied to partitioned Parquet table

2019-01-08 Thread John Zhuge (JIRA)
John Zhuge created SPARK-26576: -- Summary: Broadcast hint not applied to partitioned Parquet table Key: SPARK-26576 URL: https://issues.apache.org/jira/browse/SPARK-26576 Project: Spark Issue

[jira] [Commented] (SPARK-24374) SPIP: Support Barrier Execution Mode in Apache Spark

2019-01-08 Thread luzengxiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737849#comment-16737849 ] luzengxiang commented on SPARK-24374: - I am trying to embed MPI in barrier mode to support

[jira] [Assigned] (SPARK-26578) Synchronize putBytes's memory allocation and putting block on memoryManager

2019-01-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26578: Assignee: Apache Spark > Synchronize putBytes's memory allocation and putting block on

[jira] [Created] (SPARK-26580) remove Scala 2.11 hack for Scala UDF

2019-01-08 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-26580: --- Summary: remove Scala 2.11 hack for Scala UDF Key: SPARK-26580 URL: https://issues.apache.org/jira/browse/SPARK-26580 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-26571) Update Hive Serde mapping with canonical name of Parquet and Orc FileFormat

2019-01-08 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-26571. --- Resolution: Fixed Assignee: Gengliang Wang Fix Version/s: 3.0.0

[jira] [Commented] (SPARK-26576) Broadcast hint not applied to partitioned Parquet table

2019-01-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737769#comment-16737769 ] Wenchen Fan commented on SPARK-26576: - can you reproduce it with the master branch? There is a major

[jira] [Commented] (SPARK-25433) Add support for PEX in PySpark

2019-01-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737806#comment-16737806 ] Hyukjin Kwon commented on SPARK-25433: -- The blog is actually pretty cool > Add support for PEX in

[jira] [Commented] (SPARK-26573) Python worker not reused with mapPartitions if not consuming iterator

2019-01-08 Thread Yuanjian Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737815#comment-16737815 ] Yuanjian Li commented on SPARK-26573: - Leave some thoughts during the work in SPARK-26549. It's

[jira] [Comment Edited] (SPARK-24374) SPIP: Support Barrier Execution Mode in Apache Spark

2019-01-08 Thread luzengxiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737849#comment-16737849 ] luzengxiang edited comment on SPARK-24374 at 1/9/19 4:45 AM: - Hi [~mengxr]:

[jira] [Created] (SPARK-26578) Synchronize putBytes's memory allocation and putting block on memoryManager

2019-01-08 Thread SongYadong (JIRA)
SongYadong created SPARK-26578: -- Summary: Synchronize putBytes's memory allocation and putting block on memoryManager Key: SPARK-26578 URL: https://issues.apache.org/jira/browse/SPARK-26578 Project:

[jira] [Commented] (SPARK-26576) Broadcast hint not applied to partitioned Parquet table

2019-01-08 Thread John Zhuge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737765#comment-16737765 ] John Zhuge commented on SPARK-26576: The "ResolvedHint" node is removed by the following code

[jira] [Assigned] (SPARK-26549) PySpark worker reuse take no effect for parallelize xrange

2019-01-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-26549: Assignee: Yuanjian Li > PySpark worker reuse take no effect for parallelize xrange >

[jira] [Resolved] (SPARK-26549) PySpark worker reuse take no effect for parallelize xrange

2019-01-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26549. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23470

[jira] [Updated] (SPARK-26577) Add input optimizer when reading Hive table by SparkSQL

2019-01-08 Thread Deegue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deegue updated SPARK-26577: --- External issue URL: https://github.com/apache/spark/pull/23496 > Add input optimizer when reading Hive

[jira] [Created] (SPARK-26575) revisit the equality of NaN

2019-01-08 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-26575: --- Summary: revisit the equality of NaN Key: SPARK-26575 URL: https://issues.apache.org/jira/browse/SPARK-26575 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2019-01-08 Thread Fengyu Cao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737776#comment-16737776 ] Fengyu Cao commented on SPARK-26389: hmmm, we didn't configure HDFS after HDFS configured, temp

[jira] [Commented] (SPARK-25433) Add support for PEX in PySpark

2019-01-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737800#comment-16737800 ] Hyukjin Kwon commented on SPARK-25433: -- Thanks. > Add support for PEX in PySpark >

[jira] [Created] (SPARK-26577) Add input optimizer when reading Hive table by SparkSQL

2019-01-08 Thread Deegue (JIRA)
Deegue created SPARK-26577: -- Summary: Add input optimizer when reading Hive table by SparkSQL Key: SPARK-26577 URL: https://issues.apache.org/jira/browse/SPARK-26577 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-26323) check input types of ScalaUDF even if some inputs are of Any type

2019-01-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26323. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23275

[jira] [Updated] (SPARK-26572) Join on distinct column with monotonically_increasing_id produces wrong output

2019-01-08 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-26572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sören Reichardt updated SPARK-26572: Summary: Join on distinct column with monotonically_increasing_id produces wrong output

[jira] [Updated] (SPARK-26572) Join on distinct column with monotonically_increasing_id produced wrong output

2019-01-08 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-26572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sören Reichardt updated SPARK-26572: Description: When joining a table with projected monotonically_increasing_id column after

[jira] [Resolved] (SPARK-26172) Unify String Params' case-insensitivity in ML

2019-01-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26172. --- Resolution: Won't Fix > Unify String Params' case-insensitivity in ML >

[jira] [Commented] (SPARK-24009) spark2.3.0 INSERT OVERWRITE LOCAL DIRECTORY '/home/spark/aaaaab'

2019-01-08 Thread ant_nebula (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736929#comment-16736929 ] ant_nebula commented on SPARK-24009: any progress here? i met the same error too. > spark2.3.0

[jira] [Created] (SPARK-26568) Too many partitions may cause thriftServer frequently Full GC

2019-01-08 Thread zhoukang (JIRA)
zhoukang created SPARK-26568: Summary: Too many partitions may cause thriftServer frequently Full GC Key: SPARK-26568 URL: https://issues.apache.org/jira/browse/SPARK-26568 Project: Spark Issue

[jira] [Updated] (SPARK-26568) Too many partitions may cause thriftServer frequently Full GC

2019-01-08 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-26568: - Description: The reason is that: first we have a table with many partitions (maybe several

[jira] [Commented] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore

2019-01-08 Thread Vikram Singh Chandel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736854#comment-16736854 ] Vikram Singh Chandel commented on SPARK-13446: -- Guys any plan to merge it to release

[jira] [Assigned] (SPARK-26002) SQL date operators calculates with incorrect dayOfYears for dates before 1500-03-01

2019-01-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-26002: --- Assignee: Attila Zsolt Piros > SQL date operators calculates with incorrect dayOfYears for

[jira] [Resolved] (SPARK-26002) SQL date operators calculates with incorrect dayOfYears for dates before 1500-03-01

2019-01-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-26002. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23000

[jira] [Assigned] (SPARK-24522) Centralize code to deal with security-related HTTP features

2019-01-08 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reassigned SPARK-24522: Assignee: Marcelo Vanzin > Centralize code to deal with security-related HTTP features >

[jira] [Resolved] (SPARK-24522) Centralize code to deal with security-related HTTP features

2019-01-08 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-24522. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23302

[jira] [Commented] (SPARK-26509) Parquet DELTA_BYTE_ARRAY is not supported in Spark 2.x's Vectorized Reader

2019-01-08 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737284#comment-16737284 ] Yuming Wang commented on SPARK-26509: - How to reproduce it? > Parquet DELTA_BYTE_ARRAY is not

[jira] [Updated] (SPARK-26440) Show total CPU time across all tasks on stage pages

2019-01-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26440: -- [~seancxmao] this is not Major > Show total CPU time across all tasks on stage pages >

[jira] [Updated] (SPARK-26440) Show total CPU time across all tasks on stage pages

2019-01-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26440: -- Priority: Minor (was: Major) > Show total CPU time across all tasks on stage pages >

[jira] [Closed] (SPARK-25433) Add support for PEX in PySpark

2019-01-08 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-25433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Höring closed SPARK-25433. - > Add support for PEX in PySpark > -- > > Key:

[jira] [Commented] (SPARK-25433) Add support for PEX in PySpark

2019-01-08 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-25433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737263#comment-16737263 ] Fabian Höring commented on SPARK-25433: --- For more details I actually have written a blogpost about

[jira] [Commented] (SPARK-26570) Out of memory when InMemoryFileIndex bulkListLeafFiles

2019-01-08 Thread deshanxiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737061#comment-16737061 ] deshanxiao commented on SPARK-26570: !screenshot-1.png! > Out of memory when InMemoryFileIndex

[jira] [Updated] (SPARK-26570) Out of memory when InMemoryFileIndex bulkListLeafFiles

2019-01-08 Thread deshanxiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] deshanxiao updated SPARK-26570: --- Attachment: screenshot-1.png > Out of memory when InMemoryFileIndex bulkListLeafFiles >

[jira] [Created] (SPARK-26570) Out of memory when InMemoryFileIndex bulkListLeafFiles

2019-01-08 Thread deshanxiao (JIRA)
deshanxiao created SPARK-26570: -- Summary: Out of memory when InMemoryFileIndex bulkListLeafFiles Key: SPARK-26570 URL: https://issues.apache.org/jira/browse/SPARK-26570 Project: Spark Issue

[jira] [Updated] (SPARK-26570) Out of memory when InMemoryFileIndex bulkListLeafFiles

2019-01-08 Thread deshanxiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] deshanxiao updated SPARK-26570: --- Description: The *bulkListLeafFiles* will collect all filestatus in memory for every query which

[jira] [Created] (SPARK-26569) Fixed point for batch Operator Optimizations never reached when optimize logicalPlan

2019-01-08 Thread Chen Fan (JIRA)
Chen Fan created SPARK-26569: Summary: Fixed point for batch Operator Optimizations never reached when optimize logicalPlan Key: SPARK-26569 URL: https://issues.apache.org/jira/browse/SPARK-26569

[jira] [Updated] (SPARK-26569) Fixed point for batch Operator Optimizations never reached when optimize logicalPlan

2019-01-08 Thread Chen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Fan updated SPARK-26569: - Description: There is a somewhat complicated Spark app using the Dataset API that runs once a day, and I noticed the

[jira] [Commented] (SPARK-26346) Upgrade parquet to 1.11.0

2019-01-08 Thread Ken Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736917#comment-16736917 ] Ken Wang commented on SPARK-26346: -- +1 > Upgrade parquet to 1.11.0 > - > >

[jira] [Updated] (SPARK-26570) Out of memory when InMemoryFileIndex bulkListLeafFiles

2019-01-08 Thread deshanxiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] deshanxiao updated SPARK-26570: --- Description: The *bulkListLeafFiles* will collect all filestatus in memory for every query which

[jira] [Created] (SPARK-26571) Update Hive Serde mapping with canonical name of Parquet and Orc FileFormat

2019-01-08 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-26571: -- Summary: Update Hive Serde mapping with canonical name of Parquet and Orc FileFormat Key: SPARK-26571 URL: https://issues.apache.org/jira/browse/SPARK-26571

[jira] [Updated] (SPARK-26571) Update Hive Serde mapping with canonical name of Parquet and Orc FileFormat

2019-01-08 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-26571: --- Description: Currently the following queries will lead to wrong Hive Serde:

[jira] [Assigned] (SPARK-26571) Update Hive Serde mapping with canonical name of Parquet and Orc FileFormat

2019-01-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26571: Assignee: (was: Apache Spark) > Update Hive Serde mapping with canonical name of

[jira] [Assigned] (SPARK-26571) Update Hive Serde mapping with canonical name of Parquet and Orc FileFormat

2019-01-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26571: Assignee: Apache Spark > Update Hive Serde mapping with canonical name of Parquet and

[jira] [Created] (SPARK-26572) Join on distinct column with monotonically_increasing_id produced wrong output

2019-01-08 Thread JIRA
Sören Reichardt created SPARK-26572: --- Summary: Join on distinct column with monotonically_increasing_id produced wrong output Key: SPARK-26572 URL: https://issues.apache.org/jira/browse/SPARK-26572