[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-26 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15607872#comment-15607872 ] Franck Tago commented on SPARK-15616: - Hi Wondering if there are any updates on th

[jira] [Comment Edited] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-26 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15607872#comment-15607872 ] Franck Tago edited comment on SPARK-15616 at 10/26/16 11:20 PM: ---

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-26 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15610024#comment-15610024 ] Franck Tago commented on SPARK-15616: - Does spark currently support pruning statist

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-27 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15612937#comment-15612937 ] Franck Tago commented on SPARK-15616: - Hi In my case the filter is on a partition

[jira] [Comment Edited] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-27 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15612937#comment-15612937 ] Franck Tago edited comment on SPARK-15616 at 10/27/16 7:35 PM:

[jira] [Commented] (SPARK-17612) Support `DESCRIBE table PARTITION` SQL syntax

2016-10-28 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616376#comment-15616376 ] Franck Tago commented on SPARK-17612: - Hi Basically I have an issue where I am per

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-28 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616385#comment-15616385 ] Franck Tago commented on SPARK-15616: - So I tried the changes that were made for this

[jira] [Commented] (SPARK-17612) Support `DESCRIBE table PARTITION` SQL syntax

2016-10-28 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616542#comment-15616542 ] Franck Tago commented on SPARK-17612: - Hi Thanks for the quick reply. So I was only

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-30 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15620750#comment-15620750 ] Franck Tago commented on SPARK-15616: - So I was not able to use the changes for the fo

[jira] [Commented] (SPARK-17982) Spark 2.0.0 CREATE VIEW statement fails :: java.lang.RuntimeException: Failed to analyze the canonicalized SQL. It is possible there is a bug in Spark.

2016-11-01 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15627406#comment-15627406 ] Franck Tago commented on SPARK-17982: - Wanted to mention that I was able to successfu

[jira] [Commented] (SPARK-15214) Implement code generation for Generate

2022-05-02 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17531008#comment-17531008 ] Franck Tago commented on SPARK-15214: - it appears that code generation is not suppor

[jira] [Updated] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-05-10 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-23519: Attachment: image-2018-05-10-10-48-57-259.png > Create View Commands Fails with The view output (c

[jira] [Commented] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-05-10 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470843#comment-16470843 ] Franck Tago commented on SPARK-23519: - I do not agree with the 'typical database' cla

[jira] [Created] (SPARK-17758) Spark Aggregate function LAST returns null on an empty partition

2016-10-01 Thread Franck Tago (JIRA)
Franck Tago created SPARK-17758: --- Summary: Spark Aggregate function LAST returns null on an empty partition Key: SPARK-17758 URL: https://issues.apache.org/jira/browse/SPARK-17758 Project: Spark

[jira] [Commented] (SPARK-17758) Spark Aggregate function LAST returns null on an empty partition

2016-10-03 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543097#comment-15543097 ] Franck Tago commented on SPARK-17758: - I tested the behavior of the min and max funct

[jira] [Commented] (SPARK-17758) Spark Aggregate function LAST returns null on an empty partition

2016-10-03 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543488#comment-15543488 ] Franck Tago commented on SPARK-17758: - [~hvanhovell] It should return on null in this

[jira] [Comment Edited] (SPARK-17758) Spark Aggregate function LAST returns null on an empty partition

2016-10-03 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543488#comment-15543488 ] Franck Tago edited comment on SPARK-17758 at 10/3/16 11:04 PM:

[jira] [Commented] (SPARK-17758) Spark Aggregate function LAST returns null on an empty partition

2016-10-03 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543926#comment-15543926 ] Franck Tago commented on SPARK-17758: - is there any workaround that you could think o

[jira] [Commented] (SPARK-17758) Spark Aggregate function LAST returns null on an empty partition

2016-10-03 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544003#comment-15544003 ] Franck Tago commented on SPARK-17758: - Thanks for the pointer Is it possible to writ

[jira] [Commented] (SPARK-17758) Spark Aggregate function LAST returns null on an empty partition

2016-10-04 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15547292#comment-15547292 ] Franck Tago commented on SPARK-17758: - My first impression after taking a look at Fi

[jira] [Created] (SPARK-17859) persist should not impede with spark's ability to perform a broadcast join.

2016-10-10 Thread Franck Tago (JIRA)
Franck Tago created SPARK-17859: --- Summary: persist should not impede with spark's ability to perform a broadcast join. Key: SPARK-17859 URL: https://issues.apache.org/jira/browse/SPARK-17859 Project: Sp

[jira] [Created] (SPARK-17982) Spark 2.0.0 CREATE VIEW statement fails when select statement contains limit clause

2016-10-17 Thread Franck Tago (JIRA)
Franck Tago created SPARK-17982: --- Summary: Spark 2.0.0 CREATE VIEW statement fails when select statement contains limit clause Key: SPARK-17982 URL: https://issues.apache.org/jira/browse/SPARK-17982 Pr

[jira] [Commented] (SPARK-17982) Spark 2.0.0 CREATE VIEW statement fails when select statement contains limit clause

2016-10-17 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15584178#comment-15584178 ] Franck Tago commented on SPARK-17982: - == SQL == SELECT `gen_attr_0` AS `WHERE_ID`, `

[jira] [Commented] (SPARK-17982) Spark 2.0.0 CREATE VIEW statement fails when select statement contains limit clause

2016-10-18 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586309#comment-15586309 ] Franck Tago commented on SPARK-17982: - Hi, a couple of things that I want to point ou

[jira] [Comment Edited] (SPARK-17982) Spark 2.0.0 CREATE VIEW statement fails when select statement contains limit clause

2016-10-18 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586309#comment-15586309 ] Franck Tago edited comment on SPARK-17982 at 10/18/16 6:50 PM:

[jira] [Updated] (SPARK-17982) Spark 2.0.0 CREATE VIEW statement fails :: java.lang.RuntimeException: Failed to analyze the canonicalized SQL. It is possible there is a bug in Spark.

2016-10-18 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-17982: Summary: Spark 2.0.0 CREATE VIEW statement fails :: java.lang.RuntimeException: Failed to analyze

[jira] [Commented] (SPARK-17982) Spark 2.0.0 CREATE VIEW statement fails :: java.lang.RuntimeException: Failed to analyze the canonicalized SQL. It is possible there is a bug in Spark.

2016-10-18 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586918#comment-15586918 ] Franck Tago commented on SPARK-17982: - Updated the title of the Jira. I looked a

[jira] [Created] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-02-26 Thread Franck Tago (JIRA)
Franck Tago created SPARK-23519: --- Summary: Create View Commands Fails with The view output (col1,col1) contains duplicate column name Key: SPARK-23519 URL: https://issues.apache.org/jira/browse/SPARK-23519

[jira] [Updated] (SPARK-19809) NullPointerException on zero-size ORC file

2018-02-26 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-19809: Attachment: image-2018-02-26-20-29-49-410.png > NullPointerException on zero-size ORC file > --

[jira] [Updated] (SPARK-19809) NullPointerException on zero-size ORC file

2018-02-26 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-19809: Attachment: spark.sql.hive.convertMetastoreOrc.txt > NullPointerException on zero-size ORC file > -

[jira] [Commented] (SPARK-19809) NullPointerException on zero-size ORC file

2018-02-26 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378018#comment-16378018 ] Franck Tago commented on SPARK-19809: - Need a pointer on the following. Env: Spar

[jira] [Commented] (SPARK-19809) NullPointerException on zero-size ORC file

2018-02-26 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378028#comment-16378028 ] Franck Tago commented on SPARK-19809: - 1- I am kind of constrained to Spark 2.2.1 a

[jira] [Commented] (SPARK-19809) NullPointerException on zero-size ORC file

2018-02-26 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378046#comment-16378046 ] Franck Tago commented on SPARK-19809: - Just to confirm, your earlier comment referr

[jira] [Commented] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-03-06 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389150#comment-16389150 ] Franck Tago commented on SPARK-23519: - Any updates on this? > Create View Commands

[jira] [Updated] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-03-20 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-23519: Component/s: SQL > Create View Commands Fails with The view output (col1,col1) contains > duplica

[jira] [Updated] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-03-20 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-23519: Description: 1- Create and populate a Hive table. I did this in a Hive CLI session. [ not that t

[jira] [Comment Edited] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-03-20 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389150#comment-16389150 ] Franck Tago edited comment on SPARK-23519 at 3/21/18 3:29 AM: -

[jira] [Commented] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-04-18 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16443154#comment-16443154 ] Franck Tago commented on SPARK-23519: - thanks for the suggestion [~shahid] The issue

[jira] [Reopened] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2019-08-14 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago reopened SPARK-23519: - Ok Spark Community, I am sorry for being a pest about this, but I am re-opening this Jira because I

[jira] [Commented] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2019-08-26 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916334#comment-16916334 ] Franck Tago commented on SPARK-23519: - [~viirya] My mistake, I tested it with Orac

[jira] [Comment Edited] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2019-08-26 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916334#comment-16916334 ] Franck Tago edited comment on SPARK-23519 at 8/27/19 4:10 AM:

[jira] [Created] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen because it can easily cause OOM failures

2023-08-10 Thread Franck Tago (Jira)
Franck Tago created SPARK-44759: --- Summary: Do not combine multiple Generate nodes in the same WholeStageCodeGen because it can easily cause OOM failures Key: SPARK-44759 URL: https://issues.apache.org/jira/browse/S

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen because it can easily cause OOM failures

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Attachment: spark-verbosewithcodegenenabled > Do not combine multiple Generate nodes in the same W

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen because it can easily cause OOM failures

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Attachment: wholestagecodegen_wc1_debug_wholecodegen_passed > Do not combine multiple Generate nod

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen because it can easily cause OOM failures

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Attachment: (was: spark-verbosewithcodegenenabled) > Do not combine multiple Generate nodes in

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen because it can easily cause OOM failures

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Description: The generate node used to flatten array generally  produces an amount of output rows

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Summary: Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can

[jira] [Commented] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752864#comment-17752864 ] Franck Tago commented on SPARK-44759: - !image-2023-08-10-09-27-24-124.png! > Do not

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Attachment: image-2023-08-10-09-27-24-124.png > Do not combine multiple Generate nodes in the same

[jira] [Comment Edited] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752864#comment-17752864 ] Franck Tago edited comment on SPARK-44759 at 8/10/23 4:28 PM:

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Attachment: image-2023-08-10-09-29-24-804.png > Do not combine multiple Generate nodes in the same

[jira] [Commented] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752866#comment-17752866 ] Franck Tago commented on SPARK-44759: - WSCG  generated code for first Generate node 

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Attachment: image-2023-08-10-09-32-46-163.png > Do not combine multiple Generate nodes in the same

[jira] [Commented] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752868#comment-17752868 ] Franck Tago commented on SPARK-44759: - WSCG  generated code for second Generate node

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Attachment: image-2023-08-10-09-33-47-788.png > Do not combine multiple Generate nodes in the same

[jira] [Commented] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752870#comment-17752870 ] Franck Tago commented on SPARK-44759: - Spark Dag for the use case . The failure is f

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Description: The generate node used to flatten array generally  produces an amount of output rows

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Description: The generate node used to flatten array generally  produces an amount of output rows

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Affects Version/s: 3.4.1 3.4.0 3.3.2

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen nodebecause it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Description: This is an issue since the WSCG  implementation of the generate node.  The generate

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate nodes in the same WholeStageCodeGen node because it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Summary: Do not combine multiple Generate nodes in the same WholeStageCodeGen node because it can

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate operators in the same WholeStageCodeGen node because it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Summary: Do not combine multiple Generate operators in the same WholeStageCodeGen node because it

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate operators in the same WholeStageCodeGen node because it can easily cause OOM failures if arrays are relatively large

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Description: This is an issue since the WSCG  implementation of the generate node.  Because WSCG

[jira] [Created] (SPARK-44768) Improve WSCG handling of row buffer by accounting for executor memory

2023-08-10 Thread Franck Tago (Jira)
Franck Tago created SPARK-44768: --- Summary: Improve WSCG handling of row buffer by accounting for executor memory Key: SPARK-44768 URL: https://issues.apache.org/jira/browse/SPARK-44768 Project: Spark

[jira] [Updated] (SPARK-44768) Improve WSCG handling of row buffer by accounting for executor memory . Exploding nested arrays can easily lead to out of memory errors.

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44768: Summary: Improve WSCG handling of row buffer by accounting for executor memory . Exploding neste

[jira] [Updated] (SPARK-44768) Improve WSCG handling of row buffer by accounting for executor memory . Exploding nested arrays can easily lead to out of memory errors.

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44768: Attachment: spark-jira_wscg_code.txt > Improve WSCG handling of row buffer by accounting for execu

[jira] [Updated] (SPARK-44768) Improve WSCG handling of row buffer by accounting for executor memory . Exploding nested arrays can easily lead to out of memory errors.

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44768: Description: consider a scenario where you flatten  a nested array  // e.g you can use the follow

[jira] [Updated] (SPARK-44768) Improve WSCG handling of row buffer by accounting for executor memory . Exploding nested arrays can easily lead to out of memory errors.

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44768: Attachment: image-2023-08-10-20-32-55-684.png > Improve WSCG handling of row buffer by accounting

[jira] [Commented] (SPARK-44768) Improve WSCG handling of row buffer by accounting for executor memory . Exploding nested arrays can easily lead to out of memory errors.

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753035#comment-17753035 ] Franck Tago commented on SPARK-44768: - !image-2023-08-10-20-32-55-684.png! > Improv

[jira] [Updated] (SPARK-44768) Improve WSCG handling of row buffer by accounting for executor memory . Exploding nested arrays can easily lead to out of memory errors.

2023-08-10 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44768: Description: The code sample below is to showcase the wholestagecodegen generated when exploding

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate operators in the same WholeStageCodeGen node because it can easily cause OOM failures if arrays are relatively large

2023-08-12 Thread Franck Tago (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franck Tago updated SPARK-44759: Component/s: Deploy Spark Core > Do not combine multiple Generate operators in th

[jira] [Created] (SPARK-20153) Support Multiple aws credentials in order to access multiple Hive on S3 table in spark application

2017-03-29 Thread Franck Tago (JIRA)
Franck Tago created SPARK-20153: --- Summary: Support Multiple aws credentials in order to access multiple Hive on S3 table in spark application Key: SPARK-20153 URL: https://issues.apache.org/jira/browse/SPARK-20153

[jira] [Commented] (SPARK-20153) Support Multiple aws credentials in order to access multiple Hive on S3 table in spark application

2017-04-04 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955695#comment-15955695 ] Franck Tago commented on SPARK-20153: - oh I would definitely not consider including t

[jira] [Commented] (SPARK-20153) Support Multiple aws credentials in order to access multiple Hive on S3 table in spark application

2017-04-04 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956346#comment-15956346 ] Franck Tago commented on SPARK-20153: - ok thanks for the tips. It appears that EMR 5

[jira] [Created] (SPARK-20235) Hive on S3 s3:sse and non S3:sse buckets

2017-04-05 Thread Franck Tago (JIRA)
Franck Tago created SPARK-20235: --- Summary: Hive on S3 s3:sse and non S3:sse buckets Key: SPARK-20235 URL: https://issues.apache.org/jira/browse/SPARK-20235 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-20235) Hive on S3 s3:sse and non S3:sse buckets

2017-05-16 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013290#comment-16013290 ] Franck Tago commented on SPARK-20235: - was this comment meant for me? what does that