[jira] [Commented] (SPARK-25364) a better way to handle vector index and sparsity in FeatureHasher implementation ?

2018-09-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607019#comment-16607019 ] Marco Gaido commented on SPARK-25364: - Seems you created 2 JIRAs which are the same, if that is the

[jira] [Commented] (SPARK-25317) MemoryBlock performance regression

2018-09-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604133#comment-16604133 ] Marco Gaido commented on SPARK-25317: - [~kiszk] sure, we can investigate further in the PR the root

[jira] [Commented] (SPARK-25317) MemoryBlock performance regression

2018-09-04 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603174#comment-16603174 ] Marco Gaido commented on SPARK-25317: - I think I have a fix for this. I can submit a PR if you want,

[jira] [Created] (LIVY-506) Dedicated thread for timeout checker

2018-09-03 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-506: Summary: Dedicated thread for timeout checker Key: LIVY-506 URL: https://issues.apache.org/jira/browse/LIVY-506 Project: Livy Issue Type: Sub-task

[jira] [Commented] (SPARK-25265) Fix memory leak vulnerability in Barrier Execution Mode

2018-08-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596221#comment-16596221 ] Marco Gaido commented on SPARK-25265: - Isn't this a duplicate of the next one? > Fix memory leak

[jira] [Commented] (SPARK-25219) KMeans Clustering - Text Data - Results are incorrect

2018-08-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596109#comment-16596109 ] Marco Gaido commented on SPARK-25219: - Well, there are many differences between Spark ML and SKLearn

[jira] [Commented] (SPARK-23622) Flaky Test: HiveClientSuites

2018-08-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595068#comment-16595068 ] Marco Gaido commented on SPARK-23622: - This failure became permanent in the last build (at least

[jira] [Created] (LIVY-503) More RPC classes used in thrifserver in a separate module

2018-08-28 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-503: Summary: More RPC classes used in thrifserver in a separate module Key: LIVY-503 URL: https://issues.apache.org/jira/browse/LIVY-503 Project: Livy Issue Type:

[jira] [Created] (LIVY-502) Cleanup Hive dependencies

2018-08-28 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-502: Summary: Cleanup Hive dependencies Key: LIVY-502 URL: https://issues.apache.org/jira/browse/LIVY-502 Project: Livy Issue Type: Sub-task Reporter: Marco

[jira] [Commented] (SPARK-25193) insert overwrite doesn't throw exception when drop old data fails

2018-08-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592609#comment-16592609 ] Marco Gaido commented on SPARK-25193: - Well, this I think is HIVE-12505. So it would need to be

[jira] [Commented] (SPARK-25219) KMeans Clustering - Text Data - Results are incorrect

2018-08-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591423#comment-16591423 ] Marco Gaido commented on SPARK-25219: - Hi [~VVasanth], a JIRA like this is very difficult to work

[jira] [Updated] (SPARK-25219) KMeans Clustering - Text Data - Results are incorrect

2018-08-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25219: Component/s: (was: Spark Submit) ML > KMeans Clustering - Text Data -

[jira] [Commented] (SPARK-25146) avg() returns null on some decimals

2018-08-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584016#comment-16584016 ] Marco Gaido commented on SPARK-25146: - No problem, thanks for reporting this anyway. > avg()

[jira] [Commented] (SPARK-25146) avg() returns null on some decimals

2018-08-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583936#comment-16583936 ] Marco Gaido commented on SPARK-25146: - This has been fixed by SPARK-24957. On the current master

[jira] [Resolved] (SPARK-25146) avg() returns null on some decimals

2018-08-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-25146. - Resolution: Duplicate > avg() returns null on some decimals >

[jira] [Commented] (SPARK-25145) Buffer size too small on spark.sql query with filterPushdown predicate=True

2018-08-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583928#comment-16583928 ] Marco Gaido commented on SPARK-25145: - cc [~dongjoon] > Buffer size too small on spark.sql query

[jira] [Commented] (SPARK-25138) Spark Shell should show the Scala prompt after initialization is complete

2018-08-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583585#comment-16583585 ] Marco Gaido commented on SPARK-25138: - [~smilegator] this is caused by SPARK-24418 and it is a

[jira] [Resolved] (SPARK-25138) Spark Shell should show the Scala prompt after initialization is complete

2018-08-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-25138. - Resolution: Duplicate > Spark Shell should show the Scala prompt after initialization is

[jira] [Commented] (SPARK-25093) CodeFormatter could avoid creating regex object again and again

2018-08-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582543#comment-16582543 ] Marco Gaido commented on SPARK-25093: - [~igreenfi] do you want to submit a PR for this? Otherwise I

[jira] [Commented] (SPARK-25031) The schema of MapType can not be printed correctly

2018-08-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582531#comment-16582531 ] Marco Gaido commented on SPARK-25031: - ^ kindly ping [~smilegator] > The schema of MapType can not

[jira] [Comment Edited] (SPARK-25125) Spark SQL percentile_approx takes longer than Hive version for large datasets

2018-08-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582383#comment-16582383 ] Marco Gaido edited comment on SPARK-25125 at 8/16/18 1:07 PM: -- I think this

[jira] [Comment Edited] (SPARK-25125) Spark SQL percentile_approx takes longer than Hive version for large datasets

2018-08-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582383#comment-16582383 ] Marco Gaido edited comment on SPARK-25125 at 8/16/18 1:07 PM: -- I think this

[jira] [Commented] (SPARK-25125) Spark SQL percentile_approx takes longer than Hive version for large datasets

2018-08-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582383#comment-16582383 ] Marco Gaido commented on SPARK-25125: - I think this may be a duplicate of SPARK-25125. [~myali] may

[jira] [Commented] (SPARK-23908) High-order function: transform(array, function) → array

2018-08-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581364#comment-16581364 ] Marco Gaido commented on SPARK-23908: - [~huaxingao] they are not exposed through the Scala API, so

[jira] [Commented] (LIVY-489) Expose a JDBC endpoint for Livy

2018-08-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/LIVY-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581075#comment-16581075 ] Marco Gaido commented on LIVY-489: -- Sure [~jerryshao], thank you. I am submitting the first PR for 2, 3,

[jira] [Created] (SPARK-25123) SimpleExprValue may cause the loss of a reference

2018-08-15 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25123: --- Summary: SimpleExprValue may cause the loss of a reference Key: SPARK-25123 URL: https://issues.apache.org/jira/browse/SPARK-25123 Project: Spark Issue Type:

[jira] [Commented] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579514#comment-16579514 ] Marco Gaido commented on SPARK-25051: - This was caused by the introduction of AnalysisBarrier. I

[jira] [Commented] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579444#comment-16579444 ] Marco Gaido commented on SPARK-25051: - cc [~jerryshao] shall we set it as a blocker for 2.3.2? >

[jira] [Updated] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25051: Labels: correctness (was: ) > where clause on dataset gives AnalysisException >

[jira] [Commented] (SPARK-24928) spark sql cross join running time too long

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578447#comment-16578447 ] Marco Gaido commented on SPARK-24928: - Actually this is a duplicate of SPARK-11982, which solved the

[jira] [Resolved] (SPARK-24928) spark sql cross join running time too long

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-24928. - Resolution: Duplicate > spark sql cross join running time too long >

[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578348#comment-16578348 ] Marco Gaido commented on SPARK-25094: - [~igreenfi] as I mentioned you, this is a known issue. You

[jira] [Commented] (LIVY-489) Expose a JDBC endpoint for Livy

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/LIVY-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578237#comment-16578237 ] Marco Gaido commented on LIVY-489: -- [~jerryshao] I created 5 subtasks for this. Hope they are reasonable

[jira] [Created] (LIVY-495) Add basic UI for thriftserver

2018-08-13 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-495: Summary: Add basic UI for thriftserver Key: LIVY-495 URL: https://issues.apache.org/jira/browse/LIVY-495 Project: Livy Issue Type: Sub-task Reporter:

[jira] [Created] (LIVY-494) Add thriftserver to Livy server

2018-08-13 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-494: Summary: Add thriftserver to Livy server Key: LIVY-494 URL: https://issues.apache.org/jira/browse/LIVY-494 Project: Livy Issue Type: Sub-task Reporter:

[jira] [Updated] (LIVY-493) Add UTs to the thriftserver module

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/LIVY-493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated LIVY-493: - Description: Tracks the implementation and addition of UT for the new Livy thriftserver. > Add UTs to the

[jira] [Created] (LIVY-493) Add UTs to the thriftserver module

2018-08-13 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-493: Summary: Add UTs to the thriftserver module Key: LIVY-493 URL: https://issues.apache.org/jira/browse/LIVY-493 Project: Livy Issue Type: Sub-task

[jira] [Created] (LIVY-492) Base implementation Livy thriftserver

2018-08-13 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-492: Summary: Base implementation Livy thriftserver Key: LIVY-492 URL: https://issues.apache.org/jira/browse/LIVY-492 Project: Livy Issue Type: Sub-task

[jira] [Closed] (LIVY-490) Add thriftserver module

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/LIVY-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido closed LIVY-490. Resolution: Duplicate > Add thriftserver module > --- > > Key: LIVY-490

[jira] [Created] (LIVY-491) Add thriftserver module

2018-08-13 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-491: Summary: Add thriftserver module Key: LIVY-491 URL: https://issues.apache.org/jira/browse/LIVY-491 Project: Livy Issue Type: Sub-task Components: Server

[jira] [Created] (LIVY-490) Add thriftserver module

2018-08-13 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-490: Summary: Add thriftserver module Key: LIVY-490 URL: https://issues.apache.org/jira/browse/LIVY-490 Project: Livy Issue Type: Sub-task Reporter: Marco

[jira] [Commented] (LIVY-489) Expose a JDBC endpoint for Livy

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/LIVY-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578226#comment-16578226 ] Marco Gaido commented on LIVY-489: -- Sure [~jerryshao], the branch is

[jira] [Commented] (SPARK-25093) CodeFormatter could avoid creating regex object again and again

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578038#comment-16578038 ] Marco Gaido commented on SPARK-25093: - I just marked this as a minor priority ticket, anyway I agree

[jira] [Updated] (SPARK-25093) CodeFormatter could avoid creating regex object again and again

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25093: Priority: Minor (was: Major) > CodeFormatter could avoid creating regex object again and again >

[jira] [Commented] (LIVY-489) Expose a JDBC endpoint for Livy

2018-08-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/LIVY-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577995#comment-16577995 ] Marco Gaido commented on LIVY-489: -- Hi [~jerryshao]. Thanks for your comment. Unfortunately I am not sure

[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb

2018-08-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577593#comment-16577593 ] Marco Gaido commented on SPARK-25094: - This is a duplicate of many. Unfortunately this problem has

[jira] [Updated] (LIVY-489) Expose a JDBC endpoint for Livy

2018-08-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/LIVY-489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated LIVY-489: - Description: Many users and BI tools use JDBC connections in order to retrieve data. As Livy exposes only

[jira] [Created] (LIVY-489) Expose a JDBC endpoint for Livy

2018-08-09 Thread Marco Gaido (JIRA)
Marco Gaido created LIVY-489: Summary: Expose a JDBC endpoint for Livy Key: LIVY-489 URL: https://issues.apache.org/jira/browse/LIVY-489 Project: Livy Issue Type: New Feature

[jira] [Commented] (SPARK-25031) The schema of MapType can not be printed correctly

2018-08-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573299#comment-16573299 ] Marco Gaido commented on SPARK-25031: - [~smilegator] shall this be resolved as

[jira] [Created] (SPARK-25042) Flaky test: org.apache.spark.streaming.kafka010.KafkaRDDSuite.compacted topic

2018-08-07 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25042: --- Summary: Flaky test: org.apache.spark.streaming.kafka010.KafkaRDDSuite.compacted topic Key: SPARK-25042 URL: https://issues.apache.org/jira/browse/SPARK-25042 Project:

[jira] [Comment Edited] (SPARK-24928) spark sql cross join running time too long

2018-08-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570289#comment-16570289 ] Marco Gaido edited comment on SPARK-24928 at 8/6/18 4:13 PM: -

[jira] [Comment Edited] (SPARK-24928) spark sql cross join running time too long

2018-08-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570289#comment-16570289 ] Marco Gaido edited comment on SPARK-24928 at 8/6/18 2:45 PM: -

[jira] [Commented] (SPARK-24928) spark sql cross join running time too long

2018-08-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570289#comment-16570289 ] Marco Gaido commented on SPARK-24928: - [~matthewnormyle] the fix you are proposing doesn't solve the

[jira] [Commented] (SPARK-25012) dataframe creation results in matcherror

2018-08-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570022#comment-16570022 ] Marco Gaido commented on SPARK-25012: - [~simm] you're right that the error message doesn't help and

[jira] [Commented] (SPARK-25012) dataframe creation results in matcherror

2018-08-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569960#comment-16569960 ] Marco Gaido commented on SPARK-25012: - Seems the same as SPARK-24366. Seems anyway a problem in you

[jira] [Commented] (SPARK-23937) High-order function: map_filter(map, function) → MAP

2018-08-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568152#comment-16568152 ] Marco Gaido commented on SPARK-23937: - I am working on this, thanks. > High-order function:

[jira] [Commented] (SPARK-24598) SPARK SQL:Datatype overflow conditions gives incorrect result

2018-08-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568005#comment-16568005 ] Marco Gaido commented on SPARK-24598: - [~smilegator] as we just enhanced the doc, but we have not

[jira] [Commented] (SPARK-24975) Spark history server REST API /api/v1/version returns error 404

2018-07-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563259#comment-16563259 ] Marco Gaido commented on SPARK-24975: - This seems a duplicate of SPARK-24188. Despite here I see

[jira] [Commented] (SPARK-24944) SparkUi build problem

2018-07-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561587#comment-16561587 ] Marco Gaido commented on SPARK-24944: - Can you close this JIRA as invalid? Thanks. > SparkUi build

[jira] [Commented] (SPARK-24957) Decimal arithmetic can lead to wrong values using codegen

2018-07-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561077#comment-16561077 ] Marco Gaido commented on SPARK-24957: - I am not sure what you mean by "When codegen is disabled all

[jira] [Created] (SPARK-24948) SHS filters wrongly some applications due to permission check

2018-07-27 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24948: --- Summary: SHS filters wrongly some applications due to permission check Key: SPARK-24948 URL: https://issues.apache.org/jira/browse/SPARK-24948 Project: Spark

[jira] [Commented] (SPARK-24944) SparkUi build problem

2018-07-27 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559790#comment-16559790 ] Marco Gaido commented on SPARK-24944: - This seems more a problem in your project and your

[jira] [Commented] (SPARK-24928) spark sql cross join running time too long

2018-07-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558287#comment-16558287 ] Marco Gaido commented on SPARK-24928: - The affected version is pretty old, can you check a newer

[jira] [Comment Edited] (SPARK-24904) Join with broadcasted dataframe causes shuffle of redundant data

2018-07-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555652#comment-16555652 ] Marco Gaido edited comment on SPARK-24904 at 7/25/18 1:28 PM: -- I see now

[jira] [Commented] (SPARK-24904) Join with broadcasted dataframe causes shuffle of redundant data

2018-07-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555842#comment-16555842 ] Marco Gaido commented on SPARK-24904: - [~shay_elbaz] In the case I mentioned before the approach you

[jira] [Commented] (SPARK-24904) Join with broadcasted dataframe causes shuffle of redundant data

2018-07-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555652#comment-16555652 ] Marco Gaido commented on SPARK-24904: - I see now what you mean, but yes, I think there is an

[jira] [Commented] (SPARK-24904) Join with broadcasted dataframe causes shuffle of redundant data

2018-07-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555477#comment-16555477 ] Marco Gaido commented on SPARK-24904: - You cannot do a broadcast join when it is on the side of the

[jira] [Commented] (SPARK-24498) Add JDK compiler for runtime codegen

2018-07-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16543336#comment-16543336 ] Marco Gaido commented on SPARK-24498: - [~maropu] yes, I remembered I had some troubles compiling the

[jira] [Created] (SPARK-24782) Simplify conf access in expressions

2018-07-11 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24782: --- Summary: Simplify conf access in expressions Key: SPARK-24782 URL: https://issues.apache.org/jira/browse/SPARK-24782 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24268: Description: In SPARK-22893 there was a tentative to unify the way dataTypes are reported in

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24268: Description: In SPARK-22893 there was a tentative to unify the way dataTypes are reported in

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24268: Description: In SPARK-22893 there was a tentative to unify the way dataTypes are reported in

[jira] [Commented] (SPARK-24745) Map function does not keep rdd name

2018-07-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538260#comment-16538260 ] Marco Gaido commented on SPARK-24745: - A RDD already has a unique ID. I think the name is just

[jira] [Commented] (SPARK-24745) Map function does not keep rdd name

2018-07-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537199#comment-16537199 ] Marco Gaido commented on SPARK-24745: - This makes sense, as the map operation creates a new RDD. So

[jira] [Commented] (SPARK-24719) ClusteringEvaluator supports integer type labels

2018-07-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536747#comment-16536747 ] Marco Gaido commented on SPARK-24719: - [~mengxr] any luck with this? Thanks. > ClusteringEvaluator

[jira] [Commented] (SPARK-24438) Empty strings and null strings are written to the same partition

2018-07-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536682#comment-16536682 ] Marco Gaido commented on SPARK-24438: - IIRC, Hive has a placeholder string

[jira] [Comment Edited] (SPARK-24438) Empty strings and null strings are written to the same partition

2018-07-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536682#comment-16536682 ] Marco Gaido edited comment on SPARK-24438 at 7/9/18 8:37 AM: - IIRC, Hive has

[jira] [Commented] (YARN-8385) Clean local directories when a container is killed

2018-07-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536677#comment-16536677 ] Marco Gaido commented on YARN-8385: --- Thanks for your answer [~jlowe]. As it is stated in the question on

[jira] [Commented] (KNOX-1362) Add documentation for the interaction with Spark History Server (SHS)

2018-07-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531989#comment-16531989 ] Marco Gaido commented on KNOX-1362: --- Thanks for your work [~smore]. Sure, no worries. Thank you. > Add

[jira] [Commented] (SPARK-24719) ClusteringEvaluator supports integer type labels

2018-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530493#comment-16530493 ] Marco Gaido commented on SPARK-24719: - [~mengxr] I tried to pass integer values in the prediction

[jira] [Commented] (SPARK-24719) ClusteringEvaluator supports integer type labels

2018-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530264#comment-16530264 ] Marco Gaido commented on SPARK-24719: - Sure, thanks. I'll submit a PR ASAP. > ClusteringEvaluator

[jira] [Resolved] (SPARK-24712) TrainValidationSplit ignores label column name and forces to be "label"

2018-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-24712. - Resolution: Not A Problem > TrainValidationSplit ignores label column name and forces to be

[jira] [Commented] (SPARK-24712) TrainValidationSplit ignores label column name and forces to be "label"

2018-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529746#comment-16529746 ] Marco Gaido commented on SPARK-24712: - The problem is that you have not set the label on the

[jira] [Commented] (SPARK-24208) Cannot resolve column in self join after applying Pandas UDF

2018-06-27 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525210#comment-16525210 ] Marco Gaido commented on SPARK-24208: - I think this may be a duplicate of SPARK-24373. Can you try

[jira] [Created] (SPARK-24660) SHS is not showing properly errors when downloading logs

2018-06-26 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24660: --- Summary: SHS is not showing properly errors when downloading logs Key: SPARK-24660 URL: https://issues.apache.org/jira/browse/SPARK-24660 Project: Spark Issue

[jira] [Updated] (KNOX-1362) Add documentation for the interaction with Spark History Server (SHS)

2018-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated KNOX-1362: -- Attachment: KNOX-1362.patch > Add documentation for the interaction with Spark History Server (SHS) >

[jira] [Commented] (KNOX-1362) Add documentation for the interaction with Spark History Server (SHS)

2018-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520314#comment-16520314 ] Marco Gaido commented on KNOX-1362: --- Thank you [~smore]! > Add documentation for the interaction with

[jira] [Commented] (SPARK-24498) Add JDK compiler for runtime codegen

2018-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520125#comment-16520125 ] Marco Gaido commented on SPARK-24498: - Thanks for your great analysis [~maropu]! Very interesting.

[jira] [Commented] (KNOX-1362) Add documentation for the interaction with SHS

2018-06-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519036#comment-16519036 ] Marco Gaido commented on KNOX-1362: --- [~lmccay] I created the issue as you suggested in KNOX-1354.

[jira] [Created] (KNOX-1362) Add documentation for the interaction with SHS

2018-06-21 Thread Marco Gaido (JIRA)
Marco Gaido created KNOX-1362: - Summary: Add documentation for the interaction with SHS Key: KNOX-1362 URL: https://issues.apache.org/jira/browse/KNOX-1362 Project: Apache Knox Issue Type:

[jira] [Commented] (KNOX-1315) Spark UI urls issue: Jobs, stdout/stderr and threadDump links

2018-06-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519032#comment-16519032 ] Marco Gaido commented on KNOX-1315: --- [~lmccay] this is actually a patch on YARN UI. As I am not an

[jira] [Commented] (SPARK-24607) Distribute by rand() can lead to data inconsistency

2018-06-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518196#comment-16518196 ] Marco Gaido commented on SPARK-24607: - [~viirya] please check the description in the Hive ticket.

[jira] [Updated] (SPARK-24606) Decimals multiplication and division may be null due to the result precision overflow

2018-06-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24606: Priority: Major (was: Blocker) > Decimals multiplication and division may be null due to the

[jira] [Commented] (SPARK-24606) Decimals multiplication and division may be null due to the result precision overflow

2018-06-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518130#comment-16518130 ] Marco Gaido commented on SPARK-24606: - Critical and Blocker are reserved for committers. Closing as

[jira] [Resolved] (SPARK-24606) Decimals multiplication and division may be null due to the result precision overflow

2018-06-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-24606. - Resolution: Duplicate > Decimals multiplication and division may be null due to the result

[jira] [Commented] (SPARK-23901) Data Masking Functions

2018-06-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514687#comment-16514687 ] Marco Gaido commented on SPARK-23901: - These functions can be used as any other function in Hive,

[jira] [Updated] (KNOX-1358) Create new version definition for SHS

2018-06-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated KNOX-1358: -- Attachment: KNOX-1358.patch > Create new version definition for SHS >

[jira] [Created] (KNOX-1358) Create new version definition for SHS

2018-06-15 Thread Marco Gaido (JIRA)
Marco Gaido created KNOX-1358: - Summary: Create new version definition for SHS Key: KNOX-1358 URL: https://issues.apache.org/jira/browse/KNOX-1358 Project: Apache Knox Issue Type: New Feature

[jira] [Commented] (KNOX-1353) SHS always showing link to incomplete applications

2018-06-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513503#comment-16513503 ] Marco Gaido commented on KNOX-1353: --- Sorry [~lmccay], I'll be more careful next time. Thanks. > SHS

[jira] [Created] (SPARK-24562) Allow running same tests with multiple configs in SQLQueryTestSuite

2018-06-14 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24562: --- Summary: Allow running same tests with multiple configs in SQLQueryTestSuite Key: SPARK-24562 URL: https://issues.apache.org/jira/browse/SPARK-24562 Project: Spark
