[jira] [Commented] (SPARK-17993) Spark prints an avalanche of warning messages from Parquet when reading parquet files written by older versions of Parquet-mr

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073141#comment-16073141 ] Sean Owen commented on SPARK-17993: --- [~jhpoelen] those are not files that this change touched? The

[jira] [Assigned] (SPARK-21285) VectorAssembler should report the column name when data type used is not supported

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21285: Assignee: Apache Spark > VectorAssembler should report the column name when data type

[jira] [Assigned] (SPARK-21285) VectorAssembler should report the column name when data type used is not supported

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21285: Assignee: (was: Apache Spark) > VectorAssembler should report the column name when

[jira] [Commented] (SPARK-21285) VectorAssembler should report the column name when data type used is not supported

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073134#comment-16073134 ] Apache Spark commented on SPARK-21285: -- User 'facaiy' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-18877) Unable to read given csv data. Excepion: java.lang.IllegalArgumentException: requirement failed: Decimal precision 28 exceeds max precision 20

2017-07-03 Thread Navya Krishnappa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072240#comment-16072240 ] Navya Krishnappa edited comment on SPARK-18877 at 7/4/17 5:42 AM: --

[jira] [Comment Edited] (SPARK-21289) Text and CSV formats do not support custom end-of-line delimiters

2017-07-03 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073112#comment-16073112 ] Hyukjin Kwon edited comment on SPARK-21289 at 7/4/17 5:34 AM: -- There are all

[jira] [Commented] (SPARK-21285) VectorAssembler should report the column name when data type used is not supported

2017-07-03 Thread 颜发才
[ https://issues.apache.org/jira/browse/SPARK-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073125#comment-16073125 ] Yan Facai (颜发才) commented on SPARK-21285: - It seems easy, and I can work on this. >

[jira] [Resolved] (SPARK-21277) Spark is invoking an incorrect serializer after UDAF completion

2017-07-03 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21277. -- Resolution: Not A Bug > Spark is invoking an incorrect serializer after UDAF completion >

[jira] [Commented] (SPARK-17993) Spark prints an avalanche of warning messages from Parquet when reading parquet files written by older versions of Parquet-mr

2017-07-03 Thread Jorrit Poelen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073121#comment-16073121 ] Jorrit Poelen commented on SPARK-17993: --- Please note that this fix is not included in spark 2.1.1 .

[jira] [Commented] (SPARK-21289) Text and CSV formats do not support custom end-of-line delimiters

2017-07-03 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073112#comment-16073112 ] Hyukjin Kwon commented on SPARK-21289: -- There are all related information in the JIRA. Initially,

[jira] [Comment Edited] (SPARK-21288) Several files are missing in the results of the execution of the spark application.

2017-07-03 Thread Constantine Solovev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073108#comment-16073108 ] Constantine Solovev edited comment on SPARK-21288 at 7/4/17 5:05 AM: -

[jira] [Commented] (SPARK-21288) Several files are missing in the results of the execution of the spark application.

2017-07-03 Thread Constantine Solovev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073108#comment-16073108 ] Constantine Solovev commented on SPARK-21288: - Hello Sean Owen, I saw tickets

[jira] [Commented] (SPARK-21278) Upgrade to Py4J 0.10.5

2017-07-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073097#comment-16073097 ] Dongjoon Hyun commented on SPARK-21278: --- Due to [another Py4J

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Jerry Lam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073077#comment-16073077 ] Jerry Lam commented on SPARK-21109: --- The schema of the Dataset[my_case] is defined by the case class

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073072#comment-16073072 ] Liang-Chi Hsieh commented on SPARK-21109: - I'm not arguing anything...I just explain why the

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Jerry Lam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073071#comment-16073071 ] Jerry Lam commented on SPARK-21109: --- The update doc is unclear because case classes already ensure the

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Jerry Lam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073066#comment-16073066 ] Jerry Lam commented on SPARK-21109: --- Does the order of the columns part of the schema? It actually

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073053#comment-16073053 ] Liang-Chi Hsieh commented on SPARK-21109: - Oh. I see. The document is fixed recently by

[jira] [Comment Edited] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073047#comment-16073047 ] Liang-Chi Hsieh edited comment on SPARK-21109 at 7/4/17 2:59 AM: - You

[jira] [Comment Edited] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073047#comment-16073047 ] Liang-Chi Hsieh edited comment on SPARK-21109 at 7/4/17 2:55 AM: - You

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073048#comment-16073048 ] Liang-Chi Hsieh commented on SPARK-21109: - Not sure why the generated doc doesn't show it. But

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073047#comment-16073047 ] Liang-Chi Hsieh commented on SPARK-21109: - You claim that they have the same schema if we print

[jira] [Created] (SPARK-21297) Add State in 'Session Statistics' table and add count in 'JDBC/ODBC Server' page.

2017-07-03 Thread guoxiaolongzte (JIRA)
guoxiaolongzte created SPARK-21297: -- Summary: Add State in 'Session Statistics' table and add count in 'JDBC/ODBC Server' page. Key: SPARK-21297 URL: https://issues.apache.org/jira/browse/SPARK-21297

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Jerry Lam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073046#comment-16073046 ] Jerry Lam commented on SPARK-21109: --- I checked the scala doc for Dataset about union and it says only:

[jira] [Resolved] (SPARK-21264) Omitting columns with 'how' specified in join in PySpark throws NPE

2017-07-03 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-21264. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18484

[jira] [Assigned] (SPARK-21264) Omitting columns with 'how' specified in join in PySpark throws NPE

2017-07-03 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin reassigned SPARK-21264: - Assignee: Hyukjin Kwon > Omitting columns with 'how' specified in join in PySpark

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Jerry Lam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073037#comment-16073037 ] Jerry Lam commented on SPARK-21109: --- When I said they have the same schema is that they both contain

[jira] [Comment Edited] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073005#comment-16073005 ] Liang-Chi Hsieh edited comment on SPARK-21109 at 7/4/17 1:29 AM: - They

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073005#comment-16073005 ] Liang-Chi Hsieh commented on SPARK-21109: - They have the same schema? Let's print schema on the

[jira] [Resolved] (SPARK-21283) FileOutputStream should be created as append mode

2017-07-03 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21283. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18507

[jira] [Assigned] (SPARK-21283) FileOutputStream should be created as append mode

2017-07-03 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21283: --- Assignee: liuxian > FileOutputStream should be created as append mode >

[jira] [Assigned] (SPARK-21296) Avoid per-record type dispatch in PySpark createDataFrame schema verification

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21296: Assignee: Apache Spark > Avoid per-record type dispatch in PySpark createDataFrame schema

[jira] [Commented] (SPARK-19507) pyspark.sql.types._verify_type() exceptions too broad to debug collections or nested data

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073000#comment-16073000 ] Apache Spark commented on SPARK-19507: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Commented] (SPARK-21296) Avoid per-record type dispatch in PySpark createDataFrame schema verification

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073001#comment-16073001 ] Apache Spark commented on SPARK-21296: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-21296) Avoid per-record type dispatch in PySpark createDataFrame schema verification

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21296: Assignee: (was: Apache Spark) > Avoid per-record type dispatch in PySpark

[jira] [Created] (SPARK-21296) Avoid per-record type dispatch in PySpark createDataFrame schema verification

2017-07-03 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-21296: Summary: Avoid per-record type dispatch in PySpark createDataFrame schema verification Key: SPARK-21296 URL: https://issues.apache.org/jira/browse/SPARK-21296

[jira] [Commented] (SPARK-21295) Confusing error message for missing references

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072993#comment-16072993 ] Apache Spark commented on SPARK-21295: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21295) Confusing error message for missing references

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21295: Assignee: Apache Spark (was: Xiao Li) > Confusing error message for missing references >

[jira] [Assigned] (SPARK-21295) Confusing error message for missing references

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21295: Assignee: Xiao Li (was: Apache Spark) > Confusing error message for missing references >

[jira] [Updated] (SPARK-21295) Confusing error message for missing references

2017-07-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21295: Summary: Confusing error message for missing references (was: Use qualified name in the error message for

[jira] [Created] (SPARK-21295) Use qualified name in the error message for missing references

2017-07-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-21295: --- Summary: Use qualified name in the error message for missing references Key: SPARK-21295 URL: https://issues.apache.org/jira/browse/SPARK-21295 Project: Spark Issue

[jira] [Comment Edited] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Jerry Lam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072989#comment-16072989 ] Jerry Lam edited comment on SPARK-21109 at 7/4/17 12:24 AM: I'm not sure if I

[jira] [Reopened] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Jerry Lam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry Lam reopened SPARK-21109: --- I'm not sure if I understand your reply correctly but both data1 and data2 have the same schema if you

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072970#comment-16072970 ] Apache Spark commented on SPARK-16742: -- User 'mgummelt' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21293) R document update structured streaming

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung reassigned SPARK-21293: Assignee: Felix Cheung > R document update structured streaming >

[jira] [Resolved] (SPARK-21284) rename SessionCatalog.registerFunction parameter name

2017-07-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21284. - Resolution: Fixed Fix Version/s: 2.3.0 > rename SessionCatalog.registerFunction parameter name >

[jira] [Commented] (SPARK-21293) R document update structured streaming

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072771#comment-16072771 ] Felix Cheung commented on SPARK-21293: -- and remove `(experimental).` from R guide > R document

[jira] [Commented] (SPARK-21288) Several files are missing in the results of the execution of the spark application.

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072763#comment-16072763 ] Sean Owen commented on SPARK-21288: --- I don't think this is a Spark problem. See things like

[jira] [Closed] (SPARK-21294) R document Decision Tree in ML guide

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung closed SPARK-21294. > R document Decision Tree in ML guide > > > Key:

[jira] [Resolved] (SPARK-21294) R document Decision Tree in ML guide

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-21294. -- Resolution: Cannot Reproduce it's in master > R document Decision Tree in ML guide >

[jira] [Updated] (SPARK-21290) R document Programmatically Specifying the Schema in SQL guide

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21290: - Issue Type: Documentation (was: Bug) > R document Programmatically Specifying the Schema in SQL

[jira] [Updated] (SPARK-21291) R bucketBy partitionBy API

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21291: - Issue Type: Improvement (was: Bug) > R bucketBy partitionBy API > -- >

[jira] [Updated] (SPARK-21292) R document Catalog function metadata refresh

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21292: - Issue Type: Documentation (was: Bug) > R document Catalog function metadata refresh >

[jira] [Updated] (SPARK-21293) R document update structured streaming

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21293: - Issue Type: Documentation (was: Bug) > R document update structured streaming >

[jira] [Created] (SPARK-21294) R document Decision Tree in ML guide

2017-07-03 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-21294: Summary: R document Decision Tree in ML guide Key: SPARK-21294 URL: https://issues.apache.org/jira/browse/SPARK-21294 Project: Spark Issue Type:

[jira] [Updated] (SPARK-21294) R document Decision Tree in ML guide

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21294: - Description: 2.3.0 release Decision tree was: 2.3.0 release Decision tree classifier > R

[jira] [Commented] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072752#comment-16072752 ] Sean Owen commented on SPARK-21287: --- Yeah, maybe a method to validate or even set the fetch size that

[jira] [Created] (SPARK-21293) R document update structured streaming

2017-07-03 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-21293: Summary: R document update structured streaming Key: SPARK-21293 URL: https://issues.apache.org/jira/browse/SPARK-21293 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-21292) R document Catalog function metadata refresh

2017-07-03 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-21292: Summary: R document Catalog function metadata refresh Key: SPARK-21292 URL: https://issues.apache.org/jira/browse/SPARK-21292 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-21292) R document Catalog function metadata refresh

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung reassigned SPARK-21292: Assignee: Felix Cheung > R document Catalog function metadata refresh >

[jira] [Commented] (SPARK-21291) R bucketBy partitionBy API

2017-07-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072743#comment-16072743 ] Felix Cheung commented on SPARK-21291: -- and update the SQL programming guide > R bucketBy

[jira] [Created] (SPARK-21291) R bucketBy partitionBy API

2017-07-03 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-21291: Summary: R bucketBy partitionBy API Key: SPARK-21291 URL: https://issues.apache.org/jira/browse/SPARK-21291 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-21290) R document Programmatically Specifying the Schema in SQL guide

2017-07-03 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-21290: Summary: R document Programmatically Specifying the Schema in SQL guide Key: SPARK-21290 URL: https://issues.apache.org/jira/browse/SPARK-21290 Project: Spark

[jira] [Assigned] (SPARK-20073) Unexpected Cartesian product when using eqNullSafe in join with a derived table

2017-07-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-20073: --- Assignee: Takeshi Yamamuro > Unexpected Cartesian product when using eqNullSafe in join with a

[jira] [Resolved] (SPARK-20073) Unexpected Cartesian product when using eqNullSafe in join with a derived table

2017-07-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-20073. - Resolution: Fixed Fix Version/s: 2.3.0 > Unexpected Cartesian product when using eqNullSafe in

[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072734#comment-16072734 ] Sean Owen commented on SPARK-21280: --- Looking at the class, I don't think it was ever intended to be

[jira] [Commented] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072700#comment-16072700 ] Xiao Li commented on SPARK-21287: - This value is very specific to MySQL. Since we are supporting

[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-03 Thread Eran Moscovici (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072686#comment-16072686 ] Eran Moscovici commented on SPARK-21280: One could add a public c'tor to BloomFilter, add getters

[jira] [Created] (SPARK-21289) Text and CSV formats do not support custom end-of-line delimiters

2017-07-03 Thread Yevgen Galchenko (JIRA)
Yevgen Galchenko created SPARK-21289: Summary: Text and CSV formats do not support custom end-of-line delimiters Key: SPARK-21289 URL: https://issues.apache.org/jira/browse/SPARK-21289 Project:

[jira] [Commented] (SPARK-21281) cannot create empty typed array column

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072623#comment-16072623 ] Apache Spark commented on SPARK-21281: -- User 'maropu' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21281) cannot create empty typed array column

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21281: Assignee: (was: Apache Spark) > cannot create empty typed array column >

[jira] [Assigned] (SPARK-21281) cannot create empty typed array column

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21281: Assignee: Apache Spark > cannot create empty typed array column >

[jira] [Commented] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072506#comment-16072506 ] Sean Owen commented on SPARK-21287: --- I know, but this is what fetch size is supposed to control of

[jira] [Comment Edited] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072474#comment-16072474 ] Maciej Bryński edited comment on SPARK-21287 at 7/3/17 1:59 PM: Quote

[jira] [Commented] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072474#comment-16072474 ] Maciej Bryński commented on SPARK-21287: Quote {code} By default, ResultSets are completely

[jira] [Updated] (SPARK-21288) Several files are missing in the results of the execution of the spark application.

2017-07-03 Thread Constantin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Constantin updated SPARK-21288: --- Description: Spark application save into output folder not all files, for example only files from

[jira] [Created] (SPARK-21288) Several files are missing in the results of the execution of the spark application.

2017-07-03 Thread Constantin (JIRA)
Constantin created SPARK-21288: -- Summary: Several files are missing in the results of the execution of the spark application. Key: SPARK-21288 URL: https://issues.apache.org/jira/browse/SPARK-21288

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-07-03 Thread Leif Walsh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072429#comment-16072429 ] Leif Walsh commented on SPARK-21190: I believe we could also compute window indexes while we stream

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-07-03 Thread Leif Walsh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072424#comment-16072424 ] Leif Walsh commented on SPARK-21190: I figure we could address that by using shared memory, if we

[jira] [Commented] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072414#comment-16072414 ] Sean Owen commented on SPARK-21287: --- It's not supposed to do that right -- you're saying the MySQL

[jira] [Commented] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072404#comment-16072404 ] Maciej Bryński commented on SPARK-21287: No. It's not the same like setting 0 or 1. Every other

[jira] [Commented] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072388#comment-16072388 ] Sean Owen commented on SPARK-21287: --- Yeah, I'm familiar with this special value. Is it not equivalent

[jira] [Commented] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072385#comment-16072385 ] Apache Spark commented on SPARK-21287: -- User 'maver1ck' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21287: Assignee: (was: Apache Spark) > Cannot use Int.MIN_VALUE as Spark SQL fetchsize >

[jira] [Assigned] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21287: Assignee: Apache Spark > Cannot use Int.MIN_VALUE as Spark SQL fetchsize >

[jira] [Updated] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-21287: --- Description: The MySQL JDBC driver makes it possible to avoid storing the ResultSet in memory. We can do

[jira] [Updated] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-21287: --- Summary: Cannot use Int.MIN_VALUE as Spark SQL fetchsize (was: Cannot use Iint.MIN_VALUE as

[jira] [Created] (SPARK-21287) Cannot use Iint.MIN_VALUE as Spark SQL fetchsize

2017-07-03 Thread Maciej Bryński (JIRA)
Maciej Bryński created SPARK-21287: -- Summary: Cannot use Iint.MIN_VALUE as Spark SQL fetchsize Key: SPARK-21287 URL: https://issues.apache.org/jira/browse/SPARK-21287 Project: Spark Issue

[jira] [Commented] (SPARK-21206) the window slice of Dstream is wrong

2017-07-03 Thread Fei Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072315#comment-16072315 ] Fei Shao commented on SPARK-21206: -- [~srowen] Thanks for your time. I will check the code again. > the

[jira] [Resolved] (SPARK-21137) Spark reads many small files slowly off local filesystem

2017-07-03 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21137. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18441

[jira] [Assigned] (SPARK-21137) Spark reads many small files slowly off local filesystem

2017-07-03 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21137: --- Assignee: Sean Owen > Spark reads many small files slowly off local filesystem >

[jira] [Updated] (SPARK-21285) VectorAssembler should report the column name when data type used is not supported

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21285: -- Issue Type: Improvement (was: Bug) Agree, seems like a good improvement to the error. >

[jira] [Updated] (SPARK-21241) Add intercept to StreamingLinearRegressionWithSGD

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21241: -- Flags: (was: Patch) Labels: (was: patch) > Add intercept to StreamingLinearRegressionWithSGD

[jira] [Updated] (SPARK-21241) Add intercept to StreamingLinearRegressionWithSGD

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21241: -- Target Version/s: (was: 2.3.0) > Add intercept to StreamingLinearRegressionWithSGD >

[jira] [Updated] (SPARK-21093) Multiple gapply execution occasionally failed in SparkR

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21093: -- Fix Version/s: (was: 2.3.0) > Multiple gapply execution occasionally failed in SparkR >

[jira] [Updated] (SPARK-21232) New built-in SQL function - Data_Type

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21232: -- Fix Version/s: (was: 2.2.0) > New built-in SQL function - Data_Type >

[jira] [Updated] (SPARK-21239) Support WAL recover in windows

2017-07-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21239: -- Fix Version/s: (was: 2.2.1) (was: 2.1.2) > Support WAL recover in windows >
