[jira] [Commented] (SPARK-25232) Support Full-Text Search in Spark SQL

2018-08-24 Thread Lijie Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592471#comment-16592471 ] Lijie Xu commented on SPARK-25232: -- Compared to RLIKE, full text search has more powerful query that is

[jira] [Comment Edited] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592453#comment-16592453 ] yucai edited comment on SPARK-25206 at 8/25/18 5:01 AM: {quote} # Vanilla Spark

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592460#comment-16592460 ] yucai commented on SPARK-25206: --- [~dongjoon] , thanks a lot for so many explanations, if we both agree to

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592453#comment-16592453 ] yucai commented on SPARK-25206: --- {quote} # Vanilla Spark 2.2.0 ~ 2.3.1 always returns NULL for Parquet

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592445#comment-16592445 ] Dongjoon Hyun commented on SPARK-25206: --- [~yucai]. First of all, I know your intention and support

[jira] [Commented] (SPARK-25232) Support Full-Text Search in Spark SQL

2018-08-24 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592442#comment-16592442 ] Takeshi Yamamuro commented on SPARK-25232: -- RLIKE is not enough? IMO they load data into

[jira] [Commented] (SPARK-25232) Support Full-Text Search in Spark SQL

2018-08-24 Thread Lijie Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592429#comment-16592429 ] Lijie Xu commented on SPARK-25232: -- [~maropu]  Full text search is available in popular relational

[jira] [Updated] (SPARK-25232) Support Full-Text Search in Spark SQL

2018-08-24 Thread Lijie Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Xu updated SPARK-25232: - Description: Full-text search (i.e., keyword search) is widely used in search engines and relational

[jira] [Comment Edited] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592425#comment-16592425 ] yucai edited comment on SPARK-25206 at 8/25/18 3:33 AM: [~dongjoon] , correct me

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592425#comment-16592425 ] yucai commented on SPARK-25206: --- [~dongjoon] , correct me if I am wrong. {code:java}

[jira] [Commented] (SPARK-25175) Case-insensitive field resolution when reading from ORC

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592419#comment-16592419 ] Dongjoon Hyun commented on SPARK-25175: --- [~seancxmao]. I know you are working, but could you give

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592407#comment-16592407 ] Dongjoon Hyun commented on SPARK-25206: --- Let me put it this way. Parquet returns `null` for all

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592406#comment-16592406 ] yucai commented on SPARK-25206: --- Not a simple duplication. Backport -SPARK-25132-, but without 

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592403#comment-16592403 ] Dongjoon Hyun commented on SPARK-25206: --- If this is only reporting SPARK-25132, we had better

[jira] [Updated] (SPARK-25132) Case-insensitive field resolution when reading from Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25132: -- Affects Version/s: 2.2.0 > Case-insensitive field resolution when reading from Parquet >

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592402#comment-16592402 ] Dongjoon Hyun commented on SPARK-25206: --- Yes. That's my point. This is a simple duplication of

[jira] [Commented] (SPARK-25225) Add support for "List"-Type columns

2018-08-24 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592397#comment-16592397 ] Takeshi Yamamuro commented on SPARK-25225: -- I don't exactly understand your scenario though,

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592392#comment-16592392 ] yucai commented on SPARK-25206: --- [~dongjoon] , the reason you see `null` without predicate pushdown, it is

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592390#comment-16592390 ] yucai commented on SPARK-25206: --- Link to SPARK-25132; this bug needs two PRs backported. > Wrong data may

[jira] [Commented] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592384#comment-16592384 ] yucai commented on SPARK-25206: --- [~dongjoon], I still think this bug is related to pushdown, but

[jira] [Updated] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-25-10-04-21-901.png > Wrong data may be returned for Parquet >

[jira] [Updated] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-25-09-54-53-219.png > Wrong data may be returned for Parquet >

[jira] [Resolved] (SPARK-25229) ExternalCatalogUtils.prunePartitionsByFilter throw an AnalysisException when partition name contains upper letter

2018-08-24 Thread Xiaochen Ouyang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaochen Ouyang resolved SPARK-25229. - Resolution: Not A Bug > ExternalCatalogUtils.prunePartitionsByFilter throw an

[jira] [Commented] (SPARK-25229) ExternalCatalogUtils.prunePartitionsByFilter throw an AnalysisException when partition name contains upper letter

2018-08-24 Thread Xiaochen Ouyang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592379#comment-16592379 ] Xiaochen Ouyang commented on SPARK-25229: - Please close it; I was mistaken. >

[jira] [Commented] (SPARK-25232) Support Full-Text Search in Spark SQL

2018-08-24 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592375#comment-16592375 ] Takeshi Yamamuro commented on SPARK-25232: -- The other database-like systems other than mysql

[jira] [Resolved] (SPARK-25223) Use a map to store values for NamedLambdaVariable.

2018-08-24 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-25223. --- Resolution: Won't Do > Use a map to store values for NamedLambdaVariable. >

[jira] [Updated] (SPARK-25206) Wrong data may be returned for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25206: -- Summary: Wrong data may be returned for Parquet (was: Wrong data may be returned when enable

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25206: -- Description: In current Spark 2.3.1, below query returns wrong data silently. {code:java}

[jira] [Comment Edited] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592320#comment-16592320 ] Dongjoon Hyun edited comment on SPARK-25206 at 8/24/18 11:52 PM: - +1 for

[jira] [Comment Edited] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592320#comment-16592320 ] Dongjoon Hyun edited comment on SPARK-25206 at 8/24/18 11:48 PM: - +1 for

[jira] [Comment Edited] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592320#comment-16592320 ] Dongjoon Hyun edited comment on SPARK-25206 at 8/24/18 11:45 PM: - +1 for

[jira] [Commented] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592320#comment-16592320 ] Dongjoon Hyun commented on SPARK-25206: --- +1 for fixing this. However, actually, it isn't an issue

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25206: -- Affects Version/s: 2.2.2 > Wrong data may be returned when enable pushdown for Parquet >

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown for Parquet

2018-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25206: -- Summary: Wrong data may be returned when enable pushdown for Parquet (was: Wrong data may be

[jira] [Resolved] (SPARK-25124) VectorSizeHint.size is buggy, breaking streaming pipeline

2018-08-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-25124. --- Resolution: Fixed Fix Version/s: 2.3.2 Issue resolved by pull request 8

[jira] [Resolved] (SPARK-25106) A new Kafka consumer gets created for every batch

2018-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-25106. -- Resolution: Duplicate Thanks for reporting this. I'm closing this as a duplicate of

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking File Descriptors

2018-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-24987: - Affects Version/s: 2.2.2 2.3.0 > Kafka Cached Consumer Leaking File

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking File Descriptors

2018-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-24987: - Affects Version/s: (was: 2.3.0) > Kafka Cached Consumer Leaking File Descriptors >

[jira] [Resolved] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-25234. --- Resolution: Fixed Fix Version/s: 2.3.2 2.4.0 Issue resolved by

[jira] [Commented] (SPARK-25230) Upper behavior incorrect for string contains "ß"

2018-08-24 Thread Nihar Sheth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592216#comment-16592216 ] Nihar Sheth commented on SPARK-25230: - This seems to be a JVM thing 
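The behavior reported in SPARK-25230 can be reproduced outside Spark. The sketch below is illustrative only: it uses Python rather than the JVM, and it assumes (as the comment above suggests, but the truncated snippet does not confirm) that Spark's `upper` simply delegates to the runtime's string uppercasing. Under Unicode full case mapping, "ß" (U+00DF) has no single-character uppercase form by default and expands to "SS"; Java's `String.toUpperCase` documents the same expansion. The helper name `upper_full` is hypothetical.

```python
# Unicode full case mapping: U+00DF (ß) expands to the two letters "SS"
# when uppercased, so the result is one character longer than the input.
def upper_full(text: str) -> str:
    """Uppercase text using the runtime's default (full) case mapping."""
    return text.upper()


original = "Haßler"
uppercased = upper_full(original)

# Print the result together with both lengths to make the expansion visible.
print(uppercased, len(original), len(uppercased))
```

Because the uppercased string grows by one character, any code that assumes `upper` preserves length would be affected as well.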

[jira] [Commented] (SPARK-25214) Kafka v2 source may return duplicated records when `failOnDataLoss` is `false`

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592213#comment-16592213 ] Apache Spark commented on SPARK-25214: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Created] (SPARK-25236) Investigate using a logging library inside of PySpark on the workers instead of print

2018-08-24 Thread holdenk (JIRA)
holdenk created SPARK-25236: --- Summary: Investigate using a logging library inside of PySpark on the workers instead of print Key: SPARK-25236 URL: https://issues.apache.org/jira/browse/SPARK-25236 Project:

[jira] [Updated] (SPARK-9636) Treat $SPARK_HOME as write-only

2018-08-24 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-9636: --- Labels: (was: easyfix) > Treat $SPARK_HOME as write-only > --- > >

[jira] [Resolved] (SPARK-19094) Plumb through logging/error messages from the JVM to Jupyter PySpark

2018-08-24 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-19094. - Resolution: Won't Fix No longer as important given other changes. > Plumb through logging/error

[jira] [Commented] (SPARK-25124) VectorSizeHint.size is buggy, breaking streaming pipeline

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592174#comment-16592174 ] Apache Spark commented on SPARK-25124: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Commented] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592171#comment-16592171 ] Alexander commented on SPARK-7768: -- Ah, looking at this SO post what I am asking about (i.e. accessing

[jira] [Resolved] (SPARK-25214) Kafka v2 source may return duplicated records when `failOnDataLoss` is `false`

2018-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-25214. -- Resolution: Fixed Fix Version/s: 2.4.0 > Kafka v2 source may return duplicated records

[jira] [Updated] (SPARK-25214) Kafka v2 source may return duplicated records when `failOnDataLoss` is `false`

2018-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-25214: - Affects Version/s: (was: 2.3.1) (was: 2.3.0)

[jira] [Assigned] (SPARK-25202) SQL Function Split Should Respect Limit Argument

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25202: Assignee: Apache Spark > SQL Function Split Should Respect Limit Argument >

[jira] [Commented] (SPARK-25202) SQL Function Split Should Respect Limit Argument

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592156#comment-16592156 ] Apache Spark commented on SPARK-25202: -- User 'phegstrom' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25202) SQL Function Split Should Respect Limit Argument

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25202: Assignee: (was: Apache Spark) > SQL Function Split Should Respect Limit Argument >

[jira] [Assigned] (SPARK-25174) ApplicationMaster suspends when unregistering itself from RM with extreme large diagnostic message

2018-08-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-25174: -- Assignee: Kent Yao > ApplicationMaster suspends when unregistering itself from RM

[jira] [Resolved] (SPARK-25174) ApplicationMaster suspends when unregistering itself from RM with extreme large diagnostic message

2018-08-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-25174. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 22180

[jira] [Created] (SPARK-25235) Merge the REPL code in Scala 2.11 and 2.12 branches

2018-08-24 Thread DB Tsai (JIRA)
DB Tsai created SPARK-25235: --- Summary: Merge the REPL code in Scala 2.11 and 2.12 branches Key: SPARK-25235 URL: https://issues.apache.org/jira/browse/SPARK-25235 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592128#comment-16592128 ] Alexander edited comment on SPARK-7768 at 8/24/18 8:22 PM: --- I've also noticed

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592128#comment-16592128 ] Alexander edited comment on SPARK-7768 at 8/24/18 8:21 PM: --- I've also noticed

[jira] [Commented] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592128#comment-16592128 ] Alexander commented on SPARK-7768: -- I've also noticed that there are some idiosyncrasies in the

[jira] [Commented] (SPARK-19335) Spark should support doing an efficient DataFrame Upsert via JDBC

2018-08-24 Thread kevin yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592126#comment-16592126 ] kevin yu commented on SPARK-19335: -- [~drew222]: I am still working on it, right now, I am waiting for

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592116#comment-16592116 ] Alexander edited comment on SPARK-7768 at 8/24/18 8:01 PM: --- [~pgrandjean], are

[jira] [Commented] (SPARK-7768) Make user-defined type (UDT) API public

2018-08-24 Thread Alexander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592116#comment-16592116 ] Alexander commented on SPARK-7768: -- [~pgrandjean], are you thinking of writing a library for that? :) >

[jira] [Commented] (SPARK-24391) to_json/from_json should support arrays of primitives, and more generally all JSON

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592083#comment-16592083 ] Apache Spark commented on SPARK-24391: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Commented] (SPARK-19335) Spark should support doing an efficient DataFrame Upsert via JDBC

2018-08-24 Thread drew zoellner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592003#comment-16592003 ] drew zoellner commented on SPARK-19335: --- + 1 , is this still in progress? > Spark should support

[jira] [Commented] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592004#comment-16592004 ] Apache Spark commented on SPARK-25234: -- User 'mengxr' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24090) Kubernetes Backend Hotlist for Spark 2.4

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24090: Assignee: Anirudh Ramanathan (was: Apache Spark) > Kubernetes Backend Hotlist for Spark

[jira] [Assigned] (SPARK-24090) Kubernetes Backend Hotlist for Spark 2.4

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24090: Assignee: Apache Spark (was: Anirudh Ramanathan) > Kubernetes Backend Hotlist for Spark

[jira] [Commented] (SPARK-24090) Kubernetes Backend Hotlist for Spark 2.4

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592001#comment-16592001 ] Apache Spark commented on SPARK-24090: -- User 'liyinan926' has created a pull request for this

[jira] [Assigned] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-25234: - Assignee: Xiangrui Meng > SparkR:::parallelize doesn't handle integer overflow

[jira] [Assigned] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25233: Assignee: (was: Apache Spark) > Give the user the option of specifying a fixed

[jira] [Commented] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591904#comment-16591904 ] Apache Spark commented on SPARK-25233: -- User 'rezasafi' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25233: Assignee: Apache Spark > Give the user the option of specifying a fixed minimum message

[jira] [Updated] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-25234: -- Description: parallelize uses integer multiplication, which cannot handle size over ~47000.
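The "~47000" figure in the SPARK-25234 description is consistent with a 32-bit signed integer limit. The sketch below is a rough check, not SparkR's actual code: the truncated description does not say which values are multiplied, so it assumes both factors are on the order of the collection size. It is written in Python for illustration, and the names `INT32_MAX`, `fits_in_int32`, and `product_overflows` are hypothetical.

```python
# Largest value a 32-bit signed integer can hold (R's integer type is 32-bit).
INT32_MAX = 2**31 - 1


def fits_in_int32(value: int) -> bool:
    """Return True if value is representable as a 32-bit signed integer."""
    return -INT32_MAX - 1 <= value <= INT32_MAX


def product_overflows(a: int, b: int) -> bool:
    """Return True if a * b would not fit in a 32-bit signed integer."""
    # Python integers are arbitrary precision, so the exact product can be
    # computed and then compared against the 32-bit range.
    return not fits_in_int32(a * b)


# Assumption: both factors are roughly the collection size.
for size in (46340, 46341, 47000):
    print(size, product_overflows(size, size))
```

Under that assumption the threshold sits just above 46340, which matches the approximate limit quoted in the description.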

[jira] [Created] (SPARK-25234) SparkR:::parallelize doesn't handle integer overflow properly

2018-08-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-25234: - Summary: SparkR:::parallelize doesn't handle integer overflow properly Key: SPARK-25234 URL: https://issues.apache.org/jira/browse/SPARK-25234 Project: Spark

[jira] [Created] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Reza Safi (JIRA)
Reza Safi created SPARK-25233: - Summary: Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure Key: SPARK-25233 URL:

[jira] [Commented] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-24 Thread Reza Safi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591889#comment-16591889 ] Reza Safi commented on SPARK-25233: --- I will send a PR shortly for this. > Give the user the option of

[jira] [Commented] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2018-08-24 Thread Furcy Pin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591868#comment-16591868 ] Furcy Pin commented on SPARK-10795: --- Hi, I came across this ticket with the same issue: my yarn job

[jira] [Assigned] (SPARK-25083) remove the type erasure hack in data source scan

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25083: Assignee: (was: Apache Spark) > remove the type erasure hack in data source scan >

[jira] [Assigned] (SPARK-25083) remove the type erasure hack in data source scan

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25083: Assignee: Apache Spark > remove the type erasure hack in data source scan >

[jira] [Commented] (SPARK-25083) remove the type erasure hack in data source scan

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591849#comment-16591849 ] Apache Spark commented on SPARK-25083: -- User 'xuanyuanking' has created a pull request for this

[jira] [Created] (SPARK-25232) Support Full-Text Search in Spark SQL

2018-08-24 Thread Lijie Xu (JIRA)
Lijie Xu created SPARK-25232: Summary: Support Full-Text Search in Spark SQL Key: SPARK-25232 URL: https://issues.apache.org/jira/browse/SPARK-25232 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-25106) A new Kafka consumer gets created for every batch

2018-08-24 Thread Alexis Seigneurin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591834#comment-16591834 ] Alexis Seigneurin commented on SPARK-25106: --- I just built the code from

[jira] [Assigned] (SPARK-25231) Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25231: Assignee: (was: Apache Spark) > Running a Large Job with Speculation On Causes

[jira] [Commented] (SPARK-25231) Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591818#comment-16591818 ] Apache Spark commented on SPARK-25231: -- User 'pgandhi999' has created a pull request for this

[jira] [Assigned] (SPARK-25231) Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver

2018-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25231: Assignee: Apache Spark > Running a Large Job with Speculation On Causes Executor

[jira] [Created] (SPARK-25231) Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver

2018-08-24 Thread Parth Gandhi (JIRA)
Parth Gandhi created SPARK-25231: Summary: Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver Key: SPARK-25231 URL: https://issues.apache.org/jira/browse/SPARK-25231

[jira] [Updated] (SPARK-25230) Upper behavior incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Updated] (SPARK-25230) Upper behavior incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Summary: Upper behavior incorrect for string contains "ß" (was: Upper behaves incorrect for

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Commented] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591756#comment-16591756 ] yucai commented on SPARK-25206: --- [~cloud_fan] , we need both [https://github.com/apache/spark/pull/21696] 

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-24-22-46-05-346.png > Wrong data may be returned when enable pushdown >

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-24-22-34-11-539.png > Wrong data may be returned when enable pushdown >

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: image-2018-08-24-22-33-03-231.png > Wrong data may be returned when enable pushdown >

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Description: How to reproduce: {code:sql} spark-sql> SELECT upper('Haßler'); HASSLER {code}

[jira] [Updated] (SPARK-25206) Wrong data may be returned when enable pushdown

2018-08-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-25206: -- Attachment: pr22183.png > Wrong data may be returned when enable pushdown >

[jira] [Updated] (SPARK-25230) Upper behaves incorrect for string contains "ß"

2018-08-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25230: Attachment: Oracle.png > Upper behaves incorrect for string contains "ß" >
