[jira] [Commented] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361933#comment-16361933 ] Apache Spark commented on SPARK-23404: -- User '10110346' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23404: Assignee: (was: Apache Spark) > When the underlying buffers are already direct, we

[jira] [Assigned] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23404: Assignee: Apache Spark > When the underlying buffers are already direct, we should copy

[jira] [Updated] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread liuxian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxian updated SPARK-23404: Description: If the memory mode is _ON_HEAP_, when the underlying buffers are direct, we should copy them

[jira] [Updated] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread liuxian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxian updated SPARK-23404: Summary: When the underlying buffers are already direct, we should copy them to the heap memory (was:

[jira] [Created] (SPARK-23404) When the underlying buffers are already direct, we should copy it to the heap memory

2018-02-12 Thread liuxian (JIRA)
liuxian created SPARK-23404: --- Summary: When the underlying buffers are already direct, we should copy it to the heap memory Key: SPARK-23404 URL: https://issues.apache.org/jira/browse/SPARK-23404 Project:

[jira] [Updated] (SPARK-23403) java.lang.ArrayIndexOutOfBoundsException: 10

2018-02-12 Thread Naresh Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh Kumar updated SPARK-23403: - Docs Text: val

[jira] [Created] (SPARK-23403) java.lang.ArrayIndexOutOfBoundsException: 10

2018-02-12 Thread Naresh Kumar (JIRA)
Naresh Kumar created SPARK-23403: Summary: java.lang.ArrayIndexOutOfBoundsException: 10 Key: SPARK-23403 URL: https://issues.apache.org/jira/browse/SPARK-23403 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-23340) Empty float/double array columns in ORC file should not raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Description: This issue updates Apache ORC dependencies to 1.4.3 released on February 9th.

[jira] [Updated] (SPARK-23340) Empty float/double array columns in ORC file should not raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Description: This issue updates Apache ORC dependencies to 1.4.3 released on February 9th.

[jira] [Updated] (SPARK-23340) Empty float/double array columns in ORC file should not raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Summary: Empty float/double array columns in ORC file should not raise EOFException (was:

[jira] [Updated] (SPARK-23340) Empty float/double array columns in ORC file raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Summary: Empty float/double array columns in ORC file raise EOFException (was: Empty

[jira] [Updated] (SPARK-23340) Empty float/double array columns raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Description: This issue updates Apache ORC dependencies to 1.4.3 released on February 9th.

[jira] [Updated] (SPARK-23340) Empty float/double array columns raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Priority: Critical (was: Major) > Empty float/double array columns raise EOFException >

[jira] [Updated] (SPARK-23340) Empty float/double array columns raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Summary: Empty float/double array columns raise EOFException (was: Update ORC to 1.4.3) >

[jira] [Updated] (SPARK-23340) Empty float/double array columns raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Component/s: SQL > Empty float/double array columns raise EOFException >

[jira] [Updated] (SPARK-23340) Update ORC to 1.4.3

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Description: This issue updates Apache ORC dependencies to 1.4.3 released on February 9th.

[jira] [Updated] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallapothu Jyothi Swaroop updated SPARK-23402: -- Description: I am using spark dataset write to insert data on

[jira] [Updated] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallapothu Jyothi Swaroop updated SPARK-23402: -- Description: I am using spark dataset write to insert data on

[jira] [Updated] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallapothu Jyothi Swaroop updated SPARK-23402: -- Attachment: Emsku[1].jpg > Dataset write method not working as

[jira] [Created] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)
Pallapothu Jyothi Swaroop created SPARK-23402: - Summary: Dataset write method not working as expected for postgresql database Key: SPARK-23402 URL: https://issues.apache.org/jira/browse/SPARK-23402

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Vivek Patangiwar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361851#comment-16361851 ] Vivek Patangiwar commented on SPARK-23397: -- Thanks for your response Sean. An example to

[jira] [Commented] (SPARK-20090) Add StructType.fieldNames to Python API

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361838#comment-16361838 ] Apache Spark commented on SPARK-20090: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Updated] (SPARK-20090) Add StructType.fieldNames to Python API

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20090: Target Version/s: 2.3.0 > Add StructType.fieldNames to Python API >

[jira] [Resolved] (SPARK-23303) improve the explain result for data source v2 relations

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23303. - Resolution: Fixed Fix Version/s: 2.4.0 > improve the explain result for data source v2 relations

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361830#comment-16361830 ] Apache Spark commented on SPARK-23377: -- User 'viirya' has created a pull request for this issue:

[jira] [Updated] (SPARK-23316) AnalysisException after max iteration reached for IN query

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23316: Target Version/s: 2.3.0 > AnalysisException after max iteration reached for IN query >

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361824#comment-16361824 ] Joseph K. Bradley commented on SPARK-23377: --- Thanks for reconsidering here [~viirya]! I can

[jira] [Resolved] (SPARK-23379) remove redundant metastore access if the current database name is the same

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23379. - Resolution: Fixed Assignee: Feng Liu Fix Version/s: 2.4.0 > remove redundant metastore

[jira] [Assigned] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23400: Assignee: Apache Spark (was: Xiao Li) > Add the extra constructors for ScalaUDF >

[jira] [Updated] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23400: Affects Version/s: (was: 2.2.1) (was: 2.1.2) > Add the extra constructors

[jira] [Assigned] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23400: Assignee: Xiao Li (was: Apache Spark) > Add the extra constructors for ScalaUDF >

[jira] [Updated] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23400: Summary: Add the extra constructors for ScalaUDF (was: Add two extra constructors for ScalaUDF) > Add

[jira] [Updated] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23400: Description: The last few releases, we changed the interface of ScalaUDF. Unfortunately, some Spark

[jira] [Commented] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361819#comment-16361819 ] Apache Spark commented on SPARK-23400: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Resolved] (SPARK-23323) DataSourceV2 should use the output commit coordinator.

2018-02-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23323. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20490

[jira] [Assigned] (SPARK-23323) DataSourceV2 should use the output commit coordinator.

2018-02-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-23323: --- Assignee: Ryan Blue > DataSourceV2 should use the output commit coordinator. >

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361724#comment-16361724 ] Liang-Chi Hsieh commented on SPARK-23377: - For now, I think neither the 3rd option nor my current

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361718#comment-16361718 ] Liang-Chi Hsieh commented on SPARK-23377: - I have no objection to [~josephkb]'s proposal (first

[jira] [Commented] (SPARK-23230) When hive.default.fileformat is other kinds of file types, create textfile table cause a serde error

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361696#comment-16361696 ] Apache Spark commented on SPARK-23230: -- User 'cxzl25' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361154#comment-16361154 ] Joseph K. Bradley edited comment on SPARK-23377 at 2/13/18 1:10 AM:

[jira] [Updated] (SPARK-23352) Explicitly specify supported types in Pandas UDFs

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23352: Fix Version/s: 2.3.1 > Explicitly specify supported types in Pandas UDFs >

[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361654#comment-16361654 ] Apache Spark commented on SPARK-23154: -- User 'jkbradley' has created a pull request for this issue:

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2018-02-12 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361631#comment-16361631 ] Miao Wang commented on SPARK-20307: --- [~felixcheung] I will do it during the Lunar New Year vacation. I

[jira] [Resolved] (SPARK-23230) When hive.default.fileformat is other kinds of file types, create textfile table cause a serde error

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23230. - Resolution: Fixed Assignee: dzcxzl Fix Version/s: 2.3.0 > When hive.default.fileformat

[jira] [Updated] (SPARK-22820) Spark 2.3 SQL API audit

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-22820: Fix Version/s: 2.3.0 > Spark 2.3 SQL API audit > --- > > Key:

[jira] [Resolved] (SPARK-22820) Spark 2.3 SQL API audit

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-22820. - Resolution: Fixed > Spark 2.3 SQL API audit > --- > > Key:

[jira] [Resolved] (SPARK-23313) Add a migration guide for ORC

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23313. - Resolution: Fixed Fix Version/s: 2.3.0 > Add a migration guide for ORC >

[jira] [Assigned] (SPARK-23313) Add a migration guide for ORC

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-23313: --- Assignee: Dongjoon Hyun > Add a migration guide for ORC > - > >

[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361575#comment-16361575 ] Joseph K. Bradley commented on SPARK-23154: --- I'd prefer to put it in the subsection on saving &

[jira] [Resolved] (SPARK-23378) move setCurrentDatabase from HiveExternalCatalog to HiveClientImpl

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23378. - Resolution: Fixed Assignee: Feng Liu Fix Version/s: 2.4.0 > move setCurrentDatabase from

[jira] [Created] (SPARK-23401) Improve test cases for all supported types and unsupported types

2018-02-12 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-23401: Summary: Improve test cases for all supported types and unsupported types Key: SPARK-23401 URL: https://issues.apache.org/jira/browse/SPARK-23401 Project: Spark

[jira] [Commented] (SPARK-23390) Flaky Test Suite: FileBasedDataSourceSuite in Spark 2.3/hadoop 2.7

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361537#comment-16361537 ] Apache Spark commented on SPARK-23390: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Created] (SPARK-23400) Add two extra constructors for ScalaUDF

2018-02-12 Thread Xiao Li (JIRA)
Xiao Li created SPARK-23400: --- Summary: Add two extra constructors for ScalaUDF Key: SPARK-23400 URL: https://issues.apache.org/jira/browse/SPARK-23400 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23399: Assignee: (was: Apache Spark) > Register a task completion listner first for

[jira] [Assigned] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23399: Assignee: Apache Spark > Register a task completion listner first for

[jira] [Commented] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361508#comment-16361508 ] Apache Spark commented on SPARK-23399: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Updated] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23399: -- Description: This is related to SPARK-23390. Currently, there was an open file leak for

[jira] [Created] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-23399: - Summary: Register a task completion listner first for OrcColumnarBatchReader Key: SPARK-23399 URL: https://issues.apache.org/jira/browse/SPARK-23399 Project: Spark

[jira] [Assigned] (SPARK-23394) Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23394: Assignee: Apache Spark > Storage info's Cached Partitions doesn't consider the

[jira] [Assigned] (SPARK-23394) Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23394: Assignee: (was: Apache Spark) > Storage info's Cached Partitions doesn't consider the

[jira] [Commented] (SPARK-23394) Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361347#comment-16361347 ] Apache Spark commented on SPARK-23394: -- User 'attilapiros' has created a pull request for this

[jira] [Assigned] (SPARK-23388) Support for Parquet Binary DecimalType in VectorizedColumnReader

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-23388: --- Assignee: James Thompson > Support for Parquet Binary DecimalType in VectorizedColumnReader >

[jira] [Resolved] (SPARK-23388) Support for Parquet Binary DecimalType in VectorizedColumnReader

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23388. - Resolution: Fixed Fix Version/s: 2.3.0 > Support for Parquet Binary DecimalType in

[jira] [Comment Edited] (SPARK-23310) Perf regression introduced by SPARK-21113

2018-02-12 Thread Nicolas Poggi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361071#comment-16361071 ] Nicolas Poggi edited comment on SPARK-23310 at 2/12/18 6:35 PM: Q72 of

[jira] [Created] (SPARK-23398) DataSourceV2 should provide a way to get the source schema

2018-02-12 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-23398: - Summary: DataSourceV2 should provide a way to get the source schema Key: SPARK-23398 URL: https://issues.apache.org/jira/browse/SPARK-23398 Project: Spark Issue

[jira] [Updated] (SPARK-23398) DataSourceV2 should provide a way to get a source's schema.

2018-02-12 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-23398: -- Summary: DataSourceV2 should provide a way to get a source's schema. (was: DataSourceV2 should

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361154#comment-16361154 ] Joseph K. Bradley commented on SPARK-23377: --- [~viirya]'s patch currently changes

[jira] [Updated] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-23377: -- Priority: Critical (was: Major) > Bucketizer with multiple columns persistence bug >

[jira] [Commented] (SPARK-23310) Perf regression introduced by SPARK-21113

2018-02-12 Thread Nicolas Poggi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361071#comment-16361071 ] Nicolas Poggi commented on SPARK-23310: --- Q72 of TPC-DS is also affected around 30% at scale factor

[jira] [Resolved] (SPARK-23390) Flaky Test Suite: FileBasedDataSourceSuite in Spark 2.3/hadoop 2.7

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23390. - Resolution: Fixed Assignee: Wenchen Fan Fix Version/s: 2.3.0 > Flaky Test Suite:

[jira] [Commented] (SPARK-20327) Add CLI support for YARN custom resources, like GPUs

2018-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360944#comment-16360944 ] Marcelo Vanzin commented on SPARK-20327: bq. I think the point is, without reflection, using 3.x+

[jira] [Commented] (SPARK-20327) Add CLI support for YARN custom resources, like GPUs

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360933#comment-16360933 ] Sean Owen commented on SPARK-20327: --- I think the point is, without reflection, using 3.x+ APIs with 2.x

[jira] [Comment Edited] (SPARK-20327) Add CLI support for YARN custom resources, like GPUs

2018-02-12 Thread Szilard Nemeth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360904#comment-16360904 ] Szilard Nemeth edited comment on SPARK-20327 at 2/12/18 3:43 PM: - Hey

[jira] [Commented] (SPARK-20327) Add CLI support for YARN custom resources, like GPUs

2018-02-12 Thread Szilard Nemeth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360904#comment-16360904 ] Szilard Nemeth commented on SPARK-20327: Hey [~vanzin]! I see what you said about compatibility.

[jira] [Resolved] (SPARK-23391) It may lead to overflow for some integer multiplication

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23391. --- Resolution: Fixed Fix Version/s: 2.3.0 2.2.2 Issue resolved by pull

[jira] [Assigned] (SPARK-23391) It may lead to overflow for some integer multiplication

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-23391: - Assignee: liuxian > It may lead to overflow for some integer multiplication >

[jira] [Commented] (SPARK-23308) ignoreCorruptFiles should not ignore retryable IOException

2018-02-12 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360809#comment-16360809 ] Steve Loughran commented on SPARK-23308: BTW bq I should get at least ~82k partitions, thus the

[jira] [Comment Edited] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Shahbaz Hussain (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360802#comment-16360802 ] Shahbaz Hussain edited comment on SPARK-23397 at 2/12/18 2:29 PM: -- can

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Shahbaz Hussain (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360802#comment-16360802 ] Shahbaz Hussain commented on SPARK-23397: - Can we make job creation happen only once and

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360798#comment-16360798 ] Sean Owen commented on SPARK-23397: --- That sounds correct. The next batch executes as soon as possible.

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Shahbaz Hussain (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360793#comment-16360793 ] Shahbaz Hussain commented on SPARK-23397: - Yes, if current Batch Processing time is greater than

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: if the event log is big, the historyServer web will be out of memory. My

[jira] [Updated] (SPARK-23392) Add some test case for images feature

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23392: -- Priority: Trivial (was: Major) > Add some test case for images feature >

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: if the event log is big, the historyServer web will be out of memory. My

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: eventlog.png > Spark HistoryServer will OMM if the event log is big >

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: if the event log is big, the historyServer web will be out of memory

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: If the event log is big, the historyServer web will be out of memory

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: (was: eventlog.png) > Spark HistoryServer will OMM if the event log is big >

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360789#comment-16360789 ] Sean Owen commented on SPARK-23397: --- This is how it's supposed to work. Batches don't overlap. If one

[jira] [Resolved] (SPARK-23343) Increase the exception test for the bind port

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23343. --- Resolution: Won't Fix > Increase the exception test for the bind port >

[jira] [Created] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Shahbaz Hussain (JIRA)
Shahbaz Hussain created SPARK-23397: --- Summary: Scheduling delay causes Spark Streaming to miss batches. Key: SPARK-23397 URL: https://issues.apache.org/jira/browse/SPARK-23397 Project: Spark

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: eventlog.png > Spark HistoryServer will OMM if the event log is big >

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: (was: historyServer.png) > Spark HistoryServer will OMM if the event log is

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: If the event log is big, the historyServer web will be out of memory

[jira] [Commented] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360784#comment-16360784 ] Sean Owen commented on SPARK-23396: --- This is far too vague. It seems to overlap with recent

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: historyServer.png > Spark HistoryServer will OMM if the event log is big >

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: historyServer.png > Spark HistoryServer will OMM if the event log is big >

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: (was: historyServer.png) > Spark HistoryServer will OMM if the event log is

[jira] [Created] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
KaiXinXIaoLei created SPARK-23396: - Summary: Spark HistoryServer will OMM if the event log is big Key: SPARK-23396 URL: https://issues.apache.org/jira/browse/SPARK-23396 Project: Spark Issue
