[jira] [Commented] (SPARK-21945) pyspark --py-files doesn't work in yarn client mode

2018-05-17 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480199#comment-16480199 ] Dongjoon Hyun commented on SPARK-21945: --- Thank you for verifying and including this in 2.3.1 RC2,

[jira] [Commented] (SPARK-16317) Add file filtering interface for FileFormat

2018-05-17 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480127#comment-16480127 ] Xiao Li commented on SPARK-16317: - We will not improve FileFormat since we are migrating the

[jira] [Resolved] (SPARK-16317) Add file filtering interface for FileFormat

2018-05-17 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-16317. - Resolution: Won't Fix > Add file filtering interface for FileFormat >

[jira] [Updated] (SPARK-20758) Add Constant propagation optimization

2018-05-17 Thread Jinhua Fu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinhua Fu updated SPARK-20758: -- Issue Type: New JIRA Project (was: Improvement) > Add Constant propagation optimization >

[jira] [Updated] (SPARK-22371) dag-scheduler-event-loop thread stopped with error Attempted to access garbage collected accumulator 5605982

2018-05-17 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-22371: Fix Version/s: 2.3.1 > dag-scheduler-event-loop thread stopped with error Attempted to access >

[jira] [Resolved] (SPARK-22884) ML test for StructuredStreaming: spark.ml.clustering

2018-05-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-22884. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21358

[jira] [Assigned] (SPARK-22884) ML test for StructuredStreaming: spark.ml.clustering

2018-05-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-22884: - Assignee: Sandor Murakozi > ML test for StructuredStreaming:

[jira] [Commented] (SPARK-22884) ML test for StructuredStreaming: spark.ml.clustering

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479814#comment-16479814 ] Apache Spark commented on SPARK-22884: -- User 'jkbradley' has created a pull request for this issue:

[jira] [Updated] (SPARK-24309) AsyncEventQueue should handle an interrupt from a Listener

2018-05-17 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-24309: - Target Version/s: 2.3.1 > AsyncEventQueue should handle an interrupt from a Listener >

[jira] [Updated] (SPARK-24309) AsyncEventQueue should handle an interrupt from a Listener

2018-05-17 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-24309: - Priority: Blocker (was: Major) > AsyncEventQueue should handle an interrupt from a Listener >

[jira] [Updated] (SPARK-21945) pyspark --py-files doesn't work in yarn client mode

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-21945: --- Fix Version/s: 2.3.1 > pyspark --py-files doesn't work in yarn client mode >

[jira] [Commented] (SPARK-21945) pyspark --py-files doesn't work in yarn client mode

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479765#comment-16479765 ] Marcelo Vanzin commented on SPARK-21945: This works for cluster mode and pyspark (shell), but

[jira] [Assigned] (SPARK-24311) Refactor HDFSBackedStateStoreProvider to remove duplicated logic between operations on delta file and snapshot file

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24311: Assignee: Apache Spark > Refactor HDFSBackedStateStoreProvider to remove duplicated logic

[jira] [Commented] (SPARK-24311) Refactor HDFSBackedStateStoreProvider to remove duplicated logic between operations on delta file and snapshot file

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479728#comment-16479728 ] Apache Spark commented on SPARK-24311: -- User 'HeartSaVioR' has created a pull request for this

[jira] [Updated] (SPARK-24311) Refactor HDFSBackedStateStoreProvider to remove duplicated logic between operations on delta file and snapshot file

2018-05-17 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-24311: - Description: The structure of delta file and snapshot file is same, but the operations are

[jira] [Assigned] (SPARK-24311) Refactor HDFSBackedStateStoreProvider to remove duplicated logic between operations on delta file and snapshot file

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24311: Assignee: (was: Apache Spark) > Refactor HDFSBackedStateStoreProvider to remove

[jira] [Created] (SPARK-24311) Refactor HDFSBackedStateStoreProvide to remove duplicated logic between operations on delta file and operations on snapshot file

2018-05-17 Thread Jungtaek Lim (JIRA)
Jungtaek Lim created SPARK-24311: Summary: Refactor HDFSBackedStateStoreProvide to remove duplicated logic between operations on delta file and operations on snapshot file Key: SPARK-24311 URL:

[jira] [Updated] (SPARK-24311) Refactor HDFSBackedStateStoreProvider to remove duplicated logic between operations on delta file and snapshot file

2018-05-17 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-24311: - Summary: Refactor HDFSBackedStateStoreProvider to remove duplicated logic between operations on

[jira] [Updated] (SPARK-24311) Refactor HDFSBackedStateStoreProvide to remove duplicated logic between operations on delta file and snapshot file

2018-05-17 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-24311: - Summary: Refactor HDFSBackedStateStoreProvide to remove duplicated logic between operations on

[jira] [Assigned] (SPARK-24309) AsyncEventQueue should handle an interrupt from a Listener

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24309: Assignee: (was: Apache Spark) > AsyncEventQueue should handle an interrupt from a

[jira] [Assigned] (SPARK-24309) AsyncEventQueue should handle an interrupt from a Listener

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24309: Assignee: Apache Spark > AsyncEventQueue should handle an interrupt from a Listener >

[jira] [Commented] (SPARK-24309) AsyncEventQueue should handle an interrupt from a Listener

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479688#comment-16479688 ] Apache Spark commented on SPARK-24309: -- User 'squito' has created a pull request for this issue:

[jira] [Commented] (SPARK-24114) improve instrumentation for spark.ml.recommendation

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479686#comment-16479686 ] Apache Spark commented on SPARK-24114: -- User 'MrBago' has created a pull request for this issue:

[jira] [Issue Comment Deleted] (SPARK-24114) improve instrumentation for spark.ml.recommendation

2018-05-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-24114: -- Comment: was deleted (was: User 'MrBago' has created a pull request for this issue:

[jira] [Resolved] (SPARK-24310) Instrumentation for frequent pattern mining

2018-05-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-24310. --- Resolution: Fixed Fix Version/s: 2.4.0 > Instrumentation for frequent pattern

[jira] [Commented] (SPARK-24310) Instrumentation for frequent pattern mining

2018-05-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479684#comment-16479684 ] Joseph K. Bradley commented on SPARK-24310: --- The PR for this was linked to the wrong JIRA, but

[jira] [Created] (SPARK-24310) Instrumentation for frequent pattern mining

2018-05-17 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-24310: - Summary: Instrumentation for frequent pattern mining Key: SPARK-24310 URL: https://issues.apache.org/jira/browse/SPARK-24310 Project: Spark Issue

[jira] [Assigned] (SPARK-24114) improve instrumentation for spark.ml.recommendation

2018-05-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-24114: - Assignee: (was: Bago Amirbekian) > improve instrumentation for

[jira] [Updated] (SPARK-24114) improve instrumentation for spark.ml.recommendation

2018-05-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-24114: -- Shepherd: (was: Joseph K. Bradley) > improve instrumentation for

[jira] [Created] (SPARK-24309) AsyncEventQueue should handle an interrupt from a Listener

2018-05-17 Thread Imran Rashid (JIRA)
Imran Rashid created SPARK-24309: Summary: AsyncEventQueue should handle an interrupt from a Listener Key: SPARK-24309 URL: https://issues.apache.org/jira/browse/SPARK-24309 Project: Spark

[jira] [Updated] (SPARK-24114) improve instrumentation for spark.ml.recommendation

2018-05-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-24114: -- Shepherd: Joseph K. Bradley > improve instrumentation for spark.ml.recommendation >

[jira] [Assigned] (SPARK-24114) improve instrumentation for spark.ml.recommendation

2018-05-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-24114: - Assignee: Bago Amirbekian > improve instrumentation for spark.ml.recommendation

[jira] [Commented] (SPARK-6238) Support shuffle where individual blocks might be > 2G

2018-05-17 Thread William Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479645#comment-16479645 ] William Shen commented on SPARK-6238: - [~irashid], you marked this as a duplicate. Can you also mark

[jira] [Commented] (SPARK-21945) pyspark --py-files doesn't work in yarn client mode

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479610#comment-16479610 ] Marcelo Vanzin commented on SPARK-21945: Let me run some tests and if they're ok, I'll cherry

[jira] [Commented] (SPARK-21945) pyspark --py-files doesn't work in yarn client mode

2018-05-17 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479574#comment-16479574 ] Dongjoon Hyun commented on SPARK-21945: --- Hi, [~vanzin], [~tgraves], [~hyukjin.kwon]. I'm wondering

[jira] [Resolved] (SPARK-23195) Hint of cached data is lost

2018-05-17 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23195. - Resolution: Won't Fix > Hint of cached data is lost > --- > >

[jira] [Updated] (SPARK-23571) Delete auxiliary Kubernetes resources upon application completion

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-23571: --- Target Version/s: (was: 2.3.1) > Delete auxiliary Kubernetes resources upon application

[jira] [Commented] (SPARK-23571) Delete auxiliary Kubernetes resources upon application completion

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479541#comment-16479541 ] Marcelo Vanzin commented on SPARK-23571: I removed the target version since the PR seems stalled

[jira] [Commented] (SPARK-23486) LookupFunctions should not check the same function name more than once

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479539#comment-16479539 ] Marcelo Vanzin commented on SPARK-23486: I removed the target versions since the PR seems stalled

[jira] [Updated] (SPARK-23486) LookupFunctions should not check the same function name more than once

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-23486: --- Target Version/s: (was: 2.3.1, 2.4.0) > LookupFunctions should not check the same function

[jira] [Commented] (SPARK-23195) Hint of cached data is lost

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479535#comment-16479535 ] Marcelo Vanzin commented on SPARK-23195: [~smilegator] doesn't look like you're actively working

[jira] [Updated] (SPARK-23195) Hint of cached data is lost

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-23195: --- Target Version/s: (was: 2.3.1) > Hint of cached data is lost > ---

[jira] [Assigned] (SPARK-23195) Hint of cached data is lost

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-23195: -- Assignee: Marcelo Vanzin (was: Xiao Li) > Hint of cached data is lost >

[jira] [Assigned] (SPARK-23195) Hint of cached data is lost

2018-05-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-23195: -- Assignee: (was: Marcelo Vanzin) > Hint of cached data is lost >

[jira] [Resolved] (SPARK-24115) improve instrumentation for spark.ml.tuning

2018-05-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-24115. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21340

[jira] [Assigned] (SPARK-24115) improve instrumentation for spark.ml.tuning

2018-05-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-24115: - Assignee: Bago Amirbekian > improve instrumentation for spark.ml.tuning >

[jira] [Assigned] (SPARK-24308) Handle DataReaderFactory to InputPartition renames in left over classes

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24308: Assignee: (was: Apache Spark) > Handle DataReaderFactory to InputPartition renames in

[jira] [Assigned] (SPARK-24308) Handle DataReaderFactory to InputPartition renames in left over classes

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24308: Assignee: Apache Spark > Handle DataReaderFactory to InputPartition renames in left over

[jira] [Commented] (SPARK-24308) Handle DataReaderFactory to InputPartition renames in left over classes

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479485#comment-16479485 ] Apache Spark commented on SPARK-24308: -- User 'arunmahadevan' has created a pull request for this

[jira] [Created] (SPARK-24308) Handle DataReaderFactory to InputPartition renames in left over classes

2018-05-17 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created SPARK-24308: -- Summary: Handle DataReaderFactory to InputPartition renames in left over classes Key: SPARK-24308 URL: https://issues.apache.org/jira/browse/SPARK-24308 Project:

[jira] [Commented] (SPARK-24307) Support sending messages over 2GB from memory

2018-05-17 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479439#comment-16479439 ] Imran Rashid commented on SPARK-24307: -- I have a really hacky version of this now, I plan to clean

[jira] [Commented] (SPARK-24283) Make standard scaler work without legacy MLlib

2018-05-17 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479405#comment-16479405 ] Huaxin Gao commented on SPARK-24283: [~holdenk] Hi Holden, all I need to do is to change {code:java}

[jira] [Created] (SPARK-24307) Support sending messages over 2GB from memory

2018-05-17 Thread Imran Rashid (JIRA)
Imran Rashid created SPARK-24307: Summary: Support sending messages over 2GB from memory Key: SPARK-24307 URL: https://issues.apache.org/jira/browse/SPARK-24307 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24285) Flaky test: ContinuousSuite.query without test harness

2018-05-17 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24285: -- Description: -

[jira] [Commented] (SPARK-24304) Scheduler changes for continuous processing shuffle support

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479133#comment-16479133 ] Apache Spark commented on SPARK-24304: -- User 'xuanyuanking' has created a pull request for this

[jira] [Assigned] (SPARK-24304) Scheduler changes for continuous processing shuffle support

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24304: Assignee: (was: Apache Spark) > Scheduler changes for continuous processing shuffle

[jira] [Assigned] (SPARK-24304) Scheduler changes for continuous processing shuffle support

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24304: Assignee: Apache Spark > Scheduler changes for continuous processing shuffle support >

[jira] [Assigned] (SPARK-24193) Sort by disk when number of limit is big in TakeOrderedAndProjectExec

2018-05-17 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24193: --- Assignee: jin xing > Sort by disk when number of limit is big in TakeOrderedAndProjectExec

[jira] [Resolved] (SPARK-24193) Sort by disk when number of limit is big in TakeOrderedAndProjectExec

2018-05-17 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24193. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21252

[jira] [Updated] (SPARK-24002) Task not serializable caused by org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytes

2018-05-17 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24002: Fix Version/s: 2.3.1 > Task not serializable caused by >

[jira] [Commented] (SPARK-24036) Stateful operators in continuous processing

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479122#comment-16479122 ] Apache Spark commented on SPARK-24036: -- User 'xuanyuanking' has created a pull request for this

[jira] [Assigned] (SPARK-24036) Stateful operators in continuous processing

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24036: Assignee: (was: Apache Spark) > Stateful operators in continuous processing >

[jira] [Assigned] (SPARK-24036) Stateful operators in continuous processing

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24036: Assignee: Apache Spark > Stateful operators in continuous processing >

[jira] [Updated] (SPARK-24306) Sort a Dataset with a lambda (like RDD.sortBy)

2018-05-17 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-24306: - Summary: Sort a Dataset with a lambda (like RDD.sortBy) (was: Sort a Dataset with a lambda (like RDD.sortBy()

[jira] [Updated] (SPARK-24306) Sort a Dataset with a lambda (like RDD.sortBy() )

2018-05-17 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-24306: - Summary: Sort a Dataset with a lambda (like RDD.sortBy() ) (was: Sort a Dataset with a lambda (like

[jira] [Assigned] (SPARK-23922) High-order function: arrays_overlap(x, y) → boolean

2018-05-17 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-23922: --- Assignee: Marco Gaido > High-order function: arrays_overlap(x, y) → boolean >

[jira] [Resolved] (SPARK-23922) High-order function: arrays_overlap(x, y) → boolean

2018-05-17 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23922. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21028

[jira] [Created] (SPARK-24306) Sort a Dataset with a lambda (like RDD.sortBy()

2018-05-17 Thread Al M (JIRA)
Al M created SPARK-24306: Summary: Sort a Dataset with a lambda (like RDD.sortBy() Key: SPARK-24306 URL: https://issues.apache.org/jira/browse/SPARK-24306 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-24305) Avoid serialization of private fields in new collection expressions

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478946#comment-16478946 ] Apache Spark commented on SPARK-24305: -- User 'mn-mikke' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24305) Avoid serialization of private fields in new collection expressions

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24305: Assignee: (was: Apache Spark) > Avoid serialization of private fields in new

[jira] [Assigned] (SPARK-24305) Avoid serialization of private fields in new collection expressions

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24305: Assignee: Apache Spark > Avoid serialization of private fields in new collection

[jira] [Created] (SPARK-24305) Avoid serialization of private fields in new collection expressions

2018-05-17 Thread Marek Novotny (JIRA)
Marek Novotny created SPARK-24305: - Summary: Avoid serialization of private fields in new collection expressions Key: SPARK-24305 URL: https://issues.apache.org/jira/browse/SPARK-24305 Project: Spark

[jira] [Assigned] (SPARK-22371) dag-scheduler-event-loop thread stopped with error Attempted to access garbage collected accumulator 5605982

2018-05-17 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-22371: --- Assignee: Artem Rudoy > dag-scheduler-event-loop thread stopped with error Attempted to

[jira] [Resolved] (SPARK-22371) dag-scheduler-event-loop thread stopped with error Attempted to access garbage collected accumulator 5605982

2018-05-17 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-22371. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21114

[jira] [Commented] (SPARK-19228) inferSchema function processed csv date column as string and "dateFormat" DataSource option is ignored

2018-05-17 Thread Sergey Rubtsov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478811#comment-16478811 ] Sergey Rubtsov commented on SPARK-19228: Java 8 contains new java.time module, also it can fix an

[jira] [Created] (SPARK-24304) Scheduler changes for continuous processing shuffle support

2018-05-17 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-24304: --- Summary: Scheduler changes for continuous processing shuffle support Key: SPARK-24304 URL: https://issues.apache.org/jira/browse/SPARK-24304 Project: Spark

[jira] [Commented] (SPARK-24298) PCAModel Memory in Pipeline

2018-05-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478802#comment-16478802 ] Marco Gaido commented on SPARK-24298: - May you please provide a small program/simple list of steps to

[jira] [Commented] (SPARK-24288) Enable preventing predicate pushdown

2018-05-17 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478671#comment-16478671 ] Wenchen Fan commented on SPARK-24288: - You can take a look at `ResolveReferences#dedupRight`, which

[jira] [Commented] (SPARK-23928) High-order function: shuffle(x) → array

2018-05-17 Thread H Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478619#comment-16478619 ] H Lu commented on SPARK-23928: -- Update: running unit tests and having a spark contributor in the team to

[jira] [Updated] (SPARK-24065) Issue with the property IgnoreLeadingWhiteSpace

2018-05-17 Thread Varsha Chandrashekar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varsha Chandrashekar updated SPARK-24065: - Component/s: Spark Shell > Issue with the property IgnoreLeadingWhiteSpace >

[jira] [Commented] (SPARK-24002) Task not serializable caused by org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytes

2018-05-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478596#comment-16478596 ] Apache Spark commented on SPARK-24002: -- User 'gatorsmile' has created a pull request for this issue: