[jira] [Commented] (SPARK-2048) Optimizations to CPU usage of external spilling code

2014-09-07 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125247#comment-14125247 ] Matei Zaharia commented on SPARK-2048: -- Yeah, sounds good, thanks for pointing that o

[jira] [Resolved] (SPARK-2048) Optimizations to CPU usage of external spilling code

2014-09-07 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2048. -- Resolution: Fixed > Optimizations to CPU usage of external spilling code > -

[jira] [Assigned] (SPARK-2048) Optimizations to CPU usage of external spilling code

2014-09-07 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia reassigned SPARK-2048: Assignee: Matei Zaharia > Optimizations to CPU usage of external spilling code > --

[jira] [Resolved] (SPARK-3394) TakeOrdered crashes when limit is 0

2014-09-07 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3394. -- Resolution: Fixed Fix Version/s: 1.0.3 1.2.0 1.1.1

[jira] [Updated] (SPARK-3394) TakeOrdered crashes when limit is 0

2014-09-07 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3394: - Component/s: Spark Core > TakeOrdered crashes when limit is 0 > --

[jira] [Resolved] (SPARK-3353) Stage id monotonicity (parent stage should have lower stage id)

2014-09-06 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3353. -- Resolution: Fixed Fix Version/s: 1.2.0 > Stage id monotonicity (parent stage should have

[jira] [Resolved] (SPARK-3211) .take() is OOM-prone when there are empty partitions

2014-09-05 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3211. -- Resolution: Fixed Fix Version/s: 1.2.0 1.1.1 > .take() is OOM-prone wh

[jira] [Updated] (SPARK-3211) .take() is OOM-prone when there are empty partitions

2014-09-05 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3211: - Assignee: Andrew Ash > .take() is OOM-prone when there are empty partitions >

[jira] [Updated] (SPARK-3211) .take() is OOM-prone when there are empty partitions

2014-09-05 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3211: - Target Version/s: 1.1.1, 1.2.0 > .take() is OOM-prone when there are empty partitions > --

[jira] [Commented] (SPARK-640) Update Hadoop 1 version to 1.1.0 (especially on AMIs)

2014-09-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121883#comment-14121883 ] Matei Zaharia commented on SPARK-640: - [~pwendell] what is our Hadoop 1 version on AMIs

[jira] [Commented] (SPARK-3215) Add remote interface for SparkContext

2014-09-03 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120949#comment-14120949 ] Matei Zaharia commented on SPARK-3215: -- With SparkConf, the problem is that some of t

[jira] [Commented] (SPARK-3215) Add remote interface for SparkContext

2014-09-03 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120657#comment-14120657 ] Matei Zaharia commented on SPARK-3215: -- Thanks Marcelo! Just a few notes on the API:

[jira] [Resolved] (SPARK-3098) In some cases, operation zipWithIndex get a wrong results

2014-09-02 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3098. -- Resolution: Won't Fix > In some cases, operation zipWithIndex get a wrong results > ---

[jira] [Created] (SPARK-3356) Document when RDD elements' ordering within partitions is nondeterministic

2014-09-02 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3356: Summary: Document when RDD elements' ordering within partitions is nondeterministic Key: SPARK-3356 URL: https://issues.apache.org/jira/browse/SPARK-3356 Project: Spa

[jira] [Commented] (SPARK-3098) In some cases, operation zipWithIndex get a wrong results

2014-09-02 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118918#comment-14118918 ] Matei Zaharia commented on SPARK-3098: -- Created SPARK-3356 to track this. > In some

[jira] [Commented] (SPARK-3098) In some cases, operation zipWithIndex get a wrong results

2014-09-02 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118913#comment-14118913 ] Matei Zaharia commented on SPARK-3098: -- Yup, let's maybe document this for now. I'll

[jira] [Resolved] (SPARK-3052) Misleading and spurious FileSystem closed errors whenever a job fails while reading from Hadoop

2014-09-02 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3052. -- Resolution: Fixed Fix Version/s: 1.2.0 > Misleading and spurious FileSystem closed errors

[jira] [Updated] (SPARK-3052) Misleading and spurious FileSystem closed errors whenever a job fails while reading from Hadoop

2014-09-02 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3052: - Assignee: Sandy Ryza > Misleading and spurious FileSystem closed errors whenever a job fails while

[jira] [Resolved] (SPARK-3342) m3 instances don't get local SSDs

2014-09-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3342. -- Resolution: Fixed Fix Version/s: 1.1.0 > m3 instances don't get local SSDs >

[jira] [Commented] (SPARK-3342) m3 instances don't get local SSDs

2014-09-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117900#comment-14117900 ] Matei Zaharia commented on SPARK-3342: -- In particular see http://docs.aws.amazon.com

[jira] [Created] (SPARK-3342) m3 instances don't get local SSDs

2014-09-01 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3342: Summary: m3 instances don't get local SSDs Key: SPARK-3342 URL: https://issues.apache.org/jira/browse/SPARK-3342 Project: Spark Issue Type: Bug Com

[jira] [Commented] (SPARK-3098) In some cases, operation zipWithIndex get a wrong results

2014-09-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117622#comment-14117622 ] Matei Zaharia commented on SPARK-3098: -- It's true that the ordering of values after a

[jira] [Commented] (SPARK-3333) Large number of partitions causes OOM

2014-08-31 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116942#comment-14116942 ] Matei Zaharia commented on SPARK-3333: -- I see, that makes sense. > Large number of p

[jira] [Commented] (SPARK-3333) Large number of partitions causes OOM

2014-08-31 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116923#comment-14116923 ] Matei Zaharia commented on SPARK-3333: -- The slowdown might be partly due to adding ex

[jira] [Updated] (SPARK-3010) fix redundant conditional

2014-08-31 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3010: - Priority: Trivial (was: Major) > fix redundant conditional > - > >

[jira] [Resolved] (SPARK-3010) fix redundant conditional

2014-08-31 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3010. -- Resolution: Fixed Fix Version/s: (was: 1.1.0) 1.2.0 Ta

[jira] [Updated] (SPARK-3010) fix redundant conditional

2014-08-31 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3010: - Assignee: wangfei > fix redundant conditional > - > > Key:

[jira] [Resolved] (SPARK-3318) The documentation for addFiles is wrong

2014-08-30 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3318. -- Resolution: Fixed Fix Version/s: 1.2.0 > The documentation for addFiles is wrong > -

[jira] [Updated] (SPARK-3318) The documentation for addFiles is wrong

2014-08-30 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3318: - Assignee: Holden Karau > The documentation for addFiles is wrong > --

[jira] [Resolved] (SPARK-2889) Spark creates Hadoop Configuration objects inconsistently

2014-08-30 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2889. -- Resolution: Fixed Fix Version/s: 1.2.0 > Spark creates Hadoop Configuration objects inco

[jira] [Updated] (SPARK-3257) Enable :cp to add JARs in spark-shell (Scala 2.11)

2014-08-29 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3257: - Assignee: Heather Miller > Enable :cp to add JARs in spark-shell (Scala 2.11) > -

[jira] [Commented] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-08-28 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114247#comment-14114247 ] Matei Zaharia commented on SPARK-3277: -- Thanks Mridul -- I think Andrew and Patrick h

[jira] [Commented] (SPARK-3215) Add remote interface for SparkContext

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113294#comment-14113294 ] Matei Zaharia commented on SPARK-3215: -- Okay, so my suggestion is do it separately fi

[jira] [Updated] (SPARK-3271) Delete unused methods in Utils

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3271: - Assignee: wangfei > Delete unused methods in Utils > -- > >

[jira] [Updated] (SPARK-3265) Allow using custom ipython executable with pyspark

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3265: - Affects Version/s: 1.1.0 > Allow using custom ipython executable with pyspark > -

[jira] [Resolved] (SPARK-3265) Allow using custom ipython executable with pyspark

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3265. -- Resolution: Fixed Fix Version/s: (was: 1.0.2) 1.2.0

[jira] [Resolved] (SPARK-3271) Delete unused methods in Utils

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3271. -- Resolution: Fixed > Delete unused methods in Utils > -- > >

[jira] [Created] (SPARK-3271) Delete unused methods in Utils

2014-08-27 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3271: Summary: Delete unused methods in Utils Key: SPARK-3271 URL: https://issues.apache.org/jira/browse/SPARK-3271 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-3215) Add remote interface for SparkContext

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113121#comment-14113121 ] Matei Zaharia commented on SPARK-3215: -- The problem is just how different future or a

[jira] [Commented] (SPARK-3215) Add remote interface for SparkContext

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113055#comment-14113055 ] Matei Zaharia commented on SPARK-3215: -- As I mentioned above, there's more to it than

[jira] [Commented] (SPARK-3215) Add remote interface for SparkContext

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112943#comment-14112943 ] Matei Zaharia commented on SPARK-3215: -- I think we should try this externally first a

[jira] [Commented] (SPARK-3215) Add remote interface for SparkContext

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112791#comment-14112791 ] Matei Zaharia commented on SPARK-3215: -- Hey Marcelo, while this could be useful for S

[jira] [Resolved] (SPARK-3256) Enable :cp to add JARs in spark-shell (Scala 2.10)

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3256. -- Resolution: Fixed Fix Version/s: 1.2.0 > Enable :cp to add JARs in spark-shell (Scala 2.

[jira] [Updated] (SPARK-3256) Enable :cp to add JARs in spark-shell (Scala 2.10)

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3256: - Assignee: Chip Senkbeil > Enable :cp to add JARs in spark-shell (Scala 2.10) > --

[jira] [Updated] (SPARK-3256) Enable :cp to add JARs in spark-shell (Scala 2.10)

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3256: - Fix Version/s: (was: 1.2.0) > Enable :cp to add JARs in spark-shell (Scala 2.10) > --

[jira] [Updated] (SPARK-3256) Enable :cp to add JARs in spark-shell (Scala 2.10)

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3256: - Summary: Enable :cp to add JARs in spark-shell (Scala 2.10) (was: Enable :cp to add JARs in spar

[jira] [Created] (SPARK-3257) Enable :cp to add JARs in spark-shell (Scala 2.11)

2014-08-27 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3257: Summary: Enable :cp to add JARs in spark-shell (Scala 2.11) Key: SPARK-3257 URL: https://issues.apache.org/jira/browse/SPARK-3257 Project: Spark Issue Type:

[jira] [Created] (SPARK-3256) Enable :cp to add JARs in spark-shell

2014-08-27 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3256: Summary: Enable :cp to add JARs in spark-shell Key: SPARK-3256 URL: https://issues.apache.org/jira/browse/SPARK-3256 Project: Spark Issue Type: New Feature

[jira] [Resolved] (SPARK-3239) Choose disks for spilling randomly

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3239. -- Resolution: Fixed Fix Version/s: 1.1.0 > Choose disks for spilling randomly > --

[jira] [Updated] (SPARK-3239) Choose disks for spilling randomly in PySpark

2014-08-27 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3239: - Summary: Choose disks for spilling randomly in PySpark (was: Choose disks for spilling randomly)

[jira] [Resolved] (SPARK-3240) Document workaround for MESOS-1688

2014-08-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3240. -- Resolution: Fixed > Document workaround for MESOS-1688 > -- > >

[jira] [Created] (SPARK-3240) Document workaround for MESOS-1688

2014-08-26 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3240: Summary: Document workaround for MESOS-1688 Key: SPARK-3240 URL: https://issues.apache.org/jira/browse/SPARK-3240 Project: Spark Issue Type: Documentation

[jira] [Updated] (SPARK-3240) Document workaround for MESOS-1688

2014-08-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3240: - Assignee: Martin Weindel > Document workaround for MESOS-1688 > -

[jira] [Resolved] (SPARK-3225) Typo in script

2014-08-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3225. -- Resolution: Fixed Fix Version/s: 1.2.0 > Typo in script > -- > >

[jira] [Updated] (SPARK-3225) Typo in script

2014-08-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3225: - Assignee: WangTaoTheTonic > Typo in script > -- > > Key: SPARK-3225 >

[jira] [Updated] (SPARK-3225) Typo in script

2014-08-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3225: - Priority: Trivial (was: Minor) > Typo in script > -- > > Key: SPARK-

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2014-08-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111631#comment-14111631 ] Matei Zaharia commented on SPARK-2926: -- I see, thanks for posting the benchmarks. Thi

[jira] [Resolved] (SPARK-3073) improve large sort (external sort) for PySpark

2014-08-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3073. -- Resolution: Fixed Fix Version/s: 1.2.0 > improve large sort (external sort) for PySpark

[jira] [Resolved] (SPARK-2976) Replace tabs with spaces

2014-08-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2976. -- Resolution: Fixed Fix Version/s: 1.2.0 > Replace tabs with spaces >

[jira] [Updated] (SPARK-2976) Replace tabs with spaces

2014-08-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2976: - Summary: Replace tabs with spaces (was: Too many ugly tabs instead of white spaces) > Replace t

[jira] [Updated] (SPARK-2976) Replace tabs with spaces

2014-08-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2976: - Assignee: Kousuke Saruta > Replace tabs with spaces > > >

[jira] [Commented] (SPARK-3098) In some cases, operation zipWithIndex get a wrong results

2014-08-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110183#comment-14110183 ] Matei Zaharia commented on SPARK-3098: -- Sorry, I don't understand -- what exactly is

[jira] [Updated] (SPARK-3084) Collect broadcasted tables in parallel in joins

2014-08-17 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3084: - Target Version/s: 1.1.0 > Collect broadcasted tables in parallel in joins > -

[jira] [Created] (SPARK-3091) Add support for caching metadata on Parquet files

2014-08-17 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3091: Summary: Add support for caching metadata on Parquet files Key: SPARK-3091 URL: https://issues.apache.org/jira/browse/SPARK-3091 Project: Spark Issue Type: N

[jira] [Updated] (SPARK-3085) Use compact data structures in SQL joins

2014-08-17 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3085: - Target Version/s: 1.1.0 > Use compact data structures in SQL joins >

[jira] [Updated] (SPARK-3085) Use compact data structures in SQL joins

2014-08-16 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3085: - Description: We can reuse the CompactBuffer from Spark Core. (was: We can reuse the CompactBuffe

[jira] [Created] (SPARK-3085) Use compact data structures in SQL joins

2014-08-16 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3085: Summary: Use compact data structures in SQL joins Key: SPARK-3085 URL: https://issues.apache.org/jira/browse/SPARK-3085 Project: Spark Issue Type: Improvemen

[jira] [Created] (SPARK-3084) Collect broadcasted tables in parallel in joins

2014-08-16 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3084: Summary: Collect broadcasted tables in parallel in joins Key: SPARK-3084 URL: https://issues.apache.org/jira/browse/SPARK-3084 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-2736) Create PySpark RDD from Apache Avro File

2014-08-14 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2736. -- Resolution: Fixed Fix Version/s: 1.1.0 > Create PySpark RDD from Apache Avro File >

[jira] [Commented] (SPARK-2736) Create PySpark RDD from Apache Avro File

2014-08-14 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097701#comment-14097701 ] Matei Zaharia commented on SPARK-2736: -- I bumped this up to "Major" because the PR al

[jira] [Updated] (SPARK-2736) Create PySpark RDD from Apache Avro File

2014-08-14 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2736: - Priority: Major (was: Minor) > Create PySpark RDD from Apache Avro File > --

[jira] [Resolved] (SPARK-2983) improve performance of sortByKey()

2014-08-13 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2983. -- Resolution: Fixed Fix Version/s: 1.1.0 > improve performance of sortByKey() > --

[jira] [Commented] (SPARK-2967) Several SQL unit test failed when sort-based shuffle is enabled

2014-08-12 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093845#comment-14093845 ] Matei Zaharia commented on SPARK-2967: -- Good catch, this is a difference in behavior

[jira] [Commented] (SPARK-2962) Suboptimal scheduling in spark

2014-08-10 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092418#comment-14092418 ] Matei Zaharia commented on SPARK-2962: -- I thought this was fixed in https://github.co

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2014-08-09 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091693#comment-14091693 ] Matei Zaharia commented on SPARK-2926: -- Basically because of these things, it would b

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2014-08-09 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091685#comment-14091685 ] Matei Zaharia commented on SPARK-2926: -- Hey Saisai, a couple of questions about this:

[jira] [Resolved] (SPARK-2787) Make sort-based shuffle write files directly when there is no sorting / aggregation and # of partitions is small

2014-08-07 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2787. -- Resolution: Fixed Fix Version/s: 1.1.0 > Make sort-based shuffle write files directly wh

[jira] [Updated] (SPARK-2887) RDD.countApproxDistinct() is wrong when RDD has more one partition

2014-08-06 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2887: - Assignee: Davies Liu > RDD.countApproxDistinct() is wrong when RDD has more one partition > -

[jira] [Resolved] (SPARK-2294) TaskSchedulerImpl and TaskSetManager do not properly prioritize which tasks get assigned to an executor

2014-08-05 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2294. -- Resolution: Fixed Fix Version/s: 1.1.0 > TaskSchedulerImpl and TaskSetManager do not pro

[jira] [Resolved] (SPARK-2711) Create a ShuffleMemoryManager that allocates across spilling collections in the same task

2014-08-05 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2711. -- Resolution: Fixed Fix Version/s: 1.1.0 > Create a ShuffleMemoryManager that allocates ac

[jira] [Resolved] (SPARK-2685) Update ExternalAppendOnlyMap to avoid buffer.remove()

2014-08-05 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2685. -- Resolution: Fixed Fix Version/s: 1.1.0 > Update ExternalAppendOnlyMap to avoid buffer.re

[jira] [Created] (SPARK-2856) Decrease initial buffer size for Kryo now that SPARK-2543 is in

2014-08-04 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-2856: Summary: Decrease initial buffer size for Kryo now that SPARK-2543 is in Key: SPARK-2856 URL: https://issues.apache.org/jira/browse/SPARK-2856 Project: Spark

[jira] [Resolved] (SPARK-1811) Support resizable output buffer for kryo serializer

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-1811. -- Resolution: Duplicate Fix Version/s: 1.1.0 > Support resizable output buffer for kryo se

[jira] [Commented] (SPARK-1811) Support resizable output buffer for kryo serializer

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085716#comment-14085716 ] Matei Zaharia commented on SPARK-1811: -- Closed this as a duplicate of https://issues.

[jira] [Updated] (SPARK-2685) Update ExternalAppendOnlyMap to avoid buffer.remove()

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2685: - Target Version/s: 1.1.0 > Update ExternalAppendOnlyMap to avoid buffer.remove() > ---

[jira] [Assigned] (SPARK-2685) Update ExternalAppendOnlyMap to avoid buffer.remove()

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia reassigned SPARK-2685: Assignee: Matei Zaharia > Update ExternalAppendOnlyMap to avoid buffer.remove() > -

[jira] [Updated] (SPARK-2792) Fix reading too much or too little data from each stream in ExternalMap / Sorter

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2792: - Assignee: Mridul Muralidharan > Fix reading too much or too little data from each stream in Exter

[jira] [Resolved] (SPARK-2792) Fix reading too much or too little data from each stream in ExternalMap / Sorter

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2792. -- Resolution: Fixed > Fix reading too much or too little data from each stream in ExternalMap /

[jira] [Assigned] (SPARK-2787) Make sort-based shuffle write files directly when there is no sorting / aggregation and # of partitions is small

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia reassigned SPARK-2787: Assignee: Matei Zaharia > Make sort-based shuffle write files directly when there is no sor

[jira] [Updated] (SPARK-2787) Make sort-based shuffle write files directly when there is no sorting / aggregation and # of partitions is small

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2787: - Target Version/s: 1.1.0 > Make sort-based shuffle write files directly when there is no sorting /

[jira] [Resolved] (SPARK-2116) Load spark-defaults.conf from directory specified by SPARK_CONF_DIR

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2116. -- Resolution: Fixed Fix Version/s: 1.1.0 > Load spark-defaults.conf from directory specifi

[jira] [Updated] (SPARK-2116) Load spark-defaults.conf from directory specified by SPARK_CONF_DIR

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2116: - Assignee: Albert Chu > Load spark-defaults.conf from directory specified by SPARK_CONF_DIR >

[jira] [Updated] (SPARK-2792) Fix reading too much or too little data from each stream in ExternalMap / Sorter

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2792: - Assignee: (was: Matei Zaharia) > Fix reading too much or too little data from each stream in

[jira] [Resolved] (SPARK-2684) Update ExternalAppendOnlyMap to take an iterator as input

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2684. -- Resolution: Fixed Fix Version/s: 1.1.0 > Update ExternalAppendOnlyMap to take an iterato

[jira] [Updated] (SPARK-2532) Fix issues with consolidated shuffle

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2532: - Description: Will file PR with changes as soon as merge is done (earlier merge became outda

[jira] [Commented] (SPARK-2791) Fix committing, reverting and state tracking in shuffle file consolidation

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082953#comment-14082953 ] Matei Zaharia commented on SPARK-2791: -- Ported from Mridul's patch by Aaron in https:

[jira] [Resolved] (SPARK-2791) Fix committing, reverting and state tracking in shuffle file consolidation

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2791. -- Resolution: Fixed > Fix committing, reverting and state tracking in shuffle file consolidation

[jira] [Updated] (SPARK-2791) Fix committing, reverting and state tracking in shuffle file consolidation

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2791: - Assignee: Mridul Muralidharan > Fix committing, reverting and state tracking in shuffle file cons

[jira] [Updated] (SPARK-2532) Fix issues with consolidated shuffle

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2532: - Fix Version/s: (was: 1.1.0) > Fix issues with consolidated shuffle >

[jira] [Resolved] (SPARK-1612) Potential resource leaks in Utils.copyStream and Utils.offsetBytes

2014-08-01 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-1612. -- Resolution: Fixed Fix Version/s: 1.1.0 > Potential resource leaks in Utils.copyStream an
