[jira] [Created] (SPARK-6050) Spark on YARN does not work --executor-cores is specified

2015-02-26 Thread Mridul Muralidharan (JIRA)
Mridul Muralidharan created SPARK-6050: -- Summary: Spark on YARN does not work --executor-cores is specified Key: SPARK-6050 URL: https://issues.apache.org/jira/browse/SPARK-6050 Project: Spark

Re: 2GB limit for partitions?

2015-02-04 Thread Mridul Muralidharan
to LargeByteBuffer, seems > promising. > > thanks, > Imran > > On Tue, Feb 3, 2015 at 7:32 PM, Mridul Muralidharan > wrote: > >> That is fairly out of date (we used to run some of our jobs on it ... But >> that is forked off 1.1 actually). >> >> Regard

Re: 2GB limit for partitions?

2015-02-03 Thread Mridul Muralidharan
That is fairly out of date (we used to run some of our jobs on it ... But that is forked off 1.1 actually). Regards Mridul On Tuesday, February 3, 2015, Imran Rashid wrote: > Thanks for the explanations, makes sense. For the record looks like this > was worked on a while back (and maybe the wo

Re: Welcoming three new committers

2015-02-03 Thread Mridul Muralidharan
Congratulations ! Keep up the good work :-) Regards Mridul On Tuesday, February 3, 2015, Matei Zaharia wrote: > Hi all, > > The PMC recently voted to add three new committers: Cheng Lian, Joseph > Bradley and Sean Owen. All three have been major contributors to Spark in > the past year: Cheng

Re: keeping PR titles / descriptions up to date

2014-12-02 Thread Mridul Muralidharan
I second that ! Would also be great if the JIRA was updated accordingly too. Regards, Mridul On Wed, Dec 3, 2014 at 1:53 AM, Kay Ousterhout wrote: > Hi all, > > I've noticed a bunch of times lately where a pull request changes to be > pretty different from the original pull request, and the tit

Re: Problems with spark.locality.wait

2014-11-13 Thread Mridul Muralidharan
CAL machine. Of > course, this would add a bunch of complexity to the TSM, hence the earlier > decision that the added complexity may not be worth it. > > -Kay > > On Thu, Nov 13, 2014 at 12:11 PM, Mridul Muralidharan > wrote: >> >> Instead of setting spark.loc

Re: Problems with spark.locality.wait

2014-11-13 Thread Mridul Muralidharan
Instead of setting spark.locality.wait, try setting individual locality waits specifically. Namely, spark.locality.wait.PROCESS_LOCAL to high value (so that process local tasks are always scheduled in case the task set has process local tasks). Set spark.locality.wait.NODE_LOCAL and spark.locality

[jira] [Commented] (SPARK-4030) `destroy` method in Broadcast should be public

2014-10-21 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178162#comment-14178162 ] Mridul Muralidharan commented on SPARK-4030: We have also needed to

[jira] [Comment Edited] (SPARK-3948) Sort-based shuffle can lead to assorted stream-corruption exceptions

2014-10-16 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173489#comment-14173489 ] Mridul Muralidharan edited comment on SPARK-3948 at 10/16/14 7:3

[jira] [Commented] (SPARK-3948) Sort-based shuffle can lead to assorted stream-corruption exceptions

2014-10-16 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173489#comment-14173489 ] Mridul Muralidharan commented on SPARK-3948: Damn, this sucks :

[jira] [Commented] (SPARK-3948) Sort-based shuffle can lead to assorted stream-corruption exceptions

2014-10-15 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173413#comment-14173413 ] Mridul Muralidharan commented on SPARK-3948: Not exactly, what I

[jira] [Commented] (SPARK-3948) Sort-based shuffle can lead to assorted stream-corruption exceptions

2014-10-15 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172585#comment-14172585 ] Mridul Muralidharan commented on SPARK-3948: [~jerryshao] great work

[jira] [Commented] (SPARK-3948) Sort-based shuffle can lead to assorted stream-corruption exceptions

2014-10-15 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172226#comment-14172226 ] Mridul Muralidharan commented on SPARK-3948: Note, "t" is just

[jira] [Commented] (SPARK-3948) Sort-based shuffle can lead to assorted stream-corruption exceptions

2014-10-15 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172225#comment-14172225 ] Mridul Muralidharan commented on SPARK-3948: That is weird, I tri

[jira] [Commented] (SPARK-3948) Sort-based shuffle can lead to assorted stream-corruption exceptions

2014-10-15 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172134#comment-14172134 ] Mridul Muralidharan commented on SPARK-3948: [~jerryshao] Just to cla

[jira] [Commented] (SPARK-3948) Sort-based shuffle can lead to assorted stream-corruption exceptions

2014-10-15 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172116#comment-14172116 ] Mridul Muralidharan commented on SPARK-3948: [~joshrosen] Assuming there

[jira] [Commented] (SPARK-3889) JVM dies with SIGBUS, resulting in ConnectionManager failed ACK

2014-10-10 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167303#comment-14167303 ] Mridul Muralidharan commented on SPARK-3889: The status says fixed - what

Re: Breaking the previous large-scale sort record with Spark

2014-10-10 Thread Mridul Muralidharan
Brilliant stuff ! Congrats all :-) This is indeed really heartening news ! Regards, Mridul On Fri, Oct 10, 2014 at 8:24 PM, Matei Zaharia wrote: > Hi folks, > > I interrupt your regularly scheduled user / dev list to bring you some pretty > cool news for the project, which is that we've been a

[jira] [Commented] (SPARK-3847) Enum.hashCode is only consistent within the same JVM

2014-10-08 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164775#comment-14164775 ] Mridul Muralidharan commented on SPARK-3847: [~joshrosen] array hash

[jira] [Commented] (SPARK-3561) Allow for pluggable execution contexts in Spark

2014-10-08 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164095#comment-14164095 ] Mridul Muralidharan commented on SPARK-3561: I agree with [~pwendell]

[jira] [Commented] (SPARK-3847) Enum.hashCode is only consistent within the same JVM

2014-10-08 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163407#comment-14163407 ] Mridul Muralidharan commented on SPARK-3847: Wow, nice bug ! Thi

[jira] [Commented] (SPARK-3561) Allow for pluggable execution contexts in Spark

2014-10-08 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163376#comment-14163376 ] Mridul Muralidharan commented on SPARK-3561: [~ozhurakousky] I think

[jira] [Commented] (SPARK-3561) Allow for pluggable execution contexts in Spark

2014-10-08 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163351#comment-14163351 ] Mridul Muralidharan commented on SPARK-3561: [~pwendell] If I understood

[jira] [Commented] (SPARK-3785) Support off-loading computations to a GPU

2014-10-08 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163326#comment-14163326 ] Mridul Muralidharan commented on SPARK-3785: [~sowen] We had prototyp

[jira] [Commented] (SPARK-3714) Spark workflow scheduler

2014-09-29 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151486#comment-14151486 ] Mridul Muralidharan commented on SPARK-3714: Most of the drawbacks menti

[jira] [Commented] (SPARK-3714) Spark workflow scheduler

2014-09-28 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151211#comment-14151211 ] Mridul Muralidharan commented on SPARK-3714: Have you tried using oozie

[jira] [Commented] (SPARK-1956) Enable shuffle consolidation by default

2014-09-07 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125027#comment-14125027 ] Mridul Muralidharan commented on SPARK-1956: The recent change

[jira] [Commented] (SPARK-1476) 2GB limit in spark for blocks

2014-09-02 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119385#comment-14119385 ] Mridul Muralidharan commented on SPARK-1476: WIP version pushed to h

[jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface

2014-09-02 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119365#comment-14119365 ] Mridul Muralidharan commented on SPARK-3019: I will try to push the ver

[jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface

2014-09-02 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119362#comment-14119362 ] Mridul Muralidharan commented on SPARK-3019: Just went over the proposa

Re: [VOTE] Release Apache Spark 1.1.0 (RC1)

2014-08-28 Thread Mridul Muralidharan
and we'll patch it > and spin a new RC. We can also update the test coverage to cover LZ4. > > - Patrick > > On Thu, Aug 28, 2014 at 9:27 AM, Mridul Muralidharan > wrote: > > Is SPARK-3277 applicable to 1.1 ? > > If yes, until it is fixed, I am -1 on the releas

[jira] [Commented] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114484#comment-14114484 ] Mridul Muralidharan commented on SPARK-3277: Sounds great, thx ! I suspec

[jira] [Commented] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114026#comment-14114026 ] Mridul Muralidharan commented on SPARK-3277: [~hzw] did you notice

[jira] [Comment Edited] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114022#comment-14114022 ] Mridul Muralidharan edited comment on SPARK-3277 at 8/28/14 5:3

[jira] [Updated] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan updated SPARK-3277: --- Attachment: test_lz4_bug.patch Against master, though I noticed similar changes in

[jira] [Commented] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114014#comment-14114014 ] Mridul Muralidharan commented on SPARK-3277: [~matei] Attaching a patch w

[jira] [Updated] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan updated SPARK-3277: --- Priority: Blocker (was: Major) > LZ4 compression cause the the ExternalS

[jira] [Updated] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan updated SPARK-3277: --- Affects Version/s: 1.2.0 1.1.0 > LZ4 compression cause

Re: [VOTE] Release Apache Spark 1.1.0 (RC1)

2014-08-28 Thread Mridul Muralidharan
Is SPARK-3277 applicable to 1.1 ? If yes, until it is fixed, I am -1 on the release (I am on break, so can't verify or help fix, sorry). Regards Mridul On 28-Aug-2014 9:33 pm, "Patrick Wendell" wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.1.0! > > The ta

[jira] [Commented] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113741#comment-14113741 ] Mridul Muralidharan commented on SPARK-3277: This looks like unrel

[jira] [Commented] (SPARK-3175) Branch-1.1 SBT build failed for Yarn-Alpha

2014-08-23 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107923#comment-14107923 ] Mridul Muralidharan commented on SPARK-3175: Please add more informatio

Re: is Branch-1.1 SBT build broken for yarn-alpha ?

2014-08-21 Thread Mridul Muralidharan
Weird that Patrick did not face this while creating the RC. Essentially the yarn alpha pom.xml has not been updated properly in the 1.1 branch. Just change version to '1.1.1-SNAPSHOT' for yarn/alpha/pom.xml (to make it same as any other pom). Regards, Mridul On Thu, Aug 21, 2014 at 5:09 AM, Ch

[jira] [Commented] (SPARK-3115) Improve task broadcast latency for small tasks

2014-08-19 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102142#comment-14102142 ] Mridul Muralidharan commented on SPARK-3115: I had a tab open with pr

[jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface

2014-08-17 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100043#comment-14100043 ] Mridul Muralidharan commented on SPARK-3019: Unfortunately, I never went

[jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface

2014-08-17 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099910#comment-14099910 ] Mridul Muralidharan commented on SPARK-3019: Btw, can we do something a

[jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface

2014-08-17 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099909#comment-14099909 ] Mridul Muralidharan commented on SPARK-3019: I am yet to go through

[jira] [Commented] (SPARK-2089) With YARN, preferredNodeLocalityData isn't honored

2014-08-15 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099115#comment-14099115 ] Mridul Muralidharan commented on SPARK-2089: For a general case,

[jira] [Commented] (SPARK-1476) 2GB limit in spark for blocks

2014-08-15 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099072#comment-14099072 ] Mridul Muralidharan commented on SPARK-1476: Based on discussions we had

[jira] [Commented] (SPARK-2089) With YARN, preferredNodeLocalityData isn't honored

2014-08-13 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095247#comment-14095247 ] Mridul Muralidharan commented on SPARK-2089: Since I am not maintaining

[jira] [Commented] (SPARK-2962) Suboptimal scheduling in spark

2014-08-11 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092807#comment-14092807 ] Mridul Muralidharan commented on SPARK-2962: On further investigation

[jira] [Comment Edited] (SPARK-2962) Suboptimal scheduling in spark

2014-08-10 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092427#comment-14092427 ] Mridul Muralidharan edited comment on SPARK-2962 at 8/11/14 4:3

[jira] [Commented] (SPARK-2962) Suboptimal scheduling in spark

2014-08-10 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092431#comment-14092431 ] Mridul Muralidharan commented on SPARK-2962: Note, I don't think this

[jira] [Commented] (SPARK-2962) Suboptimal scheduling in spark

2014-08-10 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092430#comment-14092430 ] Mridul Muralidharan commented on SPARK-2962: Hi [~matei], I am referen

[jira] [Commented] (SPARK-2962) Suboptimal scheduling in spark

2014-08-10 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092427#comment-14092427 ] Mridul Muralidharan commented on SPARK-2962: To give more context; a)

[jira] [Created] (SPARK-2962) Suboptimal scheduling in spark

2014-08-10 Thread Mridul Muralidharan (JIRA)
Mridul Muralidharan created SPARK-2962: -- Summary: Suboptimal scheduling in spark Key: SPARK-2962 URL: https://issues.apache.org/jira/browse/SPARK-2962 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091881#comment-14091881 ] Mridul Muralidharan commented on SPARK-2931: [~joshrosen] [~kayouster

[jira] [Updated] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan updated SPARK-2931: --- Attachment: test.patch A patch to showcase the exception > getAllowedLocalityLe

[jira] [Commented] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091849#comment-14091849 ] Mridul Muralidharan commented on SPARK-2931: Checking more, it might

Re: Unit tests in < 5 minutes

2014-08-09 Thread Mridul Muralidharan
Issue with supporting this imo is the fact that scala-test uses the same vm for all the tests (surefire plugin supports fork, but scala-test ignores it iirc). So different tests would initialize different spark contexts, and can potentially step on each other's toes. Regards, Mridul On Fri, Aug 8,

[jira] [Commented] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091746#comment-14091746 ] Mridul Muralidharan commented on SPARK-2931: [~kayousterhout] this is w

[jira] [Comment Edited] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-06 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088018#comment-14088018 ] Mridul Muralidharan edited comment on SPARK-2881 at 8/6/14 6:4

[jira] [Commented] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-06 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088018#comment-14088018 ] Mridul Muralidharan commented on SPARK-2881: To add, this will affect s

Re: -1s on pull requests?

2014-08-05 Thread Mridul Muralidharan
Just came across this mail, thanks for initiating this discussion Kay. To add: another issue which recurs is very rapid commits, before most contributors have had a chance to even look at the changes proposed. There is not much prior discussion on the jira or pr, and the time between submitting th

[jira] [Commented] (SPARK-2685) Update ExternalAppendOnlyMap to avoid buffer.remove()

2014-07-25 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074186#comment-14074186 ] Mridul Muralidharan commented on SPARK-2685: We moved to u

[jira] [Created] (SPARK-2532) Fix issues with consolidated shuffle

2014-07-16 Thread Mridul Muralidharan (JIRA)
Mridul Muralidharan created SPARK-2532: -- Summary: Fix issues with consolidated shuffle Key: SPARK-2532 URL: https://issues.apache.org/jira/browse/SPARK-2532 Project: Spark Issue Type

[jira] [Commented] (SPARK-2468) zero-copy shuffle network communication

2014-07-14 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061094#comment-14061094 ] Mridul Muralidharan commented on SPARK-2468: Ah, small files - those

Re: better compression codecs for shuffle blocks?

2014-07-14 Thread Mridul Muralidharan
We tried with lower block size for lzf, but it barfed all over the place. Snappy was the way to go for our jobs. Regards, Mridul On Mon, Jul 14, 2014 at 12:31 PM, Reynold Xin wrote: > Hi Spark devs, > > I was looking into the memory usage of shuffle and one annoying thing is > the default comp

[jira] [Commented] (SPARK-2468) zero-copy shuffle network communication

2014-07-14 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060545#comment-14060545 ] Mridul Muralidharan commented on SPARK-2468: Writing mmap'ed bu

[jira] [Commented] (SPARK-2468) zero-copy shuffle network communication

2014-07-14 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060543#comment-14060543 ] Mridul Muralidharan commented on SPARK-2468: We map the file content

[jira] [Commented] (SPARK-2398) Trouble running Spark 1.0 on Yarn

2014-07-13 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060113#comment-14060113 ] Mridul Muralidharan commented on SPARK-2398: As discussed in the PR,

Re: [GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread Mridul Muralidharan
You are lucky :-) for some of our jobs, in a 8gb container, overhead is 1.8gb ! On 13-Jul-2014 2:40 pm, "nishkamravi2" wrote: > Github user nishkamravi2 commented on the pull request: > > https://github.com/apache/spark/pull/1391#issuecomment-48835560 > > Sean, the memory_overhead is fair

Re: CPU/Disk/network performance instrumentation

2014-07-09 Thread Mridul Muralidharan
+1 on advanced mode ! Regards. Mridul On Thu, Jul 10, 2014 at 12:55 AM, Reynold Xin wrote: > Maybe it's time to create an advanced mode in the ui. > > > On Wed, Jul 9, 2014 at 12:23 PM, Kay Ousterhout > wrote: > >> Hi all, >> >> I've been doing a bunch of performance measurement of Spark and, a

Unresponsive to PR/jira changes

2014-07-09 Thread Mridul Muralidharan
Hi, I noticed today that gmail has been marking most of the mails from spark github/jira I was receiving as spam; and I was assuming it was a lull in activity due to spark summit for the past few weeks ! In case I have commented on specific PR/JIRA issues and not followed up, apologies for th

Re: on shark, is tachyon less efficient than memory_only cache strategy ?

2014-07-08 Thread Mridul Muralidharan
You are ignoring serde costs :-) - Mridul On Tue, Jul 8, 2014 at 8:48 PM, Aaron Davidson wrote: > Tachyon should only be marginally less performant than memory_only, because > we mmap the data from Tachyon's ramdisk. We do not have to, say, transfer > the data over a pipe from Tachyon; we can di

[jira] [Commented] (SPARK-2390) Files in staging directory cannot be deleted and wastes the space of HDFS

2014-07-07 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054162#comment-14054162 ] Mridul Muralidharan commented on SPARK-2390: Here, and a bunch of o

[jira] [Commented] (SPARK-2017) web ui stage page becomes unresponsive when the number of tasks is large

2014-07-04 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052679#comment-14052679 ] Mridul Muralidharan commented on SPARK-2017: Sounds great, ability to ge

Re: Eliminate copy while sending data : any Akka experts here ?

2014-07-04 Thread Mridul Muralidharan
size = 0 using a compressed bitmap. That way we can still avoid > requests for zero-sized blocks. > > > > On Thu, Jul 3, 2014 at 3:12 PM, Reynold Xin wrote: > >> Yes, that number is likely == 0 in any real workload ... >> >> >> On Thu, Jul 3, 2014 at 8:01 AM, Mr

[jira] [Commented] (SPARK-2017) web ui stage page becomes unresponsive when the number of tasks is large

2014-07-04 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052289#comment-14052289 ] Mridul Muralidharan commented on SPARK-2017: With aggregated metrics

[jira] [Commented] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-07-04 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052275#comment-14052275 ] Mridul Muralidharan commented on SPARK-2277: Hmm, good point - that PR

[jira] [Updated] (SPARK-2353) ArrayIndexOutOfBoundsException in scheduler

2014-07-03 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan updated SPARK-2353: --- Description: I suspect the recent changes from SPARK-1937 to compute valid locality

[jira] [Commented] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-07-03 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051575#comment-14051575 ] Mridul Muralidharan commented on SPARK-2277: I have not rechecked that

Re: Eliminate copy while sending data : any Akka experts here ?

2014-07-03 Thread Mridul Muralidharan
On Thu, Jul 3, 2014 at 11:32 AM, Reynold Xin wrote: > On Wed, Jul 2, 2014 at 3:44 AM, Mridul Muralidharan > wrote: > >> >> > >> > The other thing we do need is the location of blocks. This is actually >> just >> > O(n) because we just ne

[jira] [Created] (SPARK-2353) ArrayIndexOutOfBoundsException in scheduler

2014-07-03 Thread Mridul Muralidharan (JIRA)
Mridul Muralidharan created SPARK-2353: -- Summary: ArrayIndexOutOfBoundsException in scheduler Key: SPARK-2353 URL: https://issues.apache.org/jira/browse/SPARK-2353 Project: Spark Issue

[jira] [Commented] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-07-02 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050886#comment-14050886 ] Mridul Muralidharan commented on SPARK-2277: I am not sure I follow

Re: Eliminate copy while sending data : any Akka experts here ?

2014-07-02 Thread Mridul Muralidharan
r a reducer (and lack of ability to throttle). Regards, Mridul > > > On Tue, Jul 1, 2014 at 2:51 AM, Mridul Muralidharan > wrote: > >> We had considered both approaches (if I understood the suggestions right) : >> a) Pulling only map output states for tasks which run on t

Re: Eliminate copy while sending data : any Akka experts here ?

2014-07-02 Thread Mridul Muralidharan
Hi Patrick, Please see inline. Regards, Mridul On Wed, Jul 2, 2014 at 10:52 AM, Patrick Wendell wrote: >> b) Instead of pulling this information, push it to executors as part >> of task submission. (What Patrick mentioned ?) >> (1) a.1 from above is still an issue for this. > > I don't under

Re: Eliminate copy while sending data : any Akka experts here ?

2014-07-01 Thread Mridul Muralidharan
Do note that your solution of using broadcast to send the map tasks is very >> similar to how the executor returns the result of a task when it's too big >> for akka. We were thinking of refactoring this too, as using the block >> manager has much higher latency than a dir

Re: Eliminate copy while sending data : any Akka experts here ?

2014-06-30 Thread Mridul Muralidharan
different workers requesting for the output statuses for shuffle (after map) - so not sure if back pressure buffers, etc would help. Regards, Mridul On Mon, Jun 30, 2014 at 11:07 PM, Mridul Muralidharan wrote: > Hi, > > While sending map output tracker result, the same serialized byte &

Eliminate copy while sending data : any Akka experts here ?

2014-06-30 Thread Mridul Muralidharan
Hi, While sending map output tracker result, the same serialized byte array is sent multiple times - but the akka implementation copies it to a private byte array within ByteString for each send. Caching a ByteString instead of Array[Byte] did not help, since akka does not support special casing

[jira] [Commented] (SPARK-2294) TaskSchedulerImpl and TaskSetManager do not properly prioritize which tasks get assigned to an executor

2014-06-26 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045433#comment-14045433 ] Mridul Muralidharan commented on SPARK-2294: I agree; We should bum

[jira] [Commented] (SPARK-2268) Utils.createTempDir() creates race with HDFS at shutdown

2014-06-24 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043088#comment-14043088 ] Mridul Muralidharan commented on SPARK-2268: That is not because of this

[jira] [Commented] (SPARK-2268) Utils.createTempDir() creates race with HDFS at shutdown

2014-06-24 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043071#comment-14043071 ] Mridul Muralidharan commented on SPARK-2268: Setting priority for shut

[jira] [Updated] (SPARK-1476) 2GB limit in spark for blocks

2014-06-24 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan updated SPARK-1476: --- Attachment: 2g_fix_proposal.pdf Proposal detailing the work we have done on this

Re: [jira] [Created] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2014-06-23 Thread Mridul Muralidharan
dul, > > Can you comment a little bit more on this issue? We are running into the > same stack trace but not sure whether it is just different Spark versions > on each cluster (doesn't seem likely) or a bug in Spark. > > Thanks. > > > > On Sat, May 17, 2014 at 4:41 AM

[jira] [Commented] (SPARK-2223) Building and running tests with maven is extremely slow

2014-06-22 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040094#comment-14040094 ] Mridul Muralidharan commented on SPARK-2223: I usually do : $ mvn -P

[jira] [Commented] (SPARK-2089) With YARN, preferredNodeLocalityData isn't honored

2014-06-22 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040085#comment-14040085 ] Mridul Muralidharan commented on SPARK-2089: A few of things to be kep

[jira] [Comment Edited] (SPARK-704) ConnectionManager sometimes cannot detect loss of sending connections

2014-06-21 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039742#comment-14039742 ] Mridul Muralidharan edited comment on SPARK-704 at 6/21/14 9:1
