[jira] [Commented] (TEZ-2119) Counter for launched containers

2020-03-06 Thread Bikas Saha (Jira)
[ https://issues.apache.org/jira/browse/TEZ-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053860#comment-17053860 ] Bikas Saha commented on TEZ-2119: - Been a while. The intent of total_used might have been to maintain the

[jira] [Commented] (TEZ-1786) Support for speculation of slow tasks

2017-09-06 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156018#comment-16156018 ] Bikas Saha commented on TEZ-1786: - That's correct. > Support for speculation of slow tasks >

[jira] [Commented] (TEZ-3770) DAG-aware YARN task scheduler

2017-06-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065756#comment-16065756 ] Bikas Saha commented on TEZ-3770: - Just clarifying that the original scheduler was not made dag aware by

[jira] [Commented] (TEZ-394) Better scheduling for uneven DAGs

2017-05-30 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030365#comment-16030365 ] Bikas Saha commented on TEZ-394: Not sure I understood this correctly. bq.V1->V3->V4->V5 bq.V2->V5 bq.V6->V7

[jira] [Commented] (TEZ-3696) Jobs can hang when both concurrency and speculation are enabled

2017-05-04 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997706#comment-15997706 ] Bikas Saha commented on TEZ-3696: - Thanks [~ebadger]! I missed that part of the code. Makes sense. > Jobs

[jira] [Commented] (TEZ-3696) Jobs can hang when both concurrency and speculation are enabled

2017-05-02 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993419#comment-15993419 ] Bikas Saha commented on TEZ-3696: - Thanks for the ping. Looking at the code again, I am not sure why I had

[jira] [Comment Edited] (TEZ-394) Better scheduling for uneven DAGs

2017-02-14 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865631#comment-15865631 ] Bikas Saha edited comment on TEZ-394 at 2/14/17 12:30 PM: -- Thanks for doing this! I

[jira] [Commented] (TEZ-394) Better scheduling for uneven DAGs

2017-02-14 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865631#comment-15865631 ] Bikas Saha commented on TEZ-394: Thanks for doing this! I regret not having done this right from the start.

[jira] [Commented] (TEZ-3512) Update EdgePlan proto for named edge

2016-12-21 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768737#comment-15768737 ] Bikas Saha commented on TEZ-3512: - How can we be sure that SrcDest or DestSrc set by the AM will not

[jira] [Commented] (TEZ-3512) Update EdgePlan proto for named edge

2016-12-21 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768614#comment-15768614 ] Bikas Saha commented on TEZ-3512: - I can see that in the patch :) But what will the value be for these null

[jira] [Commented] (TEZ-3512) Update EdgePlan proto for named edge

2016-12-20 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15765960#comment-15765960 ] Bikas Saha commented on TEZ-3512: - bq. Default value is inappropriate because any default value may also be

[jira] [Commented] (TEZ-3512) Update EdgePlan proto for named edge

2016-12-10 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15738699#comment-15738699 ] Bikas Saha commented on TEZ-3512: - When the DAG is being compiled on the client side, a default value could

[jira] [Commented] (TEZ-3222) Reduce messaging overhead for auto-reduce parallelism case

2016-11-30 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709837#comment-15709837 ] Bikas Saha commented on TEZ-3222: - bq. routeInputSourceTaskFailedEventToDestination I think this could be

[jira] [Commented] (TEZ-3222) Reduce messaging overhead for auto-reduce parallelism case

2016-11-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671995#comment-15671995 ] Bikas Saha commented on TEZ-3222: - Sounds good! Thanks! > Reduce messaging overhead for auto-reduce

[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertices

2016-11-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671986#comment-15671986 ] Bikas Saha commented on TEZ-1190: - Still don't understand why making named/unnamed exclusive is going to

[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertices

2016-11-09 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652811#comment-15652811 ] Bikas Saha commented on TEZ-1190: - How is the restriction of either all named or unnamed helpful? How about

[jira] [Comment Edited] (TEZ-1190) Allow multiple edges between two vertices

2016-10-31 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623227#comment-15623227 ] Bikas Saha edited comment on TEZ-1190 at 10/31/16 8:15 PM: --- +1 for design doc. A

[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertices

2016-10-31 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623227#comment-15623227 ] Bikas Saha commented on TEZ-1190: - +1 for design doc. A while back we had discussed about this and thought

[jira] [Commented] (TEZ-3222) Reduce messaging overhead for auto-reduce parallelism case

2016-10-19 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590836#comment-15590836 ] Bikas Saha commented on TEZ-3222: - {code} -return commonRouteMeta[sourceTaskIndex]; +return

[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-09-19 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504112#comment-15504112 ] Bikas Saha commented on TEZ-3163: - /cc [~hitesh] [~aplusplus] > Reuse and tune Inflaters and Deflaters to

[jira] [Updated] (TEZ-3388) Provide error information in shuffle response header

2016-07-29 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-3388: Description: In MR shuffle, if any partition has an error then the reader gets an exception while reading the

[jira] [Created] (TEZ-3388) Provide error information in shuffle response header

2016-07-29 Thread Bikas Saha (JIRA)
Bikas Saha created TEZ-3388: --- Summary: Provide error information in shuffle response header Key: TEZ-3388 URL: https://issues.apache.org/jira/browse/TEZ-3388 Project: Apache Tez Issue Type:

[jira] [Commented] (TEZ-3317) Speculative execution starts too early due to 0 progress

2016-07-18 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383436#comment-15383436 ] Bikas Saha commented on TEZ-3317: - Sorry, I did not understand what the issue is here from the above comment.

[jira] [Commented] (TEZ-3334) Tez Custom Shuffle Handler

2016-07-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374071#comment-15374071 ] Bikas Saha commented on TEZ-3334: - Also reporting errors properly in the response such that 1 error does not

[jira] [Commented] (TEZ-3334) Tez Custom Shuffle Handler

2016-07-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374065#comment-15374065 ] Bikas Saha commented on TEZ-3334: - YARN-4577 for classpath isolation of aux services. Perhaps the first

[jira] [Comment Edited] (TEZ-1248) Reduce slow-start should special case 1 reducer runs

2016-07-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371643#comment-15371643 ] Bikas Saha edited comment on TEZ-1248 at 7/11/16 9:16 PM: -- lgtm. seems like a

[jira] [Commented] (TEZ-1248) Reduce slow-start should special case 1 reducer runs

2016-07-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371643#comment-15371643 ] Bikas Saha commented on TEZ-1248: - lgtm. seems like a simple code change whose side-effect produces the

[jira] [Commented] (TEZ-3334) Tez Custom Shuffle Handler

2016-07-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371277#comment-15371277 ] Bikas Saha commented on TEZ-3334: - +1. The new YARN aux service isolation work should make this easier to

[jira] [Commented] (TEZ-3287) Have UnorderedPartitionedKVWriter honor tez.runtime.empty.partitions.info-via-events.enabled

2016-06-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15351682#comment-15351682 ] Bikas Saha commented on TEZ-3287: - [~rajesh.balamohan] [~sseth] please help review > Have

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-21 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342583#comment-15342583 ] Bikas Saha commented on TEZ-3291: - Sure. Let's create a follow-up jira. > Optimize splits grouping when

[jira] [Commented] (TEZ-3296) Tez job can hang if two vertices at the same root distance have different task requirements

2016-06-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335081#comment-15335081 ] Bikas Saha commented on TEZ-3296: - Thanks! It's clear now. > Tez job can hang if two vertices at the same

[jira] [Commented] (TEZ-3296) Tez job can hang if two vertices at the same root distance have different task requirements

2016-06-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334623#comment-15334623 ] Bikas Saha commented on TEZ-3296: - Ah. Looks like a result of using priority as a key for unique requests vs

[jira] [Comment Edited] (TEZ-3296) Tez job can hang if two vertices at the same root distance have different task requirements

2016-06-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334623#comment-15334623 ] Bikas Saha edited comment on TEZ-3296 at 6/16/16 8:29 PM: -- Ah. Looks like a result

[jira] [Comment Edited] (TEZ-3296) Tez job can hang if two vertices at the same root distance have different task requirements

2016-06-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334492#comment-15334492 ] Bikas Saha edited comment on TEZ-3296 at 6/16/16 7:20 PM: -- Sure. Let's commit this

[jira] [Commented] (TEZ-3296) Tez job can hang if two vertices at the same root distance have different task requirements

2016-06-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334492#comment-15334492 ] Bikas Saha commented on TEZ-3296: - Sure. Let's commit this patch. Could you please attach the task scheduler

[jira] [Commented] (TEZ-3296) Tez job can hang if two vertices at the same root distance have different task requirements

2016-06-13 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328921#comment-15328921 ] Bikas Saha commented on TEZ-3296: - Sorry. My bad. I even used a calculator for that :P If this is urgent I

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-13 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328326#comment-15328326 ] Bikas Saha commented on TEZ-3291: - I am with Gopal on the fragility of this workaround. Single machine is

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326689#comment-15326689 ] Bikas Saha commented on TEZ-3291: - The comment could be more explicit like "this is a workaround for systems

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326684#comment-15326684 ] Bikas Saha commented on TEZ-3291: - Would the split not have the URLs with S3 in them? Wondering how ORC

[jira] [Commented] (TEZ-3296) Tez job can hang if two vertices at the same root distance have different task requirements

2016-06-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326673#comment-15326673 ] Bikas Saha commented on TEZ-3296: - bq. Today each vertex uses a set of three priority values, the low, the

[jira] [Commented] (TEZ-3297) Deadlock scenario in AM during ShuffleVertexManager auto reduce

2016-06-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326660#comment-15326660 ] Bikas Saha commented on TEZ-3297: - looking at the code further, looks like the crucial change is not holding

[jira] [Commented] (TEZ-3216) Support for more precise partition stats in VertexManagerEvent

2016-06-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326650#comment-15326650 ] Bikas Saha commented on TEZ-3216: - /cc [~rajesh.balamohan] in case he is interested in this optimization. >

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326649#comment-15326649 ] Bikas Saha commented on TEZ-3291: - Why the numLoc=1 check only in the size < min case? A comment before the

[jira] [Commented] (TEZ-3300) Tez UI: A wiki must be created with info about each page in Tez UI

2016-06-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326638#comment-15326638 ] Bikas Saha commented on TEZ-3300: - Could pages to the wiki be linked directly from the UI page for quick

[jira] [Comment Edited] (TEZ-3300) Tez UI: A wiki must be created with info about each page in Tez UI

2016-06-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326638#comment-15326638 ] Bikas Saha edited comment on TEZ-3300 at 6/12/16 9:22 PM: -- Could pages to the wiki

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15325748#comment-15325748 ] Bikas Saha commented on TEZ-3291: - [~rajesh.balamohan] Is the patch still WIP or ready for final review? >

[jira] [Commented] (TEZ-3296) Tez job can hang if two vertices at the same root distance have different task requirements

2016-06-10 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324989#comment-15324989 ] Bikas Saha commented on TEZ-3296: - Could you please help me understand the logic to make these unique. I am

[jira] [Commented] (TEZ-3297) Deadlock scenario in AM during ShuffleVertexManager auto reduce

2016-06-10 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324981#comment-15324981 ] Bikas Saha commented on TEZ-3297: - I am not sure we can simply remove the lock since it may affect

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-09 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323264#comment-15323264 ] Bikas Saha commented on TEZ-3291: - I will take a quick look at the patch by EOD. Looks like the main issue

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-07 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319263#comment-15319263 ] Bikas Saha commented on TEZ-3291: - Then that would be a bug to fix. Hopefully that's what the patch is doing.

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-07 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15318941#comment-15318941 ] Bikas Saha commented on TEZ-3291: - IIRC they should because localhost will be treated as a valid machine

[jira] [Commented] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-06 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316870#comment-15316870 ] Bikas Saha commented on TEZ-3291: - Since the data fits within the max size for a grouped split its creating

[jira] [Commented] (TEZ-3271) Provide mapreduce failures.maxpercent equivalent

2016-06-01 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1536#comment-1536 ] Bikas Saha commented on TEZ-3271: - It would help to have a bit more detail on what the objective is here.

[jira] [Commented] (TEZ-3274) Vertex with MRInput and shuffle input does not respect slow start

2016-05-26 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302549#comment-15302549 ] Bikas Saha commented on TEZ-3274: - There probably isn't. We could use this one. Or if you need an urgent

[jira] [Commented] (TEZ-3274) Vertex with MRInput and shuffle input does not respect slow start

2016-05-25 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300753#comment-15300753 ] Bikas Saha commented on TEZ-3274: - This is a known limitation. The ideal solution is to split the

[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2016-05-19 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291917#comment-15291917 ] Bikas Saha commented on TEZ-2950: - bq. 2. Rely on pipelined shuffle to avoid the final merge. Per old

[jira] [Commented] (TEZ-3222) Reduce messaging overhead for auto-reduce parallelism case

2016-05-18 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290228#comment-15290228 ] Bikas Saha commented on TEZ-3222: - Thanks for the update! And sorry for the delayed response. {code}@@

[jira] [Commented] (TEZ-3242) Reduce bytearray copy with TezEvent Serialization and deserialization

2016-05-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280723#comment-15280723 ] Bikas Saha commented on TEZ-3242: - lgtm. > Reduce bytearray copy with TezEvent Serialization and

[jira] [Commented] (TEZ-3244) Allow overlap of input and output memory when they are not concurrent

2016-05-06 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274621#comment-15274621 ] Bikas Saha commented on TEZ-3244: - Nice idea! This definitely works for the case where the processor is

[jira] [Commented] (TEZ-3239) ShuffleVertexManager recovery issue when auto parallelism is enabled

2016-05-02 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267913#comment-15267913 ] Bikas Saha commented on TEZ-3239: - Barring a bug, this should not be happening in the new recovery design.

[jira] [Commented] (TEZ-3203) DAG hangs when one of the upstream vertices has zero tasks

2016-04-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261376#comment-15261376 ] Bikas Saha commented on TEZ-3203: - Now looking at the full code based on the findbugs I think I don't know

[jira] [Commented] (TEZ-3203) DAG hangs when one of the upstream vertices has zero tasks

2016-04-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261136#comment-15261136 ] Bikas Saha commented on TEZ-3203: - Uploaded new patch. Credit for the jira and patch goes to Jason entirely.

[jira] [Updated] (TEZ-3203) DAG hangs when one of the upstream vertices has zero tasks

2016-04-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-3203: Attachment: TEZ-3203.3.patch > DAG hangs when one of the upstream vertices has zero tasks >

[jira] [Commented] (TEZ-3203) DAG hangs when one of the upstream vertices has zero tasks

2016-04-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261125#comment-15261125 ] Bikas Saha commented on TEZ-3203: - My bad. I should have been more clear. The following would be safer than

[jira] [Commented] (TEZ-2104) A CrossProductEdge which produces synthetic cross-product parallelism

2016-04-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15260681#comment-15260681 ] Bikas Saha commented on TEZ-2104: - bq. Sorry for the inconsistency. Slow start only make sense for

[jira] [Commented] (TEZ-3232) Disable randomFailingInputs in testFaulttolerance to unblock other tests

2016-04-26 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258790#comment-15258790 ] Bikas Saha commented on TEZ-3232: - lgtm > Disable randomFailingInputs in testFaulttolerance to unblock

[jira] [Commented] (TEZ-3219) Allow service plugins to define log locations link for remotely run task attempts

2016-04-25 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256846#comment-15256846 ] Bikas Saha commented on TEZ-3219: - Is there any issue in having the YARN based plugins provide the existing

[jira] [Comment Edited] (TEZ-3222) Reduce messaging overhead for auto-reduce parallelism case

2016-04-21 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252975#comment-15252975 ] Bikas Saha edited comment on TEZ-3222 at 4/21/16 11:12 PM: --- ShuffleVertexManager,

[jira] [Commented] (TEZ-3222) Reduce messaging overhead for auto-reduce parallelism case

2016-04-21 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252975#comment-15252975 ] Bikas Saha commented on TEZ-3222: - ShuffleVertexManager, in theory, is a user land object and packaged in

[jira] [Commented] (TEZ-3203) DAG hangs when one of the upstream vertices has zero tasks

2016-04-07 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231460#comment-15231460 ] Bikas Saha commented on TEZ-3203: - Good catch! Maybe we can get away with removing the numpendingtasks

[jira] [Commented] (TEZ-3198) Shuffle failures for the trailing task in a vertex are often fatal to the entire DAG

2016-04-07 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230773#comment-15230773 ] Bikas Saha commented on TEZ-3198: - Yes. Looks like our defaults can be better for real life workloads. >

[jira] [Commented] (TEZ-3161) Allow task to report different kinds of errors - fatal / kill

2016-04-05 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227749#comment-15227749 ] Bikas Saha commented on TEZ-3161: - Does a fatal error affect the recovery code path? E.g. fatal error got

[jira] [Commented] (TEZ-2442) Support DFS based shuffle in addition to HTTP shuffle

2016-03-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208891#comment-15208891 ] Bikas Saha commented on TEZ-2442: - We typically use fs instead of dfs and DistributedFileSystem is actually

[jira] [Commented] (TEZ-2442) Support DFS based shuffle in addition to HTTP shuffle

2016-03-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1520#comment-1520 ] Bikas Saha commented on TEZ-2442: - IIRC, this is the same for both kinds of shuffle. Because consumers can

[jira] [Commented] (TEZ-2442) Support DFS based shuffle in addition to HTTP shuffle

2016-03-22 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207668#comment-15207668 ] Bikas Saha commented on TEZ-2442: - It's important to keep in mind that for significant perf gains, final

[jira] [Comment Edited] (TEZ-2442) Support DFS based shuffle in addition to HTTP shuffle

2016-03-22 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207647#comment-15207647 ] Bikas Saha edited comment on TEZ-2442 at 3/23/16 12:59 AM: --- Should this config

[jira] [Commented] (TEZ-3181) History parser : Handle invalid/unsupported history event types gracefully

2016-03-21 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205820#comment-15205820 ] Bikas Saha commented on TEZ-3181: - I understand. But after that when this incomplete data is passed to 0.8

[jira] [Commented] (TEZ-3181) History parser : Handle invalid/unsupported history event types gracefully

2016-03-21 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205794#comment-15205794 ] Bikas Saha commented on TEZ-3181: - Do we need this? Could we use 0.7 parser for 0.7 jobs and 0.8 parser for

[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation

2016-03-19 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200619#comment-15200619 ] Bikas Saha commented on TEZ-3168: - For all of the problems with queue capacity, IMO cluster capacity is a

[jira] [Created] (TEZ-3164) Surface error histograms from the AM

2016-03-14 Thread Bikas Saha (JIRA)
Bikas Saha created TEZ-3164: --- Summary: Surface error histograms from the AM Key: TEZ-3164 URL: https://issues.apache.org/jira/browse/TEZ-3164 Project: Apache Tez Issue Type: Improvement

[jira] [Commented] (TEZ-3085) In session mode, the credentials passed via the Tez client constructor is not available to all the tasks

2016-03-04 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181432#comment-15181432 ] Bikas Saha commented on TEZ-3085: - Yes. And it looks like it's already mentioned in the first comment of this

[jira] [Commented] (TEZ-3085) In session mode, the credentials passed via the Tez client constructor is not available to all the tasks

2016-03-04 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181338#comment-15181338 ] Bikas Saha commented on TEZ-3085: - IIRC, didn't we recently start passing AM credentials to the DAG? > In

[jira] [Commented] (TEZ-1210) TezClientUtils.localizeDagPlanAsText() needs to be fixed for session mode

2016-03-04 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180180#comment-15180180 ] Bikas Saha commented on TEZ-1210: - The DAGPlan is downloaded as a local resource to be used to run the DAG.

[jira] [Commented] (TEZ-3149) Tez-tools: Add username in DagInfo

2016-02-29 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172648#comment-15172648 ] Bikas Saha commented on TEZ-3149: - lgtm . backport to 0.7 would be good. thanks! > Tez-tools: Add username

[jira] [Commented] (TEZ-3014) OOM during Shuffle in JDK 8

2016-02-28 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171245#comment-15171245 ] Bikas Saha commented on TEZ-3014: - [~jeagles] [~jlowe] Is this still an issue? OOM + JDK 8. If not, then we

[jira] [Commented] (TEZ-2580) Remove VertexManagerPlugin#setVertexParallelism with VertexManagerPlugin#reconfigureVertex

2016-02-28 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171239#comment-15171239 ] Bikas Saha commented on TEZ-2580: - We can only change this if dependent projects like Hive stop using it or

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160058#comment-15160058 ] Bikas Saha commented on TEZ-3124: - So in this case task needed event to start and so it hung. If

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160059#comment-15160059 ] Bikas Saha commented on TEZ-3124: - lgtm. +1. Thanks! > Running task hangs due to missing event to

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159946#comment-15159946 ] Bikas Saha commented on TEZ-3124: - Then the fix should be restricted to not logging VertexInitializedEvent

[jira] [Commented] (TEZ-3102) Fetch failure of a speculated task causes job hang

2016-02-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159835#comment-15159835 ] Bikas Saha commented on TEZ-3102: - +1. I think testTaskSucceedAndRetroActiveFailure() should be covering

[jira] [Comment Edited] (TEZ-3102) Fetch failure of a speculated task causes job hang

2016-02-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159835#comment-15159835 ] Bikas Saha edited comment on TEZ-3102 at 2/23/16 11:09 PM: --- +1. I think

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159548#comment-15159548 ] Bikas Saha commented on TEZ-3124: - Let's say shouldSkipInit() is false because VertexInitializedEvent !=null

[jira] [Commented] (TEZ-2962) Use per partition stats in shuffle vertex manager auto parallelism

2016-02-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159414#comment-15159414 ] Bikas Saha commented on TEZ-2962: - The downside of partition stats is that the values are approximate in

[jira] [Commented] (TEZ-3126) Log reason for not reducing parallelism

2016-02-22 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158125#comment-15158125 ] Bikas Saha commented on TEZ-3126: - lgtm > Log reason for not reducing parallelism >

[jira] [Commented] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner

2016-02-22 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157572#comment-15157572 ] Bikas Saha commented on TEZ-3131: - Sure. Please go ahead. +1. > Support a way to override test_root_dir for

[jira] [Commented] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner

2016-02-20 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155862#comment-15155862 ] Bikas Saha commented on TEZ-3131: - lgtm overall. The string value of the config name is atypical of config

[jira] [Commented] (TEZ-3126) Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than half.

2016-02-19 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155030#comment-15155030 ] Bikas Saha commented on TEZ-3126: - Sure. When we have per partition sizes then we can remove these

[jira] [Commented] (TEZ-3126) Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than half.

2016-02-18 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15153493#comment-15153493 ] Bikas Saha commented on TEZ-3126: - I am sorry I might not have understood your comments fully. Are you
