[jira] [Created] (TEZ-2119) Counter for launched containers
Rohini Palaniswamy created TEZ-2119:
---
Summary: Counter for launched containers
Key: TEZ-2119
URL: https://issues.apache.org/jira/browse/TEZ-2119
Project: Apache Tez
Issue Type: New Feature
Reporter: Rohini Palaniswamy

org.apache.tez.common.counters.DAGCounter
NUM_SUCCEEDED_TASKS=32976
TOTAL_LAUNCHED_TASKS=32976
OTHER_LOCAL_TASKS=2
DATA_LOCAL_TASKS=9147
RACK_LOCAL_TASKS=23761

It would be very nice to have TOTAL_LAUNCHED_CONTAINERS counter added to this. The difference between TOTAL_LAUNCHED_CONTAINERS and TOTAL_LAUNCHED_TASKS should make it easy to see how much container reuse is happening. It is very hard to find out now.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
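The counter arithmetic being proposed is simple enough to sketch. A minimal illustration of how the requested TOTAL_LAUNCHED_CONTAINERS counter could be consumed once it exists; the class name and the container count of 76 are hypothetical, only the task counts come from the counters quoted above:

```java
// Hypothetical sketch: estimate container reuse from the proposed counter.
// reuseFactor = tasks executed per launched container; a value near 1.0
// means almost no reuse, large values mean heavy reuse.
public class ContainerReuseEstimate {
    static double reuseFactor(long launchedTasks, long launchedContainers) {
        if (launchedContainers == 0) {
            return 0.0; // avoid division by zero before any container launches
        }
        return (double) launchedTasks / launchedContainers;
    }

    public static void main(String[] args) {
        long totalLaunchedTasks = 32976;   // TOTAL_LAUNCHED_TASKS from the quoted counters
        long totalLaunchedContainers = 76; // hypothetical TOTAL_LAUNCHED_CONTAINERS value
        System.out.printf("%.1f tasks per container%n",
                reuseFactor(totalLaunchedTasks, totalLaunchedContainers));
    }
}
```

With the new counter this ratio could be computed directly from job stats mined into hive tables, without reading AM logs.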
[jira] [Updated] (TEZ-2119) Counter for launched containers
[ https://issues.apache.org/jira/browse/TEZ-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2119: Issue Type: Improvement (was: New Feature) > Counter for launched containers > --- > > Key: TEZ-2119 > URL: https://issues.apache.org/jira/browse/TEZ-2119 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy > > org.apache.tez.common.counters.DAGCounter > NUM_SUCCEEDED_TASKS=32976 > TOTAL_LAUNCHED_TASKS=32976 > OTHER_LOCAL_TASKS=2 > DATA_LOCAL_TASKS=9147 > RACK_LOCAL_TASKS=23761 > It would be very nice to have TOTAL_LAUNCHED_CONTAINERS counter added to > this. The difference between TOTAL_LAUNCHED_CONTAINERS and > TOTAL_LAUNCHED_TASKS should make it easy to see how much container reuse is > happening. It is very hard to find out now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2119) Counter for launched containers
[ https://issues.apache.org/jira/browse/TEZ-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326802#comment-14326802 ] Rohini Palaniswamy commented on TEZ-2119: - [~sseth], Getting older :(. You were right. https://issues.apache.org/jira/browse/TEZ-987 is the one. Probably use this one for counters and use the other one to implement APIs? I was recently running a pig script on a very small queue which can run only 76 containers at a time. I was hoping it would be the same 76 containers reused over and over for the 33K tasks, but it was launching new containers often. I am wondering if it was because of data locality. Did not get to reading the AM logs yet as the size is ~350M and I was feeling too lazy to dig in. Is there something else that can be added for this? Swimlanes may be useful to get some idea on container reuse. But I am thinking more in terms of being able to mine this later with job stats populated in hive tables. > Counter for launched containers > --- > > Key: TEZ-2119 > URL: https://issues.apache.org/jira/browse/TEZ-2119 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy > > org.apache.tez.common.counters.DAGCounter > NUM_SUCCEEDED_TASKS=32976 > TOTAL_LAUNCHED_TASKS=32976 > OTHER_LOCAL_TASKS=2 > DATA_LOCAL_TASKS=9147 > RACK_LOCAL_TASKS=23761 > It would be very nice to have TOTAL_LAUNCHED_CONTAINERS counter added to > this. The difference between TOTAL_LAUNCHED_CONTAINERS and > TOTAL_LAUNCHED_TASKS should make it easy to see how much container reuse is > happening. It is very hard to find out now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2144) Compressing user payload
Rohini Palaniswamy created TEZ-2144:
---
Summary: Compressing user payload
Key: TEZ-2144
URL: https://issues.apache.org/jira/browse/TEZ-2144
Project: Apache Tez
Issue Type: Improvement
Reporter: Rohini Palaniswamy

Pig sets the input split information in user payload and when running against a table with 10s of 1000s of partitions, DAG submission fails with:

java.io.IOException: Requested data length 305844060 is longer than maximum configured RPC length 67108864

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2144) Compressing user payload
[ https://issues.apache.org/jira/browse/TEZ-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337288#comment-14337288 ] Rohini Palaniswamy commented on TEZ-2144: -

There are 3 ways to pass split information in Tez - https://issues.apache.org/jira/browse/PIG-3564?focusedCommentId=13816848&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13816848

Pig sets it on the DAG submission payload to the AM. Calculating splits in the AM is the best option, but the problem is that with input formats like HCat and HBase, the delegation token is fetched with getInputSplits().

{code}
tezOp.getLoaderInfo().setInputSplitInfo(MRInputHelpers.generateInputSplitsToMem(conf, false, 0));
..
..
vertex.setLocationHint(VertexLocationHint.create(
    tezOp.getLoaderInfo().getInputSplitInfo().getTaskLocationHints()));
vertex.addDataSource(ld.getOperatorKey().toString(),
    DataSourceDescriptor.create(
        InputDescriptor.create(MRInput.class.getName())
            .setUserPayload(UserPayload.create(
                MRRuntimeProtos.MRInputUserPayloadProto.newBuilder()
                    .setConfigurationBytes(TezUtils.createByteStringFromConf(payloadConf))
                    .setSplits(tezOp.getLoaderInfo().getInputSplitInfo().getSplitsProto())
                    .build().toByteString().asReadOnlyByteBuffer()))
            .setHistoryText(convertToHistoryText("", payloadConf)),
        InputInitializerDescriptor.create(MRInputSplitDistributor.class.getName()),
        dag.getCredentials()));
{code}

Pig on Tez HCatLoader jobs on one of the biggest hcat tables with a huge number of partitions hit the RPC limit issue. Running on HDFS data which has even triple the number of splits (100K+ splits and tasks) does not hit this issue.
HCatBaseInputFormat.java:

{code}
// Call getSplits on the InputFormat, create an
// HCatSplit for each underlying split.
// NumSplits is 0 for our purposes.
org.apache.hadoop.mapred.InputSplit[] baseSplits = inputFormat.getSplits(jobConf, 0);
for (org.apache.hadoop.mapred.InputSplit split : baseSplits) {
    splits.add(new HCatSplit(partitionInfo, split, allCols));
}
{code}

Each HCatSplit duplicates a lot of information - the partition schema and the table schema.

On the Pig side, we are trying to work around the problem by using the alternative of writing the split files to HDFS for big splits like this. Increasing the RPC size limit will not help, as it is going to 300MB just for a day's worth of data; weekly and monthly data scanning will make it run into GBs.

Had a discussion with [~sseth] to see if it makes sense for Tez to compress the input split information. It would help a lot in cases like this. Even for HDFS paths, with the same base path being repeated, compression would help. He asked me to create this jira to figure out whether the compression should be in Tez or Pig, and to check whether there is any gain in compressing each proto separately versus compressing the entire payload. > Compressing user payload > > > Key: TEZ-2144 > URL: https://issues.apache.org/jira/browse/TEZ-2144 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy > > Pig sets the input split information in user payload and when running against > a table with 10s of 1000s of partitions, DAG submission fails with > java.io.IOException: Requested data length 305844060 is longer than maximum > configured RPC length 67108864 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
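For reference, the kind of compression being discussed can be sketched with plain java.util.zip. This is only an illustration of the idea, not the Tez or Pig implementation; the class and method names are made up:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch: deflate the serialized split payload before it is set
// as the user payload, and inflate it again on the receiving side.
public class PayloadCompression {
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos =
                 new DeflaterOutputStream(bos, new Deflater(Deflater.BEST_COMPRESSION))) {
            dos.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (InflaterInputStream iis =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = iis.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```

Repeated base paths and duplicated partition/table schemas are exactly the kind of redundant content deflate handles well, which is part of the question in this jira: whether compressing the whole payload beats compressing each proto separately.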
[jira] [Commented] (TEZ-2144) Compressing user payload
[ https://issues.apache.org/jira/browse/TEZ-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337295#comment-14337295 ] Rohini Palaniswamy commented on TEZ-2144: - [~sseth], When you go with the option of writing input splits to HDFS, it does not cause memory pressure on the AM like the other two options, right? > Compressing user payload > > > Key: TEZ-2144 > URL: https://issues.apache.org/jira/browse/TEZ-2144 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy > > Pig sets the input split information in user payload and when running against > a table with 10s of 1000s of partitions, DAG submission fails with > java.io.IOException: Requested data length 305844060 is longer than maximum > configured RPC length 67108864 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2144) Compressing user payload
[ https://issues.apache.org/jira/browse/TEZ-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337428#comment-14337428 ] Rohini Palaniswamy commented on TEZ-2144: -

bq. However, a question on the payload. To confirm, is the input split information in the payload of only the required descriptor, i.e. the input of the map stage, and not replicated to the user payloads of all descriptors?

No. It is set only for the vertex that does the particular LOAD.

bq. For such cases, can you try creating the TezClient in non-session mode by changing its input configuration. The DAG will be sent as a local resource in that case and things should work.

That would be a bigger change to the Pig code, and it is also not worth identifying what should be in session mode versus non-session mode and switching between both. > Compressing user payload > > > Key: TEZ-2144 > URL: https://issues.apache.org/jira/browse/TEZ-2144 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy > > Pig sets the input split information in user payload and when running against > a table with 10s of 1000s of partitions, DAG submission fails with > java.io.IOException: Requested data length 305844060 is longer than maximum > configured RPC length 67108864 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2153) Swimlanes UI does not work well with thousands of tasks
Rohini Palaniswamy created TEZ-2153:
---
Summary: Swimlanes UI does not work well with thousands of tasks
Key: TEZ-2153
URL: https://issues.apache.org/jira/browse/TEZ-2153
Project: Apache Tez
Issue Type: Bug
Reporter: Rohini Palaniswamy

Ran a simple pig script with LOAD and STORE which launched 33K tasks in the vertex. The swimlanes UI does not show all the tasks even when I edited it with Firebug and increased the height and width. Saved the HTML and saw that there were only 122 class="tick" elements.
Another issue was that the container names were not sorted, which made reading difficult.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2153) Swimlanes UI does not work well with thousands of tasks
[ https://issues.apache.org/jira/browse/TEZ-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2153: Component/s: UI > Swimlanes UI does not work well with thousands of tasks > --- > > Key: TEZ-2153 > URL: https://issues.apache.org/jira/browse/TEZ-2153 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Rohini Palaniswamy > > Ran a simple pig script with LOAD and STORE which launched 33K tasks in the > vertex. The swimlanes UI does not show all the tasks even when I edited with > the firebug and increased the height and width. Saved the html and saw that > there were only 122 class="tick" elements. >Another issue was that the container names were not sorted and it made > reading difficult. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2144) Compressing MRInput Split Distributor payload
[ https://issues.apache.org/jira/browse/TEZ-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339396#comment-14339396 ] Rohini Palaniswamy commented on TEZ-2144: -

bq. Is there a reason to not run everything in non-session mode ?

Currently the code does not differentiate between grunt mode and script mode. Also, a single script can launch more than one DAG based on things like merge join, embedded Pig in parallel mode, or the presence of exec, shell or fs commands. Except for the parallel mode, the others will be serial. Launching different applications for them would then be the same as mapreduce: putting them back in the queue and waiting for resources to be available to be launched. > Compressing MRInput Split Distributor payload > - > > Key: TEZ-2144 > URL: https://issues.apache.org/jira/browse/TEZ-2144 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy > > Pig sets the input split information in user payload and when running against > a table with 10s of 1000s of partitions, DAG submission fails with > java.io.IOException: Requested data length 305844060 is longer than maximum > configured RPC length 67108864 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2153) Swimlanes UI does not work well with thousands of tasks
[ https://issues.apache.org/jira/browse/TEZ-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345047#comment-14345047 ] Rohini Palaniswamy commented on TEZ-2153: - Missed seeing TEZ-1652, as I was looking at tickets open for the UI component and TEZ-1652 did not have the component set. Should we close this as a dupe of the other, or leave it open to handle the container name sorting? > Swimlanes UI does not work well with thousands of tasks > --- > > Key: TEZ-2153 > URL: https://issues.apache.org/jira/browse/TEZ-2153 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Rohini Palaniswamy > > Ran a simple pig script with LOAD and STORE which launched 33K tasks in the > vertex. The swimlanes UI does not show all the tasks even when I edited with > the firebug and increased the height and width. Saved the html and saw that > there were only 122 class="tick" elements. >Another issue was that the container names were not sorted and it made > reading difficult. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-776) Reduce AM mem usage caused by storing TezEvents
[ https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354316#comment-14354316 ] Rohini Palaniswamy commented on TEZ-776: -

Ran the https://issues.apache.org/jira/secure/attachment/12703385/TEZ-776.ondemand.5.patch on a pig script with the below plan, which required a 16G AM before. It succeeded without OOM in a 4G container.

v1 93378 (LOAD, FILTER) -> v2 1000 (GROUP)
v3 4431 (LOAD, FILTER) -> v4 1000 (GROUP)
v2, v4 -> v5 1000 (JOIN) -> v6 1000 (GROUP)

org.apache.tez.common.counters.DAGCounter
NUM_SUCCEEDED_TASKS=101809
TOTAL_LAUNCHED_TASKS=101809
OTHER_LOCAL_TASKS=44
DATA_LOCAL_TASKS=75055
RACK_LOCAL_TASKS=22710

I was monitoring the top usage on the AM a couple of times. It did not even use the 4G, but I was only checking before and around the time 30K tasks had completed; did not check after that. Haven't gotten the time to go through the code in the patch or the long conversation history in the jira, but it looks good from the test run perspective. > Reduce AM mem usage caused by storing TezEvents > --- > > Key: TEZ-776 > URL: https://issues.apache.org/jira/browse/TEZ-776 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Bikas Saha > Attachments: TEZ-776.ondemand.1.patch, TEZ-776.ondemand.2.patch, > TEZ-776.ondemand.3.patch, TEZ-776.ondemand.4.patch, TEZ-776.ondemand.5.patch, > TEZ-776.ondemand.patch, With_Patch_AM_hotspots.png, > With_Patch_AM_profile.png, Without_patch_AM_CPU_Usage.png, > events-problem-solutions.txt, with_patch_jmc_output_of_AM.png, > without_patch_jmc_output_of_AM.png > > > This is open ended at the moment. > A fair chunk of the AM heap is taken up by TezEvents (specifically > DataMovementEvents - 64 bytes per event). > Depending on the connection pattern - this puts limits on the number of tasks > that can be processed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2192) Relocalization does not check for source
Rohini Palaniswamy created TEZ-2192:
---
Summary: Relocalization does not check for source
Key: TEZ-2192
URL: https://issues.apache.org/jira/browse/TEZ-2192
Project: Apache Tez
Issue Type: Bug
Reporter: Rohini Palaniswamy

PIG-4443 spills the input splits to disk if serialized split size is greater than some threshold. It faces issues with relocalization when more than one vertex has job.split file. If a job.split file is already there on container reuse, it is reused causing wrong data to be read.
Either need a way to turn off relocalization or check the source+timestamp and redownload the file during relocalization.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2192) Relocalization does not check for source
[ https://issues.apache.org/jira/browse/TEZ-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355976#comment-14355976 ] Rohini Palaniswamy commented on TEZ-2192: - More details on the sequence of events leading to the problem: Three vertices v1, v2, v3. v1 does not have a job.split file; v2 and v3 do. When the container of v1 is reused for v2, the job.split file is downloaded since v1 did not have it. When v3 reuses the container again (according to the initial YARN localization, the container does not have job.split and there is no conflict), the relocalization code sees that the job.split file is already present locally and reuses it. If v1 had a job.split file, there would be no issue, as the other vertices would see a conflict and not reuse the container. The problem arises when a conflicting file is downloaded as part of relocalization and further relocalizations handle files with the same names. > Relocalization does not check for source > > > Key: TEZ-2192 > URL: https://issues.apache.org/jira/browse/TEZ-2192 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > > PIG-4443 spills the input splits to disk if serialized split size is greater > than some threshold. It faces issues with relocalization when more than one > vertex has job.split file. If a job.split file is already there on container > reuse, it is reused causing wrong data to be read. > Either need a way to turn off relocalization or check the source+timestamp > and redownload the file during relocalization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
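The fix suggested in the issue description (check source+timestamp instead of matching on file name alone) could look roughly like this. The types and method names are illustrative, not the actual Tez localization code:

```java
import java.util.Objects;

// Hypothetical sketch: before reusing an already-localized file, compare its
// source URI and timestamp against the requested resource. A name-only match
// (the current behavior described in the bug) would wrongly reuse v2's
// job.split for v3.
public class ResourceCheck {
    static final class LocalizedResource {
        final String sourceUri;
        final long timestamp;
        LocalizedResource(String sourceUri, long timestamp) {
            this.sourceUri = sourceUri;
            this.timestamp = timestamp;
        }
    }

    // Reuse only when both source and timestamp match; otherwise re-download.
    static boolean canReuse(LocalizedResource existing, LocalizedResource requested) {
        return Objects.equals(existing.sourceUri, requested.sourceUri)
            && existing.timestamp == requested.timestamp;
    }
}
```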
[jira] [Commented] (TEZ-2192) Relocalization does not check for source
[ https://issues.apache.org/jira/browse/TEZ-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356259#comment-14356259 ] Rohini Palaniswamy commented on TEZ-2192: - Thanks [~hitesh] for increasing the priority. [~sseth] suggested a hacky workaround, which of course he did not like recommending. But for the short term we are going with that workaround in Pig to unblock reading from big tables with HCatLoader, as there is no other alternative without having this fixed in Tez. The hack is to create a job.split file for all vertices if we create one for any, so that there is a conflict initially itself and containers are not reused across vertices. > Relocalization does not check for source > > > Key: TEZ-2192 > URL: https://issues.apache.org/jira/browse/TEZ-2192 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.6.0, 0.5.2 >Reporter: Rohini Palaniswamy >Priority: Blocker > > PIG-4443 spills the input splits to disk if serialized split size is greater > than some threshold. It faces issues with relocalization when more than one > vertex has job.split file. If a job.split file is already there on container > reuse, it is reused causing wrong data to be read. > Either need a way to turn off relocalization or check the source+timestamp > and redownload the file during relocalization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
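The workaround can be sketched abstractly: if any vertex carries a job.split local resource, give every vertex one, so the initial localization already sees a conflict and containers are never reused across vertices. A plain-Java illustration under that assumption; the real change would manipulate the YARN local resource maps of the Tez vertices, and the class and method names here are made up:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the short-term Pig-side hack described above.
public class JobSplitHack {
    // vertexResources: vertex name -> set of local resource names for that vertex.
    // If any vertex has the named resource, add a (dummy) resource with the
    // same name to every vertex so localization conflicts up front.
    static void ensureConflict(Map<String, Set<String>> vertexResources,
                               String resourceName) {
        boolean anyHasIt = vertexResources.values().stream()
                .anyMatch(r -> r.contains(resourceName));
        if (anyHasIt) {
            vertexResources.values().forEach(r -> r.add(resourceName));
        }
    }
}
```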
[jira] [Commented] (TEZ-1870) Time displayed in the UI is in GMT
[ https://issues.apache.org/jira/browse/TEZ-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370019#comment-14370019 ] Rohini Palaniswamy commented on TEZ-1870: - The RM UI displays times in GMT while the Tez UI displays them in PST. This gets very confusing. Was there a reason we changed it to local time? We need to make the timezone configurable in the UI if local time is required by some. > Time displayed in the UI is in GMT > -- > > Key: TEZ-1870 > URL: https://issues.apache.org/jira/browse/TEZ-1870 > Project: Apache Tez > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sreenath Somarajapuram > Fix For: 0.6.0 > > Attachments: TEZ-1870.1.patch, TEZ-1870.2.patch, TEZ-1870.3.patch > > > Should this be local time ? > Otherwise, mentioning the timezone on the UI will be useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false
[ https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371996#comment-14371996 ] Rohini Palaniswamy commented on TEZ-2205: - Wouldn't it be better if ATSHistoryLogger checked the value of the timeline setting and did nothing if it is false? > Tez still tries to post to ATS when yarn.timeline-service.enabled=false > --- > > Key: TEZ-2205 > URL: https://issues.apache.org/jira/browse/TEZ-2205 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: 0.6.1 >Reporter: Chang Li >Assignee: Chang Li > Attachments: TEZ-2205.wip.patch > > > when set yarn.timeline-service.enabled=false, Tez still tries posting to ATS, > but hits error as token is not found. Does not fail the job because of the > fix to not fail job when there is error posting to ATS. But it should not be > trying to post to ATS in the first place. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false
[ https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372005#comment-14372005 ] Rohini Palaniswamy commented on TEZ-2205: -

{code}
public void handle(DAGHistoryEvent event) {
  eventQueue.add(event);
}
{code}

It would be a simple check to not add to the queue if timeline service is disabled. > Tez still tries to post to ATS when yarn.timeline-service.enabled=false > --- > > Key: TEZ-2205 > URL: https://issues.apache.org/jira/browse/TEZ-2205 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: 0.6.1 >Reporter: Chang Li >Assignee: Chang Li > Attachments: TEZ-2205.wip.patch > > > when set yarn.timeline-service.enabled=false, Tez still tries posting to ATS, > but hits error as token is not found. Does not fail the job because of the > fix to not fail job when there is error posting to ATS. But it should not be > trying to post to ATS in the first place. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
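The suggested guard, sketched against a simplified handler shaped like the snippet in the comment above. The class name, event type, and the boolean flag are illustrative; the real check would read the yarn.timeline-service.enabled configuration:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: drop history events before they reach the ATS queue
// when the timeline service is disabled, instead of failing later on a
// missing token.
public class GuardedHistoryLogger {
    private final Queue<Object> eventQueue = new ConcurrentLinkedQueue<>();
    private final boolean timelineEnabled;

    GuardedHistoryLogger(boolean timelineEnabled) {
        this.timelineEnabled = timelineEnabled;
    }

    public void handle(Object event) {
        if (!timelineEnabled) {
            return; // timeline service disabled: do not queue for ATS
        }
        eventQueue.add(event);
    }

    int queued() {
        return eventQueue.size();
    }
}
```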
[jira] [Commented] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false
[ https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377562#comment-14377562 ] Rohini Palaniswamy commented on TEZ-2205: - I prefer 3). For jobs launched through Oozie, it is easy to turn off ATS via an Oozie server-side setting, and this might be required now and then in the near future considering the issues we are facing with ATS. Since tez-site.xml for those jobs comes from HDFS, it is not easy to change the Tez ATS logger (replacing the file on HDFS is more manual and can cause running jobs to fail as the LocalResource timestamp has changed), so I do not like 1). Also, having to change multiple settings to turn off something is cumbersome. 2) is what is happening now, but the problem I see is that it impacts performance, as time is wasted trying to connect to ATS and failing due to lack of authentication. > Tez still tries to post to ATS when yarn.timeline-service.enabled=false > --- > > Key: TEZ-2205 > URL: https://issues.apache.org/jira/browse/TEZ-2205 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: 0.6.1 >Reporter: Chang Li >Assignee: Chang Li > Attachments: TEZ-2205.wip.patch > > > when set yarn.timeline-service.enabled=false, Tez still tries posting to ATS, > but hits error as token is not found. Does not fail the job because of the > fix to not fail job when there is error posting to ATS. But it should not be > trying to post to ATS in the first place. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-986) Make conf set on DAG and vertex available in jobhistory
[ https://issues.apache.org/jira/browse/TEZ-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377568#comment-14377568 ] Rohini Palaniswamy commented on TEZ-986: -

bq. viewable in Tez UI after the job completes. This is very essential for debugging jobs.

Just wanted to mention that we need this for Pig, and it would be good to have it in one of the upcoming releases. While debugging some of the recent issues, I realized that I don't have access to the pig script if the user ran it from a gateway (for Oozie I can get it from the launcher job), because the pig.script setting is set only on the DAG config, along with a few other settings useful for debugging, like the pig version. For now, I have other workarounds to get this info or resort to asking the user. The vertex config also has some important debugging info, like what feature is being run (group by, etc.), input/output dirs, etc. Even for this we can manage in the short term and figure these out with the explain output of the script. But life would be easier if those were shown in the UI. > Make conf set on DAG and vertex available in jobhistory > --- > > Key: TEZ-986 > URL: https://issues.apache.org/jira/browse/TEZ-986 > Project: Apache Tez > Issue Type: Sub-task > Components: UI >Reporter: Rohini Palaniswamy >Priority: Blocker > > Would like to have the conf set on DAG and Vertex > 1) viewable in Tez UI after the job completes. This is very essential for > debugging jobs. > 2) We have processes, that parse jobconf.xml from job history (hdfs) and > load them into hive tables for analysis. Would like to have Tez also make all > the configuration (byte array) available in job history so that we can > similarly parse them. 1) mandates that you store it in hdfs. 2) is just to > say make the format stored as a contract others can rely on for parsing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false
[ https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378245#comment-14378245 ] Rohini Palaniswamy commented on TEZ-2205: - OOZIE-2133 is the one that handles getting delegation tokens for ATS for tez jobs. If oozie.action.launcher.yarn.timeline-service.enabled is set to true in the Oozie server configuration, it adds yarn.timeline-service.enabled=true to the conf of the JobClient that submits the launcher job, if tez-site.xml is part of the distributed cache. JobClient (YARN) fetches the ATS delegation token before the job is submitted if that setting is set, and adds it to the job. > Tez still tries to post to ATS when yarn.timeline-service.enabled=false > --- > > Key: TEZ-2205 > URL: https://issues.apache.org/jira/browse/TEZ-2205 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: 0.6.1 >Reporter: Chang Li >Assignee: Chang Li > Attachments: TEZ-2205.wip.patch > > > when set yarn.timeline-service.enabled=false, Tez still tries posting to ATS, > but hits error as token is not found. Does not fail the job because of the > fix to not fail job when there is error posting to ATS. But it should not be > trying to post to ATS in the first place. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2231) Create project by-laws
[ https://issues.apache.org/jira/browse/TEZ-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381652#comment-14381652 ] Rohini Palaniswamy commented on TEZ-2231: - +1 > Create project by-laws > -- > > Key: TEZ-2231 > URL: https://issues.apache.org/jira/browse/TEZ-2231 > Project: Apache Tez > Issue Type: Task >Reporter: Hitesh Shah >Assignee: Hitesh Shah > Attachments: by-laws.2.patch, by-laws.patch > > > Define the Project by-laws. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2252) Tez UI Graphical view is wrong in some cases
[ https://issues.apache.org/jira/browse/TEZ-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2252: Component/s: UI > Tez UI Graphical view is wrong in some cases > > > Key: TEZ-2252 > URL: https://issues.apache.org/jira/browse/TEZ-2252 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Rohini Palaniswamy > > The information in the .dot file is correct and script runs fine. But the > Tez UI Graphical view shows that output is being written from multiple > vertices into one sink, while each of them is writing to their own sink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2252) Tez UI Graphical view is wrong in some cases
Rohini Palaniswamy created TEZ-2252:
---
Summary: Tez UI Graphical view is wrong in some cases
Key: TEZ-2252
URL: https://issues.apache.org/jira/browse/TEZ-2252
Project: Apache Tez
Issue Type: Bug
Reporter: Rohini Palaniswamy

The information in the .dot file is correct and script runs fine. But the Tez UI Graphical view shows that output is being written from multiple vertices into one sink, while each of them is writing to their own sink.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2252) Tez UI Graphical view is wrong in some cases
[ https://issues.apache.org/jira/browse/TEZ-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2252:

Attachment: Graph2.png
            Graph2.dot
            Graph1.png
            Graph1.dot

In the attached graphs:
- Graph1 has 10 outputs, but Tez UI only shows 6, with 5 vertices all going to scope-133. Graph1.dot correctly shows that they all go to different MROutputs though.
- Graph2 has 3 outputs, but Tez UI only shows 1, with all 3 leaf vertices going to scope-228.

> Tez UI Graphical view is wrong in some cases > > > Key: TEZ-2252 > URL: https://issues.apache.org/jira/browse/TEZ-2252 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Rohini Palaniswamy > Attachments: Graph1.dot, Graph1.png, Graph2.dot, Graph2.png > > > The information in the .dot file is correct and script runs fine. But the > Tez UI Graphical view shows that output is being written from multiple > vertices into one sink, while each of them is writing to their own sink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2252) Tez UI Graphical view looks wrong in some cases
[ https://issues.apache.org/jira/browse/TEZ-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2252: Description: The information in the .dot file is correct and script runs fine. But the Tez UI Graphical view shows that output is being written from multiple vertices into one sink. Actually it is writing to multiple sinks (checking the html elements in Firebug), but the sink circles all overlap one another with exact coordinates and the tool tip only shows for the top one. (was: The information in the .dot file is correct and script runs fine. But the Tez UI Graphical view shows that output is being written from multiple vertices into one sink, while each of them is writing to their own sink.) Summary: Tez UI Graphical view looks wrong in some cases (was: Tez UI Graphical view is wrong in some cases) > Tez UI Graphical view looks wrong in some cases > --- > > Key: TEZ-2252 > URL: https://issues.apache.org/jira/browse/TEZ-2252 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Rohini Palaniswamy > Attachments: Graph1.dot, Graph1.png, Graph2.dot, Graph2.png > > > The information in the .dot file is correct and script runs fine. But the > Tez UI Graphical view shows that output is being written from multiple > vertices into one sink. Actually it is writing to multiple sinks (checking > the html elements in Firebug), but the sink circles all overlap one another > with exact coordinates and the tool tip only shows for the top one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2252) Tez UI Graphical view looks wrong in some cases
[ https://issues.apache.org/jira/browse/TEZ-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387593#comment-14387593 ] Rohini Palaniswamy edited comment on TEZ-2252 at 3/31/15 12:37 AM: --- In the attached graphs - Graph1 has 10 outputs (Graph1.dot), but Tez UI only shows 6 with 5 vertices all going to scope-133. - Graph 2 has 3 outputs(Graph 2.dot), but Tez UI only shows 1 with all 3 leaf vertices going to scope-228 HTML in case of Graph 2 with overlapped node outputs. {code} sc.. sc.. sc.. {code} was (Author: rohini): In the attached graphs - Graph1 has 10 outputs, but Tez UI only shows 6 with 5 vertices all going to scope-133. Graph1.dot correctly shows that they all go to different MROutput though. - Graph 2 has 3 outputs, but Tez UI only shows 1 with all 3 leaf vertices going to scope-228 > Tez UI Graphical view looks wrong in some cases > --- > > Key: TEZ-2252 > URL: https://issues.apache.org/jira/browse/TEZ-2252 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Rohini Palaniswamy > Attachments: Graph1.dot, Graph1.png, Graph2.dot, Graph2.png > > > The information in the .dot file is correct and script runs fine. But the > Tez UI Graphical view shows that output is being written from multiple > vertices into one sink. Actually it is writing to multiple sinks (checking > the html elements in Firebug), but the sink circles all overlap one another > with exact coordinates and the tool tip only shows for the top one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes
[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393033#comment-14393033 ] Rohini Palaniswamy commented on TEZ-1190: - I encountered a couple of queries in the past two weeks which suffer performance-wise due to this. Currently we write out the data to another dummy vertex to avoid multiple edges, and this adds overhead. The common patterns are 1) People split the data, perform some foreach transformations/filter, union them and then do some operation like group by or join with other data 2) People split the data, perform some foreach transformations/filter and self join them. No union in this case. Vertex groups accept multiple edges from the same vertex. So we can optimize the multi-query planning for 1) when we know there is a vertex group. I hope we can rely on that behavior and that it does not change? > Allow multiple edges between two vertexes > - > > Key: TEZ-1190 > URL: https://issues.apache.org/jira/browse/TEZ-1190 > Project: Apache Tez > Issue Type: Bug >Reporter: Daniel Dai > > This will be helpful in some scenario. In particular example, we can merge > two small pipelines together in one pair of vertex. Note it is possible the > edge type between the two vertexes are different. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
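Pattern 1) maps onto the vertex-group API mentioned in the comment above. A non-runnable, Java-flavored pseudocode sketch of the wiring; the vertex names (splitA, splitB, groupBy) and the edgeProp/mergedInput descriptors are hypothetical placeholders for objects built elsewhere with the usual Tez descriptors:

```java
// Sketch only: splitA/splitB are the two foreach/filter branches, groupBy is
// the downstream consumer; edgeProp and mergedInput stand for the edge
// property and merged-input descriptor, constructed elsewhere.
VertexGroup union = dag.createVertexGroup("union", splitA, splitB);
dag.addEdge(GroupInputEdge.create(union, groupBy, edgeProp, mergedInput));
```

Since the group accepts one edge per member vertex, both split branches feed groupBy without the dummy union vertex described above.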
[jira] [Created] (TEZ-2278) Tez UI start/end time and duration shown are wrong for tasks
Rohini Palaniswamy created TEZ-2278: --- Summary: Tez UI start/end time and duration shown are wrong for tasks Key: TEZ-2278 URL: https://issues.apache.org/jira/browse/TEZ-2278 Project: Apache Tez Issue Type: Bug Components: UI Affects Versions: 0.6.0 Reporter: Rohini Palaniswamy Observing a lot of time discrepancies between the vertex, task and swimlane views. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes
[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14438125#comment-14438125 ] Rohini Palaniswamy commented on TEZ-1190: - With the changes in PIG-4495, we have handled both of the above scenarios in Pig itself. So we do not require this anymore for Pig. But leaving it open in case it makes life easier for Hive and Cascading. > Allow multiple edges between two vertexes > - > > Key: TEZ-1190 > URL: https://issues.apache.org/jira/browse/TEZ-1190 > Project: Apache Tez > Issue Type: Bug >Reporter: Daniel Dai > > This will be helpful in some scenario. In particular example, we can merge > two small pipelines together in one pair of vertex. Note it is possible the > edge type between the two vertexes are different. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (TEZ-390) Support a Vertex Initializer and Committer
[ https://issues.apache.org/jira/browse/TEZ-390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved TEZ-390. Resolution: Fixed > Support a Vertex Initializer and Committer > -- > > Key: TEZ-390 > URL: https://issues.apache.org/jira/browse/TEZ-390 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Rohini Palaniswamy > > Need equivalent of setupJob, commitJob and abortJob of OutputFormat. Many > LoadFunc/StoreFunc implement this. For eg: HCatStorer publishes partitions > atomically on commitJob. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2300) TezClient.stop() takes a lot of time
Rohini Palaniswamy created TEZ-2300: --- Summary: TezClient.stop() takes a lot of time Key: TEZ-2300 URL: https://issues.apache.org/jira/browse/TEZ-2300 Project: Apache Tez Issue Type: Bug Reporter: Rohini Palaniswamy Noticed this with a couple of pig scripts which were not behaving well (AM close to OOM, etc) and even with some that were running fine. Pig calls TezClient.stop() in a shutdown hook. Ctrl+C to the pig script either exits immediately or is hung. In both cases it takes a long time for the yarn application to go to the KILLED state. Many times I just end up calling yarn application -kill separately after waiting for 5 mins or more for it to get killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
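One hedged client-side mitigation for the hang described above — a sketch, not Tez or Pig code — is to bound the stop call with a deadline inside the shutdown hook, and fall back to `yarn application -kill` when the deadline expires. The helper below is plain Java; the `stopAction` passed in would wrap the real `TezClient.stop()` call:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical guard: run a stop action (e.g. TezClient.stop()) under a
// deadline so a shutdown hook cannot block indefinitely.
class BoundedStop {
    static boolean stopWithin(Runnable stopAction, long timeout, TimeUnit unit) {
        ExecutorService ex = Executors.newSingleThreadExecutor();
        try {
            Future<?> f = ex.submit(stopAction);
            f.get(timeout, unit);   // block, but only up to the deadline
            return true;            // stop finished in time
        } catch (TimeoutException te) {
            return false;           // deadline passed: caller should force-kill
        } catch (InterruptedException | ExecutionException e) {
            return false;           // stop failed or the hook was interrupted
        } finally {
            ex.shutdownNow();       // interrupt a still-running stop action
        }
    }
}
```

A `false` return tells the caller the application is likely still live and needs the external kill path.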
[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488203#comment-14488203 ] Rohini Palaniswamy commented on TEZ-2300: - Recently, it was from too many job runs for issues addressed TEZ-776. I haven't kept track and the job logs are huge to dig and find out what I did to which job. Will get some with future runs. > TezClient.stop() takes a lot of time or does not work sometimes > --- > > Key: TEZ-2300 > URL: https://issues.apache.org/jira/browse/TEZ-2300 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > > Noticed this with a couple of pig scripts which were not behaving well (AM > close to OOM, etc) and even with some that were running fine. Pig calls > Tezclient.stop() in shutdown hook. Ctrl+C to the pig script either exits > immediately or is hung. In both cases it either takes a long time for the > yarn application to go to KILLED state. Many times I just end up calling yarn > application -kill separately after waiting for 5 mins or more for it to get > killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2234) Allow vertex managers to get output size per source vertex
[ https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488293#comment-14488293 ] Rohini Palaniswamy commented on TEZ-2234: - How about adding output records as well to the statistics? I can see that coming handy in future. Or at least have the APIs in place. > Allow vertex managers to get output size per source vertex > -- > > Key: TEZ-2234 > URL: https://issues.apache.org/jira/browse/TEZ-2234 > Project: Apache Tez > Issue Type: Bug >Reporter: Bikas Saha >Assignee: Bikas Saha > Attachments: TEZ-2234.1.patch, TEZ-2234.2.patch > > > Vertex managers may need per source vertex output stats to make > reconfiguration decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488203#comment-14488203 ] Rohini Palaniswamy edited comment on TEZ-2300 at 4/9/15 8:26 PM: - Recently, it was from too many job runs for issues TEZ-776 is trying to address. I haven't kept track and the job logs are huge to dig and find out what I did to which job. Will get some with future runs. was (Author: rohini): Recently, it was from too many job runs for issues addressed TEZ-776. I haven't kept track and the job logs are huge to dig and find out what I did to which job. Will get some with future runs. > TezClient.stop() takes a lot of time or does not work sometimes > --- > > Key: TEZ-2300 > URL: https://issues.apache.org/jira/browse/TEZ-2300 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > > Noticed this with a couple of pig scripts which were not behaving well (AM > close to OOM, etc) and even with some that were running fine. Pig calls > Tezclient.stop() in shutdown hook. Ctrl+C to the pig script either exits > immediately or is hung. In both cases it either takes a long time for the > yarn application to go to KILLED state. Many times I just end up calling yarn > application -kill separately after waiting for 5 mins or more for it to get > killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2119) Counter for launched containers
[ https://issues.apache.org/jira/browse/TEZ-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487925#comment-14487925 ] Rohini Palaniswamy commented on TEZ-2119: - Still need the LAUNCHED_CONTAINERS. If there are going to be containers that were launched but not used (no tasks submitted to them), then you should have both TOTAL_LAUNCHED_CONTAINERS and TOTAL_USED_CONTAINERS. Also, is it possible to add something like AVG_CONTAINER_REUSE or some better statistic to see the amount of reuse? > Counter for launched containers > --- > > Key: TEZ-2119 > URL: https://issues.apache.org/jira/browse/TEZ-2119 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Jeff Zhang > > org.apache.tez.common.counters.DAGCounter > NUM_SUCCEEDED_TASKS=32976 > TOTAL_LAUNCHED_TASKS=32976 > OTHER_LOCAL_TASKS=2 > DATA_LOCAL_TASKS=9147 > RACK_LOCAL_TASKS=23761 > It would be very nice to have TOTAL_LAUNCHED_CONTAINERS counter added to > this. The difference between TOTAL_LAUNCHED_CONTAINERS and > TOTAL_LAUNCHED_TASKS should make it easy to see how much container reuse is > happening. It is very hard to find out now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
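The AVG_CONTAINER_REUSE statistic suggested here is just the ratio of the two proposed counters. A small illustrative sketch in plain Java; the numbers in the usage note are the 32,976 launched tasks from this DAG and the 76-container queue mentioned earlier in the thread, under the hypothetical assumption of perfect reuse:

```java
// Derived reuse metric from the proposed counters.
class ContainerReuse {
    // A value near 1.0 means almost no reuse; higher means more reuse.
    static double avgReuse(long launchedTasks, long launchedContainers) {
        return (double) launchedTasks / launchedContainers;
    }
}
```

With perfect reuse, `avgReuse(32976, 76)` is roughly 433.9 tasks per container; if every task got a fresh container the value would be exactly 1.0.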
[jira] [Commented] (TEZ-776) Reduce AM mem usage caused by storing TezEvents
[ https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487990#comment-14487990 ] Rohini Palaniswamy commented on TEZ-776: jmap -histo:live output of AM from one of the scripts I am desperately trying to get to run: {code}
 num     #instances         #bytes  class name
----------------------------------------------
   1:      95865187     3067685984  org.apache.tez.runtime.api.impl.TezEvent
   2:      95346787     3051097184  org.apache.tez.runtime.api.events.DataMovementEvent
{code} 95 million events in memory is crazy. > Reduce AM mem usage caused by storing TezEvents > --- > > Key: TEZ-776 > URL: https://issues.apache.org/jira/browse/TEZ-776 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Bikas Saha > Attachments: TEZ-776.1.patch, TEZ-776.ondemand.1.patch, > TEZ-776.ondemand.2.patch, TEZ-776.ondemand.3.patch, TEZ-776.ondemand.4.patch, > TEZ-776.ondemand.5.patch, TEZ-776.ondemand.6.patch, TEZ-776.ondemand.7.patch, > TEZ-776.ondemand.patch, With_Patch_AM_hotspots.png, > With_Patch_AM_profile.png, Without_patch_AM_CPU_Usage.png, > events-problem-solutions.txt, with_patch_jmc_output_of_AM.png, > without_patch_jmc_output_of_AM.png > > > This is open ended at the moment. > A fair chunk of the AM heap is taken up by TezEvents (specifically > DataMovementEvents - 64 bytes per event). > Depending on the connection pattern - this puts limits on the number of tasks > that can be processed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
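The jmap histogram above pins down the per-event footprint, and a quick self-contained arithmetic check (plain Java, using only the numbers from the histogram) confirms it:

```java
// Shallow bytes per instance, from the jmap totals above.
class EventFootprint {
    static long bytesPerInstance(long totalBytes, long instances) {
        return totalBytes / instances;
    }
}
```

`bytesPerInstance(3067685984L, 95865187L)` and `bytesPerInstance(3051097184L, 95346787L)` both come out to 32 bytes, so each DataMovementEvent plus its TezEvent wrapper costs 64 bytes of shallow size, matching the "64 bytes per event" figure in the issue description; the two classes together account for roughly 5.7 GiB of the AM heap.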
[jira] [Commented] (TEZ-776) Reduce AM mem usage caused by storing TezEvents
[ https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487776#comment-14487776 ] Rohini Palaniswamy commented on TEZ-776: [~bikassaha]/[~sseth], Any further progress on this? Looks like we badly need this for some jobs. > Reduce AM mem usage caused by storing TezEvents > --- > > Key: TEZ-776 > URL: https://issues.apache.org/jira/browse/TEZ-776 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Bikas Saha > Attachments: TEZ-776.1.patch, TEZ-776.ondemand.1.patch, > TEZ-776.ondemand.2.patch, TEZ-776.ondemand.3.patch, TEZ-776.ondemand.4.patch, > TEZ-776.ondemand.5.patch, TEZ-776.ondemand.6.patch, TEZ-776.ondemand.7.patch, > TEZ-776.ondemand.patch, With_Patch_AM_hotspots.png, > With_Patch_AM_profile.png, Without_patch_AM_CPU_Usage.png, > events-problem-solutions.txt, with_patch_jmc_output_of_AM.png, > without_patch_jmc_output_of_AM.png > > > This is open ended at the moment. > A fair chunk of the AM heap is taken up by TezEvents (specifically > DataMovementEvents - 64 bytes per event). > Depending on the connection pattern - this puts limits on the number of tasks > that can be processed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2300: Summary: TezClient.stop() takes a lot of time or does not work sometimes (was: TezClient.stop() takes a lot of time) > TezClient.stop() takes a lot of time or does not work sometimes > --- > > Key: TEZ-2300 > URL: https://issues.apache.org/jira/browse/TEZ-2300 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > > Noticed this with a couple of pig scripts which were not behaving well (AM > close to OOM, etc) and even with some that were running fine. Pig calls > Tezclient.stop() in shutdown hook. Ctrl+C to the pig script either exits > immediately or is hung. In both cases it either takes a long time for the > yarn application to go to KILLED state. Many times I just end up calling yarn > application -kill separately after waiting for 5 mins or more for it to get > killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2119) Counter for launched containers
[ https://issues.apache.org/jira/browse/TEZ-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488060#comment-14488060 ] Rohini Palaniswamy commented on TEZ-2119: - So ALLOCATED, LAUNCHED and USED containers? Launched and Used will be the same in the case of no pre-warm, but could differ with pre-warm. Is my understanding right? > Counter for launched containers > --- > > Key: TEZ-2119 > URL: https://issues.apache.org/jira/browse/TEZ-2119 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Jeff Zhang > > org.apache.tez.common.counters.DAGCounter > NUM_SUCCEEDED_TASKS=32976 > TOTAL_LAUNCHED_TASKS=32976 > OTHER_LOCAL_TASKS=2 > DATA_LOCAL_TASKS=9147 > RACK_LOCAL_TASKS=23761 > It would be very nice to have TOTAL_LAUNCHED_CONTAINERS counter added to > this. The difference between TOTAL_LAUNCHED_CONTAINERS and > TOTAL_LAUNCHED_TASKS should make it easy to see how much container reuse is > happening. It is very hard to find out now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2234) Add API for statistics information - allow vertex managers to get output size per source vertex
[ https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491748#comment-14491748 ] Rohini Palaniswamy commented on TEZ-2234: - Thanks [~bikassaha] > Add API for statistics information - allow vertex managers to get output size > per source vertex > --- > > Key: TEZ-2234 > URL: https://issues.apache.org/jira/browse/TEZ-2234 > Project: Apache Tez > Issue Type: Bug >Reporter: Bikas Saha >Assignee: Bikas Saha > Fix For: 0.7.0 > > Attachments: TEZ-2234.1.patch, TEZ-2234.2.patch, TEZ-2234.3.patch, > TEZ-2234.4.patch > > > Vertex managers may need per source vertex output stats to make > reconfiguration decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491781#comment-14491781 ] Rohini Palaniswamy commented on TEZ-2300: - This one was a case of AM lingering long due to posting to ATS. [~jlowe], did configure tez.yarn.ats.max.events.per.batch to 500 in tez-site.xml to make the problem better. 2015-04-12 23:24:42,040 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 51890 Succeeded: 0 Running: 50 Failed: 0 Killed: 0, diagnostics=, counters=null Did Ctrl+C here on the pig client. AM DAG log: 2015-04-12 23:24:52,015 INFO [AsyncDispatcher event handler] app.DAGAppMaster: DAG completed, dagId=dag_1428329756093_325099_1, dagState=KILLED 2015-04-12 23:24:52,015 INFO [AsyncDispatcher event handler] common.TezUtilsInternal: Redirecting log file based on addend: dag_1428329756093_325099_1_post In the dag_1428329756093_325099_1_post log (attached to the jira as well) {code} 2015-04-12 23:24:57,029 INFO [AMShutdownThread] ats.ATSHistoryLoggingService: Stopping ATSService, eventQueueBacklog=17927 2015-04-12 23:25:25,466 WARN [AMShutdownThread] ats.ATSHistoryLoggingService: ATSService being stopped, eventQueueBacklog=17927, maxTimeLeftToFlush=-1, waitForever=true Lot of ATS put errors 2015-04-12 23:32:53,197 INFO [AMShutdownThread] ats.ATSHistoryLoggingService: Event queue empty, stopping ATS Service 2015-04-12 23:32:53,200 INFO [DelayedContainerManager] rm.YarnTaskSchedulerService: AllocatedContainerManager Thread interrupted 2015-04-12 23:32:53,203 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: Unregistering application from RM, exitStatus=SUCCEEDED, exitMessage=Session stats:submittedDAGs=1, successfulDAGs=0, failedDAGs=0, killedDAGs=1 , trackingURL=bassniumtan-jt1.tan.ygrid.yahoo.com:4080/tez/#/?appid=application_1428329756093_325099 2015-04-12 23:32:53,210 INFO [AMShutdownThread] impl.AMRMClientImpl: 
Waiting for application to be successfully unregistered. 2015-04-12 23:32:53,314 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: Successfully unregistered application from RM 2015-04-12 23:32:53,315 INFO [AMRM Callback Handler Thread] impl.AMRMClientAsyncImpl: Interrupted while waiting for queue java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274) 2015-04-12 23:32:53,316 INFO [AMShutdownThread] ipc.Server: Stopping server on 51921 2015-04-12 23:32:53,319 INFO [IPC Server listener on 51921] ipc.Server: Stopping IPC Server listener on 51921 2015-04-12 23:32:53,319 INFO [AMShutdownThread] ipc.Server: Stopping server on 50500 2015-04-12 23:32:53,320 INFO [IPC Server listener on 50500] ipc.Server: Stopping IPC Server listener on 50500 2015-04-12 23:32:53,320 INFO [IPC Server Responder] ipc.Server: Stopping IPC Server Responder 2015-04-12 23:32:53,320 INFO [IPC Server Responder] ipc.Server: Stopping IPC Server Responder 2015-04-12 23:32:53,324 INFO [AMShutdownThread] app.DAGAppMaster: Completed deletion of tez scratch data dir, path=hdfs://bassniumtan-nn1.tan.ygrid.yahoo.com:8020/tmp/temp-1464028011/.tez/application_1428329756093_325099 2015-04-12 23:32:53,324 INFO [AMShutdownThread] app.DAGAppMaster: Exiting DAGAppMaster..GoodBye! 
2015-04-12 23:32:53,325 INFO [Thread-1] app.DAGAppMaster: DAGAppMasterShutdownHook invoked {code} Jstack still running thread on AM : {code} "AMShutdownThread" prio=10 tid=0x0f3ad800 nid=0x2b0e runnable [0x04e4] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) - locked <0xde56b700> (a java.io.BufferedInputStream) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:633) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:579) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1322) - locked <0xde42d320> (a sun.net.www.protocol.http.HttpURLConnection) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468) at com.sun.jersey.client.ur
[jira] [Updated] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2300: Attachment: syslog_dag_1428329756093_325099_1_post > TezClient.stop() takes a lot of time or does not work sometimes > --- > > Key: TEZ-2300 > URL: https://issues.apache.org/jira/browse/TEZ-2300 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > Attachments: syslog_dag_1428329756093_325099_1_post > > > Noticed this with a couple of pig scripts which were not behaving well (AM > close to OOM, etc) and even with some that were running fine. Pig calls > Tezclient.stop() in shutdown hook. Ctrl+C to the pig script either exits > immediately or is hung. In both cases it either takes a long time for the > yarn application to go to KILLED state. Many times I just end up calling yarn > application -kill separately after waiting for 5 mins or more for it to get > killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491781#comment-14491781 ] Rohini Palaniswamy edited comment on TEZ-2300 at 4/12/15 11:45 PM: --- This one was a case of AM lingering long due to posting to ATS. [~jlowe], had configured tez.yarn.ats.max.events.per.batch to 500 in tez-site.xml as we had earlier noticed that even on normal shutdown AM was lingering for a lot of time when there are a lot of events. But in this case it is still worse, taking more than 5 mins. I believe it is due to ATS put errors and retries. 2015-04-12 23:24:42,040 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 51890 Succeeded: 0 Running: 50 Failed: 0 Killed: 0, diagnostics=, counters=null Did Ctrl+C here on the pig client. AM DAG log: 2015-04-12 23:24:52,015 INFO [AsyncDispatcher event handler] app.DAGAppMaster: DAG completed, dagId=dag_1428329756093_325099_1, dagState=KILLED 2015-04-12 23:24:52,015 INFO [AsyncDispatcher event handler] common.TezUtilsInternal: Redirecting log file based on addend: dag_1428329756093_325099_1_post In the dag_1428329756093_325099_1_post log (attached to the jira as well) {code} 2015-04-12 23:24:57,029 INFO [AMShutdownThread] ats.ATSHistoryLoggingService: Stopping ATSService, eventQueueBacklog=17927 2015-04-12 23:25:25,466 WARN [AMShutdownThread] ats.ATSHistoryLoggingService: ATSService being stopped, eventQueueBacklog=17927, maxTimeLeftToFlush=-1, waitForever=true Lot of ATS put errors 2015-04-12 23:32:53,197 INFO [AMShutdownThread] ats.ATSHistoryLoggingService: Event queue empty, stopping ATS Service 2015-04-12 23:32:53,200 INFO [DelayedContainerManager] rm.YarnTaskSchedulerService: AllocatedContainerManager Thread interrupted 2015-04-12 23:32:53,203 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: Unregistering application from RM, exitStatus=SUCCEEDED, exitMessage=Session 
stats:submittedDAGs=1, successfulDAGs=0, failedDAGs=0, killedDAGs=1 , trackingURL=bassniumtan-jt1.tan.ygrid.yahoo.com:4080/tez/#/?appid=application_1428329756093_325099 2015-04-12 23:32:53,210 INFO [AMShutdownThread] impl.AMRMClientImpl: Waiting for application to be successfully unregistered. 2015-04-12 23:32:53,314 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: Successfully unregistered application from RM 2015-04-12 23:32:53,315 INFO [AMRM Callback Handler Thread] impl.AMRMClientAsyncImpl: Interrupted while waiting for queue java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274) 2015-04-12 23:32:53,316 INFO [AMShutdownThread] ipc.Server: Stopping server on 51921 2015-04-12 23:32:53,319 INFO [IPC Server listener on 51921] ipc.Server: Stopping IPC Server listener on 51921 2015-04-12 23:32:53,319 INFO [AMShutdownThread] ipc.Server: Stopping server on 50500 2015-04-12 23:32:53,320 INFO [IPC Server listener on 50500] ipc.Server: Stopping IPC Server listener on 50500 2015-04-12 23:32:53,320 INFO [IPC Server Responder] ipc.Server: Stopping IPC Server Responder 2015-04-12 23:32:53,320 INFO [IPC Server Responder] ipc.Server: Stopping IPC Server Responder 2015-04-12 23:32:53,324 INFO [AMShutdownThread] app.DAGAppMaster: Completed deletion of tez scratch data dir, path=hdfs://bassniumtan-nn1.tan.ygrid.yahoo.com:8020/tmp/temp-1464028011/.tez/application_1428329756093_325099 2015-04-12 23:32:53,324 INFO [AMShutdownThread] app.DAGAppMaster: Exiting DAGAppMaster..GoodBye! 
2015-04-12 23:32:53,325 INFO [Thread-1] app.DAGAppMaster: DAGAppMasterShutdownHook invoked {code} Jstack still running thread on AM : {code} "AMShutdownThread" prio=10 tid=0x0f3ad800 nid=0x2b0e runnable [0x04e4] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) - locked <0xde56b700> (a java.io.BufferedInputStream) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:633) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:579) at sun.net.www.protocol.http.Ht
[jira] [Comment Edited] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491781#comment-14491781 ] Rohini Palaniswamy edited comment on TEZ-2300 at 4/12/15 11:46 PM: --- This one was a case of AM lingering long due to posting to ATS. [~jlowe], had configured tez.yarn.ats.max.events.per.batch to 500 in tez-site.xml as we had earlier noticed that even on normal shutdown AM was lingering for a lot of time when there are a lot of events. But in this case it is still worse, taking more than 5 mins. I believe it is due to ATS put errors and retries. 2015-04-12 23:24:42,040 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 51890 Succeeded: 0 Running: 50 Failed: 0 Killed: 0, diagnostics=, counters=null Did Ctrl+C here on the pig client. AM DAG log: 2015-04-12 23:24:52,015 INFO [AsyncDispatcher event handler] app.DAGAppMaster: DAG completed, dagId=dag_1428329756093_325099_1, dagState=KILLED 2015-04-12 23:24:52,015 INFO [AsyncDispatcher event handler] common.TezUtilsInternal: Redirecting log file based on addend: dag_1428329756093_325099_1_post In the dag_1428329756093_325099_1_post log (attached to the jira as well) {code} 2015-04-12 23:24:57,029 INFO [AMShutdownThread] ats.ATSHistoryLoggingService: Stopping ATSService, eventQueueBacklog=17927 2015-04-12 23:25:25,466 WARN [AMShutdownThread] ats.ATSHistoryLoggingService: ATSService being stopped, eventQueueBacklog=17927, maxTimeLeftToFlush=-1, waitForever=true Lot of ATS put errors 2015-04-12 23:32:53,197 INFO [AMShutdownThread] ats.ATSHistoryLoggingService: Event queue empty, stopping ATS Service 2015-04-12 23:32:53,200 INFO [DelayedContainerManager] rm.YarnTaskSchedulerService: AllocatedContainerManager Thread interrupted 2015-04-12 23:32:53,203 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: Unregistering application from RM, exitStatus=SUCCEEDED, exitMessage=Session 
stats:submittedDAGs=1, successfulDAGs=0, failedDAGs=0, killedDAGs=1 , trackingURL=bassniumtan-jt1.tan.ygrid.yahoo.com:4080/tez/#/?appid=application_1428329756093_325099 2015-04-12 23:32:53,210 INFO [AMShutdownThread] impl.AMRMClientImpl: Waiting for application to be successfully unregistered. 2015-04-12 23:32:53,314 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: Successfully unregistered application from RM 2015-04-12 23:32:53,315 INFO [AMRM Callback Handler Thread] impl.AMRMClientAsyncImpl: Interrupted while waiting for queue java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274) 2015-04-12 23:32:53,316 INFO [AMShutdownThread] ipc.Server: Stopping server on 51921 2015-04-12 23:32:53,319 INFO [IPC Server listener on 51921] ipc.Server: Stopping IPC Server listener on 51921 2015-04-12 23:32:53,319 INFO [AMShutdownThread] ipc.Server: Stopping server on 50500 2015-04-12 23:32:53,320 INFO [IPC Server listener on 50500] ipc.Server: Stopping IPC Server listener on 50500 2015-04-12 23:32:53,320 INFO [IPC Server Responder] ipc.Server: Stopping IPC Server Responder 2015-04-12 23:32:53,320 INFO [IPC Server Responder] ipc.Server: Stopping IPC Server Responder 2015-04-12 23:32:53,324 INFO [AMShutdownThread] app.DAGAppMaster: Completed deletion of tez scratch data dir, path=hdfs://bassniumtan-nn1.tan.ygrid.yahoo.com:8020/tmp/temp-1464028011/.tez/application_1428329756093_325099 2015-04-12 23:32:53,324 INFO [AMShutdownThread] app.DAGAppMaster: Exiting DAGAppMaster..GoodBye! 
2015-04-12 23:32:53,325 INFO [Thread-1] app.DAGAppMaster: DAGAppMasterShutdownHook invoked {code} Jstack still running thread on AM : {code} "AMShutdownThread" prio=10 tid=0x0f3ad800 nid=0x2b0e runnable [0x04e4] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) - locked <0xde56b700> (a java.io.BufferedInputStream) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:633) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:579) at sun.net.www.protocol.http.Ht
[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492693#comment-14492693 ] Rohini Palaniswamy commented on TEZ-2300: - TEZ-2311 is another case where the kill event was not processed correctly when the AM was recovering. > TezClient.stop() takes a lot of time or does not work sometimes > --- > > Key: TEZ-2300 > URL: https://issues.apache.org/jira/browse/TEZ-2300 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > Attachments: syslog_dag_1428329756093_325099_1_post > > > Noticed this with a couple of pig scripts which were not behaving well (AM > close to OOM, etc) and even with some that were running fine. Pig calls > Tezclient.stop() in shutdown hook. Ctrl+C to the pig script either exits > immediately or is hung. In both cases it takes a long time for the > yarn application to go to KILLED state. Many times I just end up calling yarn > application -kill separately after waiting for 5 mins or more for it to get > killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-776) Reduce AM mem usage caused by storing TezEvents
[ https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493244#comment-14493244 ] Rohini Palaniswamy commented on TEZ-776: bq. 95 million events in memory is crazy. This was caused by default 100ms interval for heartbeating. MR uses 3 seconds for heartbeating. The number of events in queue reduced once we tried with 1000ms. > Reduce AM mem usage caused by storing TezEvents > --- > > Key: TEZ-776 > URL: https://issues.apache.org/jira/browse/TEZ-776 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Bikas Saha > Attachments: TEZ-776.1.patch, TEZ-776.ondemand.1.patch, > TEZ-776.ondemand.2.patch, TEZ-776.ondemand.3.patch, TEZ-776.ondemand.4.patch, > TEZ-776.ondemand.5.patch, TEZ-776.ondemand.6.patch, TEZ-776.ondemand.7.patch, > TEZ-776.ondemand.patch, With_Patch_AM_hotspots.png, > With_Patch_AM_profile.png, Without_patch_AM_CPU_Usage.png, > events-problem-solutions.txt, with_patch_jmc_output_of_AM.png, > without_patch_jmc_output_of_AM.png > > > This is open ended at the moment. > A fair chunk of the AM heap is taken up by TezEvents (specifically > DataMovementEvents - 64 bytes per event). > Depending on the connection pattern - this puts limits on the number of tasks > that can be processed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
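For a sense of scale, the figures quoted in this thread (95 million events in memory, ~64 bytes per DataMovementEvent) imply several gigabytes of AM heap spent on events alone. A quick stand-alone calculation (the per-event size is the approximate figure from the issue description, not a measured value):

```java
// Back-of-the-envelope memory cost of buffering TezEvents in the AM.
public class EventMemory {
    public static void main(String[] args) {
        long events = 95_000_000L;   // events observed in the AM queue
        long bytesPerEvent = 64;     // approximate DataMovementEvent size
        long total = events * bytesPerEvent;
        System.out.printf("total=%d bytes (~%.1f GiB)%n",
                total, total / 1024.0 / 1024 / 1024);
        // Heartbeating 10x less often (100ms -> 1000ms) roughly divides the
        // event arrival rate, and hence the steady-state backlog, by 10.
    }
}
```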
[jira] [Created] (TEZ-2314) Tez task attempt failures due to bad event serialization
Rohini Palaniswamy created TEZ-2314: --- Summary: Tez task attempt failures due to bad event serialization Key: TEZ-2314 URL: https://issues.apache.org/jira/browse/TEZ-2314 Project: Apache Tez Issue Type: Bug Reporter: Rohini Palaniswamy {code} 2015-04-13 19:21:48,516 WARN [Socket Reader #3 for port 53530] ipc.Server: Unable to read call parameters for client 10.216.13.112 on connection protocol org.apache.tez.common.TezTaskUmbilicalProtocol for rpcKind RPC_WRITABLE java.lang.ArrayIndexOutOfBoundsException: 1935896432 at org.apache.tez.runtime.api.impl.EventMetaData.readFields(EventMetaData.java:120) at org.apache.tez.runtime.api.impl.TezEvent.readFields(TezEvent.java:271) at org.apache.tez.runtime.api.impl.TezHeartbeatRequest.readFields(TezHeartbeatRequest.java:110) at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285) at org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:160) at org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1884) at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1816) at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1574) at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:806) at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:673) at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:644) {code} cc/ [~hitesh] and [~bikassaha] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2316) Log count of different events if eventQueue has lot of events
Rohini Palaniswamy created TEZ-2316: --- Summary: Log count of different events if eventQueue has lot of events Key: TEZ-2316 URL: https://issues.apache.org/jira/browse/TEZ-2316 Project: Apache Tez Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Bikas Saha The default heartbeat interval is 100ms, which is a poor configuration for very big jobs and can lead to the event queue quickly filling up. Logging the number of events and their types once the queue crosses a threshold (e.g. every 200K/400K/600K events) will at least help see the problem while debugging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
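The thresholded logging proposed above can be sketched roughly as follows. This is a minimal stand-alone illustration, not Tez's actual event queue; the class and method names are made up:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Count queued events by type; log a per-type breakdown each time the
// total crosses another THRESHOLD boundary (200K, 400K, 600K, ...).
public class EventQueueMonitor {
    private static final long THRESHOLD = 200_000;
    private final Map<String, LongAdder> countsByType = new ConcurrentHashMap<>();
    private final AtomicLong total = new AtomicLong();
    private final AtomicLong lastLoggedBoundary = new AtomicLong();

    public void onEventQueued(String eventType) {
        countsByType.computeIfAbsent(eventType, t -> new LongAdder()).increment();
        long t = total.incrementAndGet();
        long boundary = (t / THRESHOLD) * THRESHOLD;
        // Log once per boundary crossed, not on every event.
        if (boundary > 0
                && lastLoggedBoundary.getAndUpdate(p -> Math.max(p, boundary)) < boundary) {
            System.out.println("Event queue at " + t + " events: " + countsByType);
        }
    }

    public static void main(String[] args) {
        EventQueueMonitor m = new EventQueueMonitor();
        for (int i = 0; i < 450_000; i++) {
            m.onEventQueued(i % 3 == 0 ? "DATA_MOVEMENT_EVENT"
                                       : "TASK_STATUS_UPDATE_EVENT");
        }
        System.out.println("total=" + m.total.get());
    }
}
```

The boundary bookkeeping keeps the logging O(1) per event, so it is cheap enough to leave enabled in production.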
[jira] [Created] (TEZ-2317) Successful task attempts getting killed
Rohini Palaniswamy created TEZ-2317: --- Summary: Successful task attempts getting killed Key: TEZ-2317 URL: https://issues.apache.org/jira/browse/TEZ-2317 Project: Apache Tez Issue Type: Bug Reporter: Rohini Palaniswamy -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2317) Successful task attempts getting killed
[ https://issues.apache.org/jira/browse/TEZ-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2317: Attachment: AM-taskkill.log For a complex DAG where a lot of events were generated and the AM could not process them fast enough, we ([~bikassaha] and I) saw that many tasks were killed because only TA_SCHEDULE had been processed, and before the AM got to processing the RUNNING event it received a commit go/no-go request, which is a separate async call that does not go via the event queue. These issues were mostly with the ONE-ONE edges Pig was using for distributed order by with sampling; since those tasks were not doing much except partitioning, they were also finishing very fast. Issues to fix: - Optimize by not sending a commit go/no-go request if there is no hdfs output (DataSink) involved. In the above case, it is always intermediate output. - Handle the commit go/no-go request after processing events in the event queue. Maybe something like asking the task to come back after some time. - We saw that for 3058 KilledTaskAttempts there were 383519 TA_KILL_REQUEST events. This is way too high. - The attached AM-taskkill.log, which has grepped statements for a single task that was killed, contains 327 repeats of the message below. Need to see why there are so many and fix that. {code} 2015-04-13 23:19:11,126 INFO [IPC Server handler 22 on 53043] app.TaskAttemptListenerImpTezDag: Commit go/no-go request from attempt_1428329756093_374362_1_29_008426_0 2015-04-13 23:19:11,126 INFO [IPC Server handler 22 on 53043] impl.TaskImpl: Task not running. Issuing kill to bad commit attempt attempt_1428329756093_374362_1_29_008426_0 {code} Please create separate jiras as required. > Successful task attempts getting killed > --- > > Key: TEZ-2317 > URL: https://issues.apache.org/jira/browse/TEZ-2317 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > Attachments: AM-taskkill.log > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
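The first two fixes proposed above amount to a guard in the commit go/no-go path. A rough stand-alone sketch of that guard follows; every name and threshold here is hypothetical, not a Tez API:

```java
// Sketch: (a) skip the commit check entirely when the vertex has no
// DataSink (intermediate output only), and (b) ask the attempt to retry
// later while the AM's event queue is still draining, instead of killing
// an attempt whose RUNNING event simply has not been processed yet.
public class CommitGate {
    public enum Decision { GO, NO_GO, RETRY_LATER }

    private final boolean vertexHasDataSink;
    private final long retryBacklogThreshold;

    public CommitGate(boolean vertexHasDataSink, long retryBacklogThreshold) {
        this.vertexHasDataSink = vertexHasDataSink;
        this.retryBacklogThreshold = retryBacklogThreshold;
    }

    public Decision canCommit(boolean attemptSeenRunning, long pendingEventQueueSize) {
        if (!vertexHasDataSink) {
            return Decision.GO; // nothing to commit for intermediate output
        }
        if (!attemptSeenRunning && pendingEventQueueSize > retryBacklogThreshold) {
            // The RUNNING event is probably still sitting in the queue;
            // do not treat the attempt as "not running" and kill it.
            return Decision.RETRY_LATER;
        }
        return attemptSeenRunning ? Decision.GO : Decision.NO_GO;
    }

    public static void main(String[] args) {
        CommitGate gate = new CommitGate(true, 100_000);
        System.out.println(gate.canCommit(false, 500_000)); // backlog: retry
        System.out.println(gate.canCommit(true, 500_000));  // seen running: go
        System.out.println(new CommitGate(false, 100_000).canCommit(false, 0));
    }
}
```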
[jira] [Commented] (TEZ-2317) Successful task attempts getting killed
[ https://issues.apache.org/jira/browse/TEZ-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493820#comment-14493820 ] Rohini Palaniswamy commented on TEZ-2317: - Forgot to mention that this caused a lot of nodes to be blacklisted. > Successful task attempts getting killed > --- > > Key: TEZ-2317 > URL: https://issues.apache.org/jira/browse/TEZ-2317 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > Attachments: AM-taskkill.log > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2319) DAG history in HDFS
Rohini Palaniswamy created TEZ-2319: --- Summary: DAG history in HDFS Key: TEZ-2319 URL: https://issues.apache.org/jira/browse/TEZ-2319 Project: Apache Tez Issue Type: New Feature Reporter: Rohini Palaniswamy We have processes that parse jobconf.xml and job history details (map and reduce task details, etc) in avro files from HDFS and load them into hive tables for analysis of mapreduce jobs. Would like Tez to also write this information to a history file in HDFS when the AM or each DAG completes so that we can do analytics on Tez jobs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2319) DAG history in HDFS
[ https://issues.apache.org/jira/browse/TEZ-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494536#comment-14494536 ] Rohini Palaniswamy commented on TEZ-2319: - This is for offline analysis where all job information is parsed and loaded into multiple hive tables, and queries are then run on those tables to analyze cluster usage. We keep 1 year's worth of data in those hive tables. Using ATS for that is out of the question. Extracting data periodically (every 1 hr or 4 hrs) and dumping from ATS is also out of the question, as it hardly scales as it is and that would bring it down. This is kind of a tee on the side, written to HDFS only when the job completes, similar to MR. If there are better alternatives to get the information, we are open. > DAG history in HDFS > --- > > Key: TEZ-2319 > URL: https://issues.apache.org/jira/browse/TEZ-2319 > Project: Apache Tez > Issue Type: New Feature >Reporter: Rohini Palaniswamy > > We have processes that parse jobconf.xml and job history details (map and > reduce task details, etc) in avro files from HDFS and load them into hive > tables for analysis of mapreduce jobs. Would like Tez to also write this > information to a history file in HDFS when the AM or each DAG completes so > that we can do analytics on Tez jobs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2317) Successful task attempts getting killed
[ https://issues.apache.org/jira/browse/TEZ-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496439#comment-14496439 ] Rohini Palaniswamy commented on TEZ-2317: - Ah. Thanks [~bikassaha]. Issue is with PigProcessor calling canCommit. Fixing that in PIG-4508. > Successful task attempts getting killed > --- > > Key: TEZ-2317 > URL: https://issues.apache.org/jira/browse/TEZ-2317 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Bikas Saha > Fix For: 0.7.0 > > Attachments: AM-taskkill.log > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2317) Successful task attempts getting killed
[ https://issues.apache.org/jira/browse/TEZ-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496439#comment-14496439 ] Rohini Palaniswamy edited comment on TEZ-2317 at 4/15/15 4:01 PM: -- Thanks [~bikassaha]. Issue is with PigProcessor calling canCommit. Fixing that in PIG-4508. was (Author: rohini): Ah. Thanks [~bikassaha]. Issue is with PigProcessor calling canCommit. Fixing that in PIG-4508. > Successful task attempts getting killed > --- > > Key: TEZ-2317 > URL: https://issues.apache.org/jira/browse/TEZ-2317 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Bikas Saha > Fix For: 0.7.0 > > Attachments: AM-taskkill.log > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2322) Succeeded count wrong for Pig on Tez job, decreased 380 => 181
[ https://issues.apache.org/jira/browse/TEZ-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496452#comment-14496452 ] Rohini Palaniswamy commented on TEZ-2322: - No. Have only seen - TotalTasks come down when a new vertex is starting and tasks reduced due to auto parallelism with ShuffleVertexManager. - If the AM gets killed and a new one is launched, Succeeded goes to 0 and then increases as recovery kicks in. Have not seen Succeeded reduce to a non-zero count. But I have only seen AM relaunch due to OOM or other issues with very big jobs (30K+ tasks). So worthwhile to check if there is a second AM attempt launched. Pig prints that status every 20 secs and it is possible a new AM was launched and recovery recovered 181 tasks by then. > Succeeded count wrong for Pig on Tez job, decreased 380 => 181 > -- > > Key: TEZ-2322 > URL: https://issues.apache.org/jira/browse/TEZ-2322 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.5.2 > Environment: HDP 2.2 >Reporter: Hari Sekhon >Priority: Minor > > During a Pig on Tez job the number of succeeded tasks dropped from 380 => 181 > as shown below: > {code} > 2015-04-15 15:09:56,992 [Timer-0] INFO > org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: > status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 > Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= > 2015-04-15 15:10:16,992 [Timer-0] INFO > org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: > status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 > Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= > 2015-04-15 15:10:36,992 [Timer-0] INFO > org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: > status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 > Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= > 2015-04-15 15:10:56,992 
[Timer-0] INFO > org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: > status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: > 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= > 2015-04-15 15:11:16,992 [Timer-0] INFO > org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: > status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: > 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= > 2015-04-15 15:11:36,992 [Timer-0] INFO > org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: > status=RUNNING, progress=TotalTasks: 905 Succeeded: 182 Running: 723 Failed: > 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= > 2015-04-15 15:11:56,993 [Timer-0] INFO > org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: > status=RUNNING, progress=TotalTasks: 905 Succeeded: 184 Running: 721 Failed: > 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= > 2015-04-15 15:12:16,992 [Timer-0] INFO > org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: > status=RUNNING, progress=TotalTasks: 905 Succeeded: 186 Running: 719 Failed: > 0 > {code} > Now this may be because the tasks failed, some certainly did due to space > exceptions having checked the logs, but surely once a task has finished > successfully and is marked as succeeded it cannot then later be removed from > the succeeded count? Perhaps the succeeded counter is incremented too early > before the task results are really saved? > KilledTaskAttempts jumped from 16 => 89 at the same time, but even this > doesn't account for the large drop in number of succeeded tasks. 
> There was also a noticeable jump in Running tasks from 58 => 724 at the same > time which is suspicious. I'm pretty sure there was no contending job to > finish and release so much more resource to this Tez job, so it's also > unclear how the running count could have jumped up so significantly given that the > cluster hardware resources have been the same throughout. > Hari Sekhon > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2319) DAG history in HDFS
[ https://issues.apache.org/jira/browse/TEZ-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496715#comment-14496715 ] Rohini Palaniswamy commented on TEZ-2319: - bq. Maybe this should be a primary ask for ATS v2 This is something we do not want to wait on ATS v2 for. But it would be good if they captured this as part of the design. bq. make the SimpleHistoryLogger ( to HDFS ) production-ready and tez should allow publishing to multiple loggers. This history only needs to capture the final state of the DAG, its tasks and counters. It does not need to capture intermediate data. I am not sure SimpleHistoryLogger in its current form is a good fit. The job history in MR is in avro format and gives the whole state of the job on its completion. If the AM has that in memory, then we can have a config to dump it into HDFS in some format (json/avro), which is the easiest thing. Else we will need another Logger to either - build the state over time (not preferable as it will consume a lot of memory) and dump it on completion, - or write events as they happen, then parse them, extract only the relevant information and write another file. Both options with another Logger are inefficient and I don't like the idea myself. [~jlowe]/[~jeagles], Any better suggestions on how this can be done based on your experience with how it is currently done in MR? > DAG history in HDFS > --- > > Key: TEZ-2319 > URL: https://issues.apache.org/jira/browse/TEZ-2319 > Project: Apache Tez > Issue Type: New Feature >Reporter: Rohini Palaniswamy > > We have processes that parse jobconf.xml and job history details (map and > reduce task details, etc) in avro files from HDFS and load them into hive > tables for analysis of mapreduce jobs. Would like Tez to also write this > information to a history file in HDFS when the AM or each DAG completes so > that we can do analytics on Tez jobs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
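The "dump the in-memory state on completion" option discussed above could look roughly like this. It is a stand-alone sketch: a local temp file stands in for the HDFS history directory, the summary fields are illustrative, and the hand-rolled JSON writer is just to keep the example dependency-free:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Serialize a DAG summary (final state plus counters) to JSON exactly
// once, when the DAG finishes -- a tee on the side, as described above.
public class DagHistoryDump {
    static String toJson(Map<String, Object> m) {
        return m.entrySet().stream()
            .map(e -> "\"" + e.getKey() + "\":" +
                 (e.getValue() instanceof Number ? e.getValue()
                                                 : "\"" + e.getValue() + "\""))
            .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) throws IOException {
        Map<String, Object> summary = new LinkedHashMap<>();
        summary.put("dagId", "dag_1428329756093_325099_1"); // id from the logs above
        summary.put("finalState", "SUCCEEDED");
        summary.put("NUM_SUCCEEDED_TASKS", 32976);
        summary.put("TOTAL_LAUNCHED_TASKS", 32976);

        Path out = Files.createTempFile("dag-history-", ".json");
        Files.writeString(out, toJson(summary));
        System.out.println(Files.readString(out));
    }
}
```

A real implementation would presumably write avro or JSON via the Hadoop FileSystem API instead, but the shape of the data is the same.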
[jira] [Commented] (TEZ-2317) Successful task attempts getting killed
[ https://issues.apache.org/jira/browse/TEZ-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497375#comment-14497375 ] Rohini Palaniswamy commented on TEZ-2317: - +1 on the patch. Looks good. But let me test it out as well before you commit it. > Successful task attempts getting killed > --- > > Key: TEZ-2317 > URL: https://issues.apache.org/jira/browse/TEZ-2317 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Bikas Saha > Fix For: 0.7.0 > > Attachments: AM-taskkill.log, TEZ-2317.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2314) Tez task attempt failures due to bad event serialization
[ https://issues.apache.org/jira/browse/TEZ-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498083#comment-14498083 ] Rohini Palaniswamy commented on TEZ-2314: - [~bikassaha], I don't see this issue with tez 0.6 for the same script even for multiple runs. Should be something introduced in master. > Tez task attempt failures due to bad event serialization > > > Key: TEZ-2314 > URL: https://issues.apache.org/jira/browse/TEZ-2314 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > > {code} > 2015-04-13 19:21:48,516 WARN [Socket Reader #3 for port 53530] ipc.Server: > Unable to read call parameters for client 10.216.13.112on connection protocol > org.apache.tez.common.TezTaskUmbilicalProtocol for rpcKind RPC_WRITABLE > java.lang.ArrayIndexOutOfBoundsException: 1935896432 > at > org.apache.tez.runtime.api.impl.EventMetaData.readFields(EventMetaData.java:120) > at > org.apache.tez.runtime.api.impl.TezEvent.readFields(TezEvent.java:271) > at > org.apache.tez.runtime.api.impl.TezHeartbeatRequest.readFields(TezHeartbeatRequest.java:110) > at > org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285) > at > org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:160) > at > org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1884) > at > org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1816) > at > org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1574) > at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:806) > at > org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:673) > at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:644) > {code} > cc/ [~hitesh] and [~bikassaha] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2317) Successful task attempts getting killed
[ https://issues.apache.org/jira/browse/TEZ-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498104#comment-14498104 ] Rohini Palaniswamy commented on TEZ-2317: - +1. Don't see killed tasks with this patch anymore. > Successful task attempts getting killed > --- > > Key: TEZ-2317 > URL: https://issues.apache.org/jira/browse/TEZ-2317 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Bikas Saha > Fix For: 0.7.0 > > Attachments: AM-taskkill.log, TEZ-2317.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498230#comment-14498230 ] Rohini Palaniswamy commented on TEZ-2300: - There are a couple of issues with the behavior after talking to [~jlowe] and comparing with what is done in MR: - Kill is put in the event queue and is processed like any other event. When there are millions of events in the queue it takes a long time to get to it, and I see the AM even scheduling new tasks. MR also does it this way. The problem is with too many events, and TEZ-776 should reduce that. But still, with large jobs there are going to be many events in the queue. - TezClient.stop() returns immediately after the kill. It should not; it should poll and wait on the client side. MR does that. - If the DAG is not killed and the session not shut down even after a certain timeout, yarn kill should be called. MR does that. This is an important issue as people might kill a script, think the application is killed, and proceed with running a new one, which could cause a lot of issues while the old one is still running. So the kill needs to be synchronous and reliable. > TezClient.stop() takes a lot of time or does not work sometimes > --- > > Key: TEZ-2300 > URL: https://issues.apache.org/jira/browse/TEZ-2300 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > Attachments: syslog_dag_1428329756093_325099_1_post > > > Noticed this with a couple of pig scripts which were not behaving well (AM > close to OOM, etc) and even with some that were running fine. Pig calls > Tezclient.stop() in shutdown hook. Ctrl+C to the pig script either exits > immediately or is hung. In both cases it takes a long time for the > yarn application to go to KILLED state. Many times I just end up calling yarn > application -kill separately after waiting for 5 mins or more for it to get > killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
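The second and third points, poll-and-wait plus a forced-kill fallback, can be sketched like this. It is stand-alone: `isTerminal` and `forceKill` are stand-ins for real YARN client calls (checking the application report and issuing the equivalent of `yarn application -kill`), not TezClient APIs:

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Sketch of a synchronous, reliable stop: after sending the graceful kill,
// poll until the app reaches a terminal state; if the timeout expires,
// fall back to a forced kill so the caller never returns with the old
// application still running.
public class SynchronousStop {
    static boolean stopAndWait(BooleanSupplier isTerminal, Runnable forceKill,
                               long timeoutMs, long pollMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (isTerminal.getAsBoolean()) {
                return true; // graceful kill landed
            }
            TimeUnit.MILLISECONDS.sleep(pollMs);
        }
        forceKill.run(); // graceful kill did not land in time
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Simulate an AM that only reaches KILLED ~150ms after the request.
        boolean graceful = stopAndWait(
            () -> System.currentTimeMillis() - start > 150,
            () -> System.out.println("forced kill"),
            2_000, 10);
        System.out.println("graceful=" + graceful);

        // Simulate a hung AM: the timeout expires and the fallback fires.
        boolean hung = stopAndWait(() -> false,
            () -> System.out.println("forced kill"), 100, 10);
        System.out.println("graceful=" + hung);
    }
}
```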
[jira] [Commented] (TEZ-2317) Successful task attempts getting killed
[ https://issues.apache.org/jira/browse/TEZ-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498481#comment-14498481 ] Rohini Palaniswamy commented on TEZ-2317: - +1 > Successful task attempts getting killed > --- > > Key: TEZ-2317 > URL: https://issues.apache.org/jira/browse/TEZ-2317 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Bikas Saha > Attachments: AM-taskkill.log, TEZ-2317.1.patch, TEZ-2317.2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2314) Tez task attempt failures due to bad event serialization
[ https://issues.apache.org/jira/browse/TEZ-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2314: Affects Version/s: 0.7.0 Fix Version/s: 0.7.0 bq. If there is a simple pig script we can use to reproduce this locally, that would help too. I don't have any. I noticed it in two of the large pig scripts that I ran. I will debug it with log statements and update. > Tez task attempt failures due to bad event serialization > > > Key: TEZ-2314 > URL: https://issues.apache.org/jira/browse/TEZ-2314 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.7.0 >Reporter: Rohini Palaniswamy > Fix For: 0.7.0 > > Attachments: TEZ-2314.log.patch > > > {code} > 2015-04-13 19:21:48,516 WARN [Socket Reader #3 for port 53530] ipc.Server: > Unable to read call parameters for client 10.216.13.112on connection protocol > org.apache.tez.common.TezTaskUmbilicalProtocol for rpcKind RPC_WRITABLE > java.lang.ArrayIndexOutOfBoundsException: 1935896432 > at > org.apache.tez.runtime.api.impl.EventMetaData.readFields(EventMetaData.java:120) > at > org.apache.tez.runtime.api.impl.TezEvent.readFields(TezEvent.java:271) > at > org.apache.tez.runtime.api.impl.TezHeartbeatRequest.readFields(TezHeartbeatRequest.java:110) > at > org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285) > at > org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:160) > at > org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1884) > at > org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1816) > at > org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1574) > at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:806) > at > org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:673) > at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:644) > {code} > cc/ [~hitesh] and [~bikassaha] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2294) Add tez-site-template.xml with description of config properties
[ https://issues.apache.org/jira/browse/TEZ-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507884#comment-14507884 ] Rohini Palaniswamy commented on TEZ-2294: - Valid values column is always null and can be removed. Rest looks good. > Add tez-site-template.xml with description of config properties > --- > > Key: TEZ-2294 > URL: https://issues.apache.org/jira/browse/TEZ-2294 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-2294.wip.2.patch, TEZ-2294.wip.3.patch, > TEZ-2294.wip.patch, TezConfiguration.html, TezRuntimeConfiguration.html, > tez-default-template.xml, tez-runtime-default-template.xml > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2231) Create project by-laws
[ https://issues.apache.org/jira/browse/TEZ-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522168#comment-14522168 ] Rohini Palaniswamy commented on TEZ-2231: - [~hitesh], Did a vimdiff between by-laws.patch and by-laws.3.patch and confirmed that the new changes are good. +1. The by-laws.3.patch contains a lot of code changes that you were working on apart from the by-laws. Please upload a final patch that contains only the by-laws changes before checking in, for future reference. > Create project by-laws > -- > > Key: TEZ-2231 > URL: https://issues.apache.org/jira/browse/TEZ-2231 > Project: Apache Tez > Issue Type: Task >Reporter: Hitesh Shah >Assignee: Hitesh Shah > Attachments: by-laws.2.patch, by-laws.3.patch, by-laws.patch > > > Define the Project by-laws. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2393) Tez pickup PATH env from gateway machine
[ https://issues.apache.org/jira/browse/TEZ-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523731#comment-14523731 ] Rohini Palaniswamy commented on TEZ-2393: - Will test it. [~jlowe], I believe the MR AM expands all the variables on the client side if available, which we ran into some issues with. Though this fix is good, do you see any problems with it? > Tez pickup PATH env from gateway machine > > > Key: TEZ-2393 > URL: https://issues.apache.org/jira/browse/TEZ-2393 > Project: Apache Tez > Issue Type: Bug >Reporter: Daniel Dai >Assignee: Hitesh Shah > Attachments: TEZ-2393.1.patch > > > I found this issue on Windows. When I do: > set PATH=C:\dummy;%PATH% > Then run a tez job. "C:\dummy" appears in PATH of the vertex container. This > is surprising since we don't expect frontend PATH will propagate to backend. > [~hitesh] tried it on Linux and found the same behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2221) VertexGroup name should be unqiue
[ https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527425#comment-14527425 ] Rohini Palaniswamy commented on TEZ-2221: - bq. what happens if someone does the following. This should also be disallowed. Correct? {code} dag.createVertexGroup("group_1", v1,v2); dag.createVertexGroup("group_2", v1,v2); {code} [~daijy] pointed out this breaks a lot of Pig scripts on Tez with UnionOptimizer as we have multiple outputs from each vertex and we create a vertex group for each of those outputs now. For eg: union followed by order by. There will be one sample output and one partitioner output from the union vertex going to two different downstream vertices. With the UnionOptimizer, the union is removed and two vertex groups are created. If this is disallowed, we will have to reuse the same VertexGroup to route multiple outputs. The GroupInputEdge.create(VertexGroup inputVertexGroup, Vertex outputVertex, EdgeProperty edgeProperty, InputDescriptor mergedInput) API seems to allow that. Will doing that work, and is that how you want us to construct the plan? Consider another case of union followed by replicate join with two tables followed by order by. The plan will consist of 8 vertices - V1 (Load) + V2 (Load) + V3 (union) + V4a (Replicate join T1 load) + V4b (Replicate join T2 load) + V5 (partitioner) + V6 (sampler) + V7 (order by) with V1,V2->V3, V4a->V3, V4b->V3, V4->V5, V4->V6, V6->V5, V5->V7. Optimized plan will become V4a -> (V1,V2 vertex group) , V4b -> (V1,V2 vertex group) , (V1,V2 vertex group) -> V5, (V1,V2 vertex group) -> V6, V6->V5, V5->V7. So using one vertex group for routing multiple outputs and multiple inputs is how we are expected to construct the plan? 
> VertexGroup name should be unqiue > - > > Key: TEZ-2221 > URL: https://issues.apache.org/jira/browse/TEZ-2221 > Project: Apache Tez > Issue Type: Bug >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Fix For: 0.7.0, 0.5.4, 0.6.1 > > Attachments: TEZ-2221-1.patch, TEZ-2221-2.patch, TEZ-2221-3.patch, > TEZ-2221-4.patch > > > VertexGroupCommitStartedEvent & VertexGroupCommitFinishedEvent use vertex > group name to identify the vertex group commit, the same name of vertex group > will conflict. While in the current equals & hashCode of VertexGroup, vertex > group name and members name are used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2221) VertexGroup name should be unqiue
[ https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527425#comment-14527425 ] Rohini Palaniswamy edited comment on TEZ-2221 at 5/4/15 10:12 PM: -- bq. what happens if someone does the following. This should also be disallowed. Correct? {code} dag.createVertexGroup("group_1", v1,v2); dag.createVertexGroup("group_2", v1,v2); {code} [~daijy] pointed out this breaks a lot of Pig scripts on Tez with UnionOptimizer as we have multiple outputs from each vertex and we create a vertex group for each of those outputs now. For eg: union followed by order by. There will be one sample output and one partitioner output from the union vertex going to two different downstream vertices. With the UnionOptimizer, the union is removed and two vertex groups are created. If this is disallowed, we will have to reuse the same VertexGroup to route multiple outputs. The GroupInputEdge.create(VertexGroup inputVertexGroup, Vertex outputVertex, EdgeProperty edgeProperty, InputDescriptor mergedInput) API seems to allow that. Will doing that work, and is that how you want us to construct the plan? Consider another case of union followed by replicate join with two tables followed by order by. The plan will consist of 8 vertices - V1 (Load) + V2 (Load) + V3 (union) + V4 (Replicate join T1 load) + V5 (Replicate join T2 load) + V6 (partitioner) + V7 (sampler) + V8 (order by) with V1,V2->V3, V4->V3, V5->V3, V3->V6, V3->V7, V7->V6, V6->V8. Optimized plan will become V4->(V1,V2 vertex group) , V5->(V1,V2 vertex group) , (V1,V2 vertex group) -> V6, (V1,V2 vertex group) -> V7, V7->V6, V6->V8. So using one vertex group for routing multiple outputs and multiple inputs is how we are expected to construct the plan? was (Author: rohini): bq. what happens if someone does the following. This should also be disallowed. Correct? 
{code} dag.createVertexGroup("group_1", v1,v2); dag.createVertexGroup("group_2", v1,v2); {code} [~daijy] pointed out this breaks a lot of Pig scripts on Tez with UnionOptimizer as we have multiple outputs from each vertex and we create a vertex group for each of those output now. For eg: union followed by order by. There will be one sample output and one partitioner output from the union vertex going to two different downstream vertices. With the UnionOptimizer, the union is removed and two vertex groups are created. If this is disallowed we will have to reuse the same Vertex group to route multiple outputs. GroupInputEdge.create(VertexGroup inputVertexGroup, Vertex outputVertex, EdgeProperty edgeProperty, InputDescriptor mergedInput) API seem to allow that. Will doing that work and that is how you want us to construct the plan? Consider another case of union followed by replicate join with two tables followed by order by. The plan will consist of 8 vertices - V1 (Load) + V2 (Load) + V3 (union) + V4a (Replicate join T1 load) + V4b (Replicate join T2 load) + V5 (partitioner) + V6 (sampler) + V7 (order by) with V1,V2->V3, V4a->V3, V4b->V3, V4->V5, V4->V6, V6->V5, V5->V7. Optimized plan will become V4a -> (V1,V2 vertex group) , V4b -> (V1,V2 vertex group) , (V1,V2 vertex group) -> V5, (V1,V2 vertex group) -> V6, V6->V5, V5->V7. So using one vertex group for routing multiple outputs and multiple inputs is how we are expected to construct the plan? > VertexGroup name should be unqiue > - > > Key: TEZ-2221 > URL: https://issues.apache.org/jira/browse/TEZ-2221 > Project: Apache Tez > Issue Type: Bug >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Fix For: 0.7.0, 0.5.4, 0.6.1 > > Attachments: TEZ-2221-1.patch, TEZ-2221-2.patch, TEZ-2221-3.patch, > TEZ-2221-4.patch > > > VertexGroupCommitStartedEvent & VertexGroupCommitFinishedEvent use vertex > group name to identify the vertex group commit, the same name of vertex group > will conflict. 
While in the current equals & hashCode of VertexGroup, vertex > group name and members name are used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2221) VertexGroup name should be unqiue
[ https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527479#comment-14527479 ] Rohini Palaniswamy commented on TEZ-2221:
------------------------------------------

bq. dag.createVertexGroup("group_1", v1,v2); dag.createVertexGroup("group_2", v1,v2);
It should be a simple change for us to reuse the vertex group. But since we have never used it that way, we want to ensure that Tez will be fine if we construct plans like that.

bq. dag.createVertexGroup("group_1", v1,v2); dag.createVertexGroup("group_1", v2,v3);
We are not reusing group names anywhere, so that is not an issue for us.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2221) VertexGroup name should be unqiue
[ https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528835#comment-14528835 ] Rohini Palaniswamy commented on TEZ-2221:
------------------------------------------

bq. Since case 1 (must to have) impact the pig and pig don't use case 2, why not keep this patch ?
We don't use case 1. We use case 2.

bq. This looks more like hack or workaround for multiple edges. If we need to support multiple edges, may need to create more elegant API.
In my opinion, having different vertex groups for different outputs or inputs is cleaner and gives better control. It is more logical as well and easier to visualize. Routing multiple inputs and outputs through one vertex group is actually very complicated and messy. Our query plan construction is also simpler when using different vertex groups for different outputs. If only a unique vertex group were allowed, our plan was to leave query planning untouched and, during DAG construction, find the duplicates and reuse the VertexGroup.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
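If only unique vertex groups were allowed, the DAG-construction-time deduplication described above could look roughly like the sketch below. The `VertexGroup` and `DagBuilder` types here are simplified stand-ins invented for illustration, not the real Tez API classes:

```java
import java.util.*;

// Simplified stand-in for the Tez VertexGroup, for illustration only.
class VertexGroup {
    final String name;
    final Set<String> members;
    VertexGroup(String name, Set<String> members) {
        this.name = name;
        this.members = members;
    }
}

// Hypothetical builder that finds duplicate groups at DAG-construction time
// and reuses the existing VertexGroup instead of creating a conflicting one.
class DagBuilder {
    // Cache keyed by the member set, so groups with identical members are shared.
    private final Map<Set<String>, VertexGroup> groupsByMembers = new HashMap<>();

    VertexGroup getOrCreateVertexGroup(String name, String... vertices) {
        Set<String> members = new HashSet<>(Arrays.asList(vertices));
        return groupsByMembers.computeIfAbsent(members,
                m -> new VertexGroup(name, m));
    }
}
```

With this, two `createVertexGroup` calls over the same vertices resolve to one group object, so query planning can stay unchanged while the DAG only ever registers a single group per member set.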
[jira] [Commented] (TEZ-2221) VertexGroup name should be unqiue
[ https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528839#comment-14528839 ] Rohini Palaniswamy commented on TEZ-2221:
------------------------------------------

bq. Unless there is a technical reason to not support v1,v2 in multiple vertex groups simultaneously, we should support it.
+1 for this. If there was no concrete reason for doing it, let's please revert it. To be clear, I am only asking to revert the restriction on duplicate groups with different group names. It is good to disallow groups with the same group name, as the name is the identifier and should be unique.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2221) VertexGroup name should be unqiue
[ https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532993#comment-14532993 ] Rohini Palaniswamy commented on TEZ-2221:
------------------------------------------

https://issues.apache.org/jira/secure/attachment/12730678/TEZ-2221-5-revert.patch looks good. +1 for that patch.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated TEZ-2300:
------------------------------------
    Target Version/s: 0.7.1

> TezClient.stop() takes a lot of time or does not work sometimes
> ---------------------------------------------------------------
>
>                 Key: TEZ-2300
>                 URL: https://issues.apache.org/jira/browse/TEZ-2300
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>         Attachments: syslog_dag_1428329756093_325099_1_post
>
> Noticed this with a couple of pig scripts which were not behaving well (AM close to OOM, etc) and even with some that were running fine. Pig calls TezClient.stop() in a shutdown hook. Ctrl+C to the pig script either exits immediately or hangs. In both cases it takes a long time for the yarn application to go to the KILLED state. Many times I just end up calling yarn application -kill separately after waiting 5 minutes or more for it to get killed.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2491) Optimize storage and exchange of Counters for better scaling
Rohini Palaniswamy created TEZ-2491:
------------------------------------
             Summary: Optimize storage and exchange of Counters for better scaling
                 Key: TEZ-2491
                 URL: https://issues.apache.org/jira/browse/TEZ-2491
             Project: Apache Tez
          Issue Type: Task
            Reporter: Rohini Palaniswamy

Counters take up a lot of space in the task events generated and are a major bottleneck for scaling ATS. [~jlowe] found a lot of potential optimizations. Using this as an umbrella jira for documentation; sub-tasks can be created later after discussions.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2491) Optimize storage and exchange of Counters for better scaling
[ https://issues.apache.org/jira/browse/TEZ-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561283#comment-14561283 ] Rohini Palaniswamy commented on TEZ-2491:
------------------------------------------

[~jlowe] is working on storing Tez events to disk and parsing them like the MapReduce JHS, as ATS does not scale for us with direct publishing from AMs. Capturing his thoughts below on some of the possible fixes, from the discussion we had on the size analysis he did after logging the events to a file.

Analysis:
- A 40K+ task DAG created a file of size 517.8MB. We suspected configuration was taking up a lot of space, but it was only 6MB. Task events had taken up 499MB.
- 426MB of the 499MB are finished events, half of which are attempt finished events. So counters being logged twice accounts for most of it.

Possible fixes:
- There are some odd counters. "WRONG_REDUCE", "WRONG_MAP", etc. seem like counters that should never be non-zero in practice, so it is sort of a waste to emit them over and over. They _could_ occur, but it seems too rare to bother dedicating a counter to those cases. It would be nice to omit zero counters: looking up a non-existent counter returns zero anyway, so why store it explicitly? This could break code that iterates over counter groups instead of doing direct counter lookups. For example, Pig iterates over counter groups for its RANK implementation in mapreduce, but applications should handle missing counters anyway, since empty maps and reducers do not produce counters in mapreduce. So that should not be an issue. Alternatively, we can omit counters while the task is running and send them only on completion (succeeded, failed, killed), in case they are still required for something like counter drill-down in the UI or later analytics of the jobs themselves.
- Counter display names take a lot of space:
{code}
{'counterDisplayName': 'BAD_ID', 'counterName': 'BAD_ID', 'counterValue': 0},
{code}
We can omit the display name when it is the same as the name; that will require UI changes. Better still would be to store the display names only once per app for all counters.

Reducing the counter size will also reduce memory usage on the AM and allow it to process task events faster.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
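The two reductions discussed above - dropping zero-valued counters and omitting display names identical to the counter name - can be sketched as a pre-serialization filter. The `Entry` shape below is an illustration only, not Tez's actual counter wire format:

```java
import java.util.*;

class CounterCompactor {
    // One serialized counter entry; displayName is null when it matches name,
    // meaning it would be omitted on the wire.
    static class Entry {
        final String name;
        final String displayName;
        final long value;
        Entry(String name, String displayName, long value) {
            this.name = name;
            this.displayName = displayName;
            this.value = value;
        }
    }

    // Drops zero-valued counters (a missing counter reads as zero anyway)
    // and nulls out display names identical to the counter name.
    static List<Entry> compact(List<Entry> counters) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : counters) {
            if (e.value == 0) {
                continue; // e.g. WRONG_MAP=0, WRONG_REDUCE=0 never emitted
            }
            String display = (e.displayName != null
                    && e.displayName.equals(e.name)) ? null : e.displayName;
            out.add(new Entry(e.name, display, e.value));
        }
        return out;
    }
}
```

Consumers then treat a missing counter as zero and a missing display name as equal to the counter name, which is what makes both omissions lossless.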
[jira] [Commented] (TEZ-2537) mapreduce.map.env and mapreduce.reduce.env need to fall back to mapred.child.env for compatibility
[ https://issues.apache.org/jira/browse/TEZ-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571792#comment-14571792 ] Rohini Palaniswamy commented on TEZ-2537: - +1 > mapreduce.map.env and mapreduce.reduce.env need to fall back to > mapred.child.env for compatibility > -- > > Key: TEZ-2537 > URL: https://issues.apache.org/jira/browse/TEZ-2537 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles > Attachments: TEZ-2537.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2278) Tez UI start/end time and duration shown are wrong for tasks
[ https://issues.apache.org/jira/browse/TEZ-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584304#comment-14584304 ] Rohini Palaniswamy commented on TEZ-2278:
------------------------------------------

[~hitesh], Could this be looked into for Tez 0.7.1?

> Tez UI start/end time and duration shown are wrong for tasks
> ------------------------------------------------------------
>
>                 Key: TEZ-2278
>                 URL: https://issues.apache.org/jira/browse/TEZ-2278
>             Project: Apache Tez
>          Issue Type: Bug
>          Components: UI
>    Affects Versions: 0.6.0
>            Reporter: Rohini Palaniswamy
>         Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png
>
> Observing a lot of time discrepancies between the vertex, task and swimlane views.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013138#comment-16013138 ] Rohini Palaniswamy commented on TEZ-3391:
------------------------------------------

Doing these two things will save a couple of millis in each map vertex:
1) Moving the validation checks to the AM.
2) In the vertex, construct TaskSplitMetaInfo only for the split of that task instead of constructing it for all splits. i.e. change
public static TaskSplitMetaInfo[] readSplitMetaInfo(Configuration conf, FileSystem fs)
to
public static TaskSplitMetaInfo getSplitMetaInfo(Configuration conf, FileSystem fs, int index)
and skip reading splits beyond the index. If there are 1000 splits, the first task will read 1 split, the second task will read 2 splits and so on, instead of each task reading all 1000 splits as happens now.

SplitMetaInfoReaderTez.java
{code}
try {
  JobSplit.SplitMetaInfo splitMetaInfo = new JobSplit.SplitMetaInfo();
  for (int i = 0; i < numSplits; i++) {
    splitMetaInfo.readFields(in);
    if (i == index) {
      return new JobSplit.TaskSplitMetaInfo(splitIndex,
          splitMetaInfo.getLocations(), splitMetaInfo.getInputDataLength());
    }
  }
} finally {
  in.close();
}
{code}

> MR split file validation should be done in the AM
> -------------------------------------------------
>
>                 Key: TEZ-3391
>                 URL: https://issues.apache.org/jira/browse/TEZ-3391
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>
> We had a case where the split metadata size exceeded 1000. Instead of the job failing from validation during initialization in the AM, as in mapreduce, each of the tasks failed doing that validation during initialization.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
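The "read only up to the requested index" idea can be demonstrated standalone with length-prefixed records. The record format and helper names below are invented for the demo; the real SplitMetaInfo uses Hadoop Writable serialization:

```java
import java.io.*;

class SplitMetaDemo {
    // Serializes records as a stream of length-prefixed UTF strings
    // (a stand-in for the split meta info file).
    static byte[] records(String... recs) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream dos = new DataOutputStream(bos);
            for (String r : recs) {
                dos.writeUTF(r);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads sequentially and stops as soon as `index` is reached: earlier
    // records are read past, records after `index` are never touched.
    static String readRecordAt(byte[] data, int index) {
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(data))) {
            for (int i = 0; i <= index; i++) {
                String record = in.readUTF();
                if (i == index) {
                    return record;
                }
            }
            throw new IllegalStateException("unreachable");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This mirrors the proposed loop: task k pays for reading k+1 records instead of all of them, which is where the per-vertex savings come from.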
[jira] [Commented] (TEZ-3734) Remove config checks in Input/Output.
[ https://issues.apache.org/jira/browse/TEZ-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025327#comment-16025327 ] Rohini Palaniswamy commented on TEZ-3734: - bq. Will this affect the message payload in a bad way? Pig has not been using the config builders and it also does not filter out any configs like the builder. So it should not make a difference for Pig. > Remove config checks in Input/Output. > - > > Key: TEZ-3734 > URL: https://issues.apache.org/jira/browse/TEZ-3734 > Project: Apache Tez > Issue Type: Bug >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: TEZ-3734.01.patch > > > The configs in TezRuntimeConfiguration are not propagated if its not in > Input/Output checks, remove the checks and propagate all of > TezRuntimConfiguration to I/Os. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-394) Better scheduling for uneven DAGs
[ https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030718#comment-16030718 ] Rohini Palaniswamy commented on TEZ-394: bq. Is the intent to make V6 low priority? No. Intent is for V6 to have higher priority (same as V1). bq. For that using current approach of distance from root or distance from leaf, both would give V6 high priority Distance from root would have both V1 and V6 at higher priority as the distance from root is 0. Distance from leaf did not. V1's distance from leaf was 3, while V6's distance from leaf was 1. So V1 had higher priority, while V6 had priority similar to V4. Since that is not correct, Jason changed the logic. > Better scheduling for uneven DAGs > - > > Key: TEZ-394 > URL: https://issues.apache.org/jira/browse/TEZ-394 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Rohini Palaniswamy >Assignee: Jason Lowe > Attachments: TEZ-394.001.patch, TEZ-394.002.patch, TEZ-394.003.patch > > > Consider a series of joins or group by on dataset A with few datasets that > takes 10 hours followed by a final join with a dataset X. The vertex that > loads dataset X will be one of the top vertexes and initialized early even > though its output is not consumed till the end after 10 hours. > 1) Could either use delayed start logic for better resource allocation > 2) Else if they are started upfront, need to handle failure/recovery cases > where the nodes which executed the MapTask might have gone down when the > final join happens. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
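The distance-from-root ordering discussed above can be made concrete with a small BFS over a hypothetical DAG shaped like the example (V1 and V6 both roots, V1 three hops from its leaf, V6 one hop). This illustrates the metric only, not the scheduler's actual code:

```java
import java.util.*;

class VertexPriority {
    // Computes distance from the roots via topological BFS: roots get 0,
    // and each vertex gets the longest root-to-vertex path length.
    // `edges` maps vertex -> downstream vertices.
    static Map<String, Integer> distanceFromRoot(Map<String, List<String>> edges,
                                                 Set<String> vertices) {
        Map<String, Integer> indegree = new HashMap<>();
        for (String v : vertices) indegree.put(v, 0);
        for (List<String> outs : edges.values())
            for (String w : outs) indegree.merge(w, 1, Integer::sum);

        Deque<String> queue = new ArrayDeque<>();
        Map<String, Integer> dist = new HashMap<>();
        for (String v : vertices)
            if (indegree.get(v) == 0) { dist.put(v, 0); queue.add(v); }

        while (!queue.isEmpty()) {
            String v = queue.poll();
            for (String w : edges.getOrDefault(v, Collections.emptyList())) {
                // Keep the maximum distance seen over all incoming edges.
                dist.merge(w, dist.get(v) + 1, Math::max);
                if (indegree.merge(w, -1, Integer::sum) == 0) queue.add(w);
            }
        }
        return dist;
    }
}
```

With this metric both V1 and V6 come out at distance 0 and hence the same (highest) priority, whereas distance from leaf would have ranked V6 near its downstream vertices.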
[jira] [Commented] (TEZ-394) Better scheduling for uneven DAGs
[ https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052192#comment-16052192 ] Rohini Palaniswamy commented on TEZ-394:
-----------------------------------------

[~jlowe], [~gopalv] was mentioning at the Summit that he tried the patch, as Hive also has the same problem. But in his case he was running into a lot more preemption of tasks, which was unnecessary and wasteful. [~gopalv], could you elaborate on the issue?

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-394) Better scheduling for uneven DAGs
[ https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052232#comment-16052232 ] Rohini Palaniswamy commented on TEZ-394:
-----------------------------------------

Yes. That was the issue [~gopalv] described.

bq. I have a new task scheduler that's in the works that fixes that (among other things), and I hope to have it posted soon.
Is it the work you are doing as part of TEZ-3535?

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3775) Tez UI: Show DAG context in document title
[ https://issues.apache.org/jira/browse/TEZ-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065165#comment-16065165 ] Rohini Palaniswamy commented on TEZ-3775:
------------------------------------------

Noticed two things different from 0.7:
- We are replacing the : separator in 0.7 with - in 0.9. Not a big deal.
- 0.7 did not have Details/Counters/Tasks/Attempts in the title for vertex, task and attempt pages. This patch adds it, and that is good.

Just a couple of minor comments:
1) Application Configurations -> Application Configuration
2) Vertex Attempts -> Task Attempts or Vertex Task Attempts

> Tez UI: Show DAG context in document title
> ------------------------------------------
>
>                 Key: TEZ-3775
>                 URL: https://issues.apache.org/jira/browse/TEZ-3775
>             Project: Apache Tez
>          Issue Type: Bug
>          Components: UI
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3775.1.patch
>
> In Tez UI 0.7, DAG (vertex, app, task, attempt) context was shown in the document title. This was lost in the 0.9 UI migration. This jira attempts to bring that feature back. This feature is essential when supporting large clusters, where a dev or support person may have dozens of dags open at the same time. Having context in the document title (the tab title) will allow us to quickly navigate.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3775) Tez UI: Show DAG context in document title
[ https://issues.apache.org/jira/browse/TEZ-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065589#comment-16065589 ] Rohini Palaniswamy commented on TEZ-3775:
------------------------------------------

A couple more comments:
1) "Tez UI" in each title seems redundant. I have to increase the tab width more than before (using https://addons.mozilla.org/en-US/firefox/addon/tree-style-tab/) to view the full title. If we are removing "Tez UI:", can we change - to : in 0.9 as well? It is consistent when dealing with multiple versions and easier on eyes like mine with OCD.
2) "Details" can also be removed for the same reason. DAG/Vertex/Task/Task Attempt is good enough for the details page.
3) 0.9 shows the vertex title as "Tez UI: Vertex Details - vertex_1492628984747_2502482_7_00". 0.7 shows it as "Vertex: File Merge - vertex_1492628984747_2502482_7_00", where "File Merge" is the name of the vertex in this hive DAG. In Pig DAGs you will have scope- in there. The name of the vertex is the most useful info; that needs to be added back.

Thanks for fixing this. I have been going crazy without it when opening lots of tabs and struggling to switch between them without a clue.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3212) IFile throws NegativeArraySizeException for value sizes between 1GB and 2GB
[ https://issues.apache.org/jira/browse/TEZ-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093815#comment-16093815 ] Rohini Palaniswamy commented on TEZ-3212:
------------------------------------------

Based on https://stackoverflow.com/questions/3038392/do-java-arrays-have-a-maximum-size we might want to set Integer.MAX_VALUE - 8 as the max array size to support more JVMs.

bq. In such instances (i.e 2 GB), should valBytes allocation be restricted only to current key/value length? (i.e Math.min(currentKeyLength, MAX_ARRAY_LENGTH)
I like this idea, as it might save a lot of space. But instead of doing Math.min, we should throw an error if the data size exceeds the max array size limit. If we don't do that, we might be losing data. i.e.
{code}
if (currentValueLength > MAX_ARRAY_LENGTH) {
  throw new NegativeArraySizeException("Size of data " + currentValueLength
      + " is greater than java maximum byte array size");
}
int newLength = currentValueLength << 1;
if (newLength < 0) {
  newLength = currentValueLength;
}
{code}

> IFile throws NegativeArraySizeException for value sizes between 1GB and 2GB
> ---------------------------------------------------------------------------
>
>                 Key: TEZ-3212
>                 URL: https://issues.apache.org/jira/browse/TEZ-3212
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3212.1.patch
>
> This is not a regression with respect to MR, just an issue that was encountered with a job whose IFile record values (which can be of max size 2GB) can be successfully written but not successfully read.
> Failure while running task: java.lang.NegativeArraySizeException
>         at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawValue(IFile.java:765)

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
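The guard sketched in the comment above can be fleshed out into a compilable helper. The class name, method name and `MAX_ARRAY_LENGTH` constant are illustrative; the real fix belongs in IFile's value-buffer handling:

```java
class BufferGrowth {
    // Largest array size that is safe across JVMs; some VMs reserve a few
    // header words, so allocating Integer.MAX_VALUE itself can fail.
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    // Returns the buffer length for a value of `currentValueLength` bytes:
    // double for amortized growth, fall back to the exact length when doubling
    // overflows, and reject sizes no byte array can hold (instead of silently
    // capping, which could lose data).
    static int grow(int currentValueLength) {
        if (currentValueLength > MAX_ARRAY_LENGTH) {
            throw new NegativeArraySizeException("Size of data "
                + currentValueLength
                + " is greater than the maximum Java byte array size");
        }
        int newLength = currentValueLength << 1;
        if (newLength < 0 || newLength > MAX_ARRAY_LENGTH) {
            newLength = currentValueLength; // doubling overflowed; use exact size
        }
        return newLength;
    }
}
```

For a 1GB value the left shift wraps negative, so the helper allocates exactly the value's length rather than throwing NegativeArraySizeException, which is the failure mode the jira describes for values between 1GB and 2GB.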
[jira] [Commented] (TEZ-3753) Improve error message in IFile for buffer length overflow
[ https://issues.apache.org/jira/browse/TEZ-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096866#comment-16096866 ] Rohini Palaniswamy commented on TEZ-3753:
------------------------------------------

Why don't we put the patch in TEZ-3212 and mark this one as a duplicate? That jira has most of the conversation history.

> Improve error message in IFile for buffer length overflow
> ---------------------------------------------------------
>
>                 Key: TEZ-3753
>                 URL: https://issues.apache.org/jira/browse/TEZ-3753
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.1
>            Reporter: Muhammad Samir Khan
>            Assignee: Muhammad Samir Khan
>         Attachments: tez-3753.001.patch, tez-3753.002.patch
>
> When a record size is too big and the byte array doubling expansion crosses 2G, the array size overflows and becomes negative. It would be good to fail in those code paths saying the record is too big, so that the error is easy for users to understand.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3753) Improve error message in IFile for buffer length overflow
[ https://issues.apache.org/jira/browse/TEZ-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096875#comment-16096875 ] Rohini Palaniswamy commented on TEZ-3753:
------------------------------------------

Also, per [~rajesh.balamohan]'s suggestion in TEZ-3212, it would be preferable to have newLength = currentValueLength instead of newLength = MAX_BUFFER_SIZE;

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3159) Reduce memory utilization while serializing keys and values
[ https://issues.apache.org/jira/browse/TEZ-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125983#comment-16125983 ] Rohini Palaniswamy commented on TEZ-3159:
------------------------------------------

General comments:
1) We need an equivalent class for DataInputBuffer as well.

FlexibleByteArrayOutputStream.java:
1) MAX_BUFFER_SIZE should not be required anymore after this patch (once DataInputBuffer also has multiple buffers). Can you rename the class to UnboundedDataOutputBuffer and extend DataOutputStream instead of OutputStream?
2) Have been giving this some thought, and I think instead of 64MB we should have either 1MB or 2MB as DEFAULT_CAPACITY_OF_SINGLE_BUFFER. And only the first buffer should grow from 32 bytes up to the capacity; subsequent buffers should just be created with the full capacity. It will be more efficient that way.

IFile.java:
1) I find it confusing to see keyData being passed as null to writeKVPair for the writeValue methods. It would be cleaner to have separate methods as before, instead of checking for null and having multiple branches for useRLE in the writeKVPair method.

> Reduce memory utilization while serializing keys and values
> -----------------------------------------------------------
>
>                 Key: TEZ-3159
>                 URL: https://issues.apache.org/jira/browse/TEZ-3159
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Muhammad Samir Khan
>         Attachments: TEZ-3159.001.patch, TEZ-3159.002.patch, TEZ-3159.003.patch
>
> Currently DataOutputBuffer is used for serializing. The underlying buffer keeps doubling in size when it reaches capacity. In some of the Pig scripts which serialize big bags, we end up with OOM in Tez as there is no space to double the array size. Mapreduce mode runs fine in those cases with a 1G heap. The scenarios are:
> - When the combiner runs in the reducer and some of the fields after combining are still big bags (for eg: distinct). Currently with mapreduce the combiner does not run in the reducer - MAPREDUCE-5221. Since input sort buffers hold a good amount of memory at that time, it can easily go OOM.
> - While serializing output with bags when there are multiple inputs and outputs and the sort buffers for those take up space.
> It is a pain especially after the buffer size hits 128MB. Doubling at 128MB will require 128MB (existing array) + 256MB (new array). Any doubling after that requires even more space. But most of the time the data is probably not going to fill up that 256MB, leading to wastage.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3159) Reduce memory utilization while serializing keys and values
[ https://issues.apache.org/jira/browse/TEZ-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126007#comment-16126007 ] Rohini Palaniswamy commented on TEZ-3159:
------------------------------------------

Actually, my comment on MAX_BUFFER_SIZE may not be right. There might be other places where the primitive array length limitation comes into the picture, so it is better to keep that as the max size and not rename the class to UnboundedDataOutputBuffer. You should rename it to FlexibleDataOutputBuffer though.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3159) Reduce memory utilization while serializing keys and values
[ https://issues.apache.org/jira/browse/TEZ-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126065#comment-16126065 ] Rohini Palaniswamy commented on TEZ-3159: - Some more comments: Just to be clear, when you implement FlexibleDataInputBuffer, instead of reading into a byte array and using that byte array after reset as is currently done {code} int i = readData(valBytes, 0, currentValueLength); if (i != currentValueLength) { throw new IOException(String.format(INCOMPLETE_READ, currentValueLength, i)); } value.reset(valBytes, currentValueLength); {code} you will have to read directly into the FlexibleDataInputBuffer. FlexibleByteArrayOutputStream.java: 1) You should keep a totalCount as well, which will be the total number of bytes written, and return that in size(). Also ensure that it is always less than MAX_BUFFER_SIZE while writing the data itself. 2) Nitpick. Place the default empty constructor first. 3) Rename DEFAULT_CAPACITY_OF_SINGLE_BUFFER and DEFAULT_INITIAL_SIZE_OF_SINGLE_BUFFER to BUFFER_SIZE_DEFAULT and BUFFER_INITIAL_SIZE_DEFAULT to be consistent with the naming convention we generally use. 4) Code should get simplified a lot if we double the buffer size from 32 bytes up to the full buffer capacity only for the first buffer, and create the subsequent buffers at the full buffer capacity. Keep separate code paths for them (index == 0) so that the code is simpler and easier to read. > Reduce memory utilization while serializing keys and values > --- > > Key: TEZ-3159 > URL: https://issues.apache.org/jira/browse/TEZ-3159 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Muhammad Samir Khan > Attachments: TEZ-3159.001.patch, TEZ-3159.002.patch, > TEZ-3159.003.patch > > > Currently DataOutputBuffer is used for serializing. The underlying buffer > keeps doubling in size when it reaches capacity. 
In some of the Pig scripts > which serialize big bags, we end up with OOM in Tez as there is no space to > double the array size. Mapreduce mode runs fine in those cases with 1G heap. > The scenarios are > - When combiner runs in reducer and some of the fields after combining > are still big bags (For eg: distinct). Currently with mapreduce combiner does > not run in reducer - MAPREDUCE-5221. Since input sort buffers hold good > amount of memory at that time it can easily go OOM. >- While serializing output with bags when there are multiple inputs and > outputs and the sort buffers for those take up space. > It is a pain especially after buffer size hits 128MB. Doubling at 128MB will > require 128MB (existing array) +256MB (new array). Any doubling after that > requires even more space. But most of the time the data is probably not going > to fill up that 256MB leading to wastage. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
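The growth strategy asked for in review comment 4 — double only the first buffer from a small initial size up to the chunk capacity, then allocate every later buffer directly at full capacity — can be sketched as follows. Class and constant names follow the comment; the body is a hypothetical illustration, not the actual TEZ-3159 patch:

```java
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Sketch of the chunked growth strategy from the review comments: the first
// buffer doubles from 32 bytes up to the chunk capacity; every subsequent
// buffer is allocated at full capacity, so large streams never copy.
public class FlexibleByteArrayOutputStream extends OutputStream {
    static final int BUFFER_INITIAL_SIZE_DEFAULT = 32;
    static final int BUFFER_SIZE_DEFAULT = 1 << 20; // 1MB chunk capacity (assumed)

    private final List<byte[]> buffers = new ArrayList<>();
    private int countInCurrent = 0; // bytes used in the last buffer
    private long totalCount = 0;    // total bytes written, returned by size()

    public FlexibleByteArrayOutputStream() {
        buffers.add(new byte[BUFFER_INITIAL_SIZE_DEFAULT]);
    }

    @Override
    public void write(int b) {
        byte[] current = buffers.get(buffers.size() - 1);
        if (countInCurrent == current.length) {
            if (buffers.size() == 1 && current.length < BUFFER_SIZE_DEFAULT) {
                // First buffer only: double (with a copy) until full chunk size.
                byte[] bigger = new byte[Math.min(current.length * 2, BUFFER_SIZE_DEFAULT)];
                System.arraycopy(current, 0, bigger, 0, countInCurrent);
                buffers.set(0, bigger);
                current = bigger;
            } else {
                // Later buffers: allocate at full capacity, no copying.
                current = new byte[BUFFER_SIZE_DEFAULT];
                buffers.add(current);
                countInCurrent = 0;
            }
        }
        current[countInCurrent++] = (byte) b;
        totalCount++;
    }

    public long size() {
        return totalCount;
    }

    public static void main(String[] args) {
        FlexibleByteArrayOutputStream out = new FlexibleByteArrayOutputStream();
        for (int i = 0; i < 100000; i++) out.write(i & 0xFF);
        System.out.println(out.size()); // prints 100000
    }
}
```

A real implementation would also enforce the MAX_BUFFER_SIZE check from comment 1 on every write; it is omitted here for brevity.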
[jira] [Commented] (TEZ-3385) DAGClient API should be accessible outside of DAG submission
[ https://issues.apache.org/jira/browse/TEZ-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150750#comment-16150750 ] Rohini Palaniswamy commented on TEZ-3385: - bq. Is this only for DAGClient, or for TezClient only Both should be accessible. But created this jira for DAGClient. > DAGClient API should be accessible outside of DAG submission > > > Key: TEZ-3385 > URL: https://issues.apache.org/jira/browse/TEZ-3385 > Project: Apache Tez > Issue Type: New Feature >Reporter: Rohini Palaniswamy > > In PIG-4958, I had to resort to > DAGClient client = new DAGClientImpl(appId, dagID, new > TezConfiguration(conf), null); > This is not good as DAGClientImpl is a internal class and not something users > should be referring to. Tez needs to have an API to give DAGClient given the > appId, dagID and configuration. This is something basic like > JobClient.getJob(String jobID). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
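A rough sketch of the public API this jira asks for, mirroring JobClient.getJob(String jobID). The DAGClient type here is a stand-in stub and getDAGClient() is an invented name — the real fix may look different; the point is only that callers pass appId, dagId and conf and never touch DAGClientImpl:

```java
// Hypothetical shape of the API requested in TEZ-3385. DAGClient is a
// stand-in stub and getDAGClient() is an invented factory name; the real
// Tez API may differ. The factory hides the internal DAGClientImpl.
public class DAGClientFactorySketch {
    // Minimal stand-in for the real org.apache.tez.dag.api.client.DAGClient.
    interface DAGClient { String getDagIdentifier(); }

    static DAGClient getDAGClient(String appId, String dagId, Object conf) {
        // A real implementation would construct the internal client here,
        // so users never reference DAGClientImpl directly.
        return () -> appId + ":" + dagId;
    }

    public static void main(String[] args) {
        DAGClient client = getDAGClient("application_1_0001", "dag_1", null);
        System.out.println(client.getDagIdentifier()); // prints application_1_0001:dag_1
    }
}
```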
[jira] [Commented] (TEZ-3831) Reduce Unordered memory needed for storing empty completed events
[ https://issues.apache.org/jira/browse/TEZ-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165368#comment-16165368 ] Rohini Palaniswamy commented on TEZ-3831: - +1. Looks good to me > Reduce Unordered memory needed for storing empty completed events > - > > Key: TEZ-3831 > URL: https://issues.apache.org/jira/browse/TEZ-3831 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: Screen Shot 2017-09-13 at 4.55.11 PM.png, > TEZ-3831.001.patch > > > the completedInputs blocking queue is used to store inputs for the > UnorderedKVReader to consume. With Auto-reduce parallelism enabled and nearly > all empty inputs, the reader can't prune the empty events from the blocking > queue fast enough to keep up. In my scenario, an OOM occurred. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (TEZ-3865) A new vertex manager to partition data for STORE
Rohini Palaniswamy created TEZ-3865: --- Summary: A new vertex manager to partition data for STORE Key: TEZ-3865 URL: https://issues.apache.org/jira/browse/TEZ-3865 Project: Apache Tez Issue Type: New Feature Reporter: Rohini Palaniswamy Restricting the number of files in output is a very common use case. In Pig, users currently add an ORDER BY, GROUP BY or DISTINCT with the required parallelism before STORE to achieve it. All of the above operations create unnecessary overhead in processing. It would be ideal if the STORE clause supported the PARALLEL statement and the partitioning of data was handled in a simpler and more efficient manner. Partitioning of the data can be achieved using a very efficient vertex manager as described below. Going to call it PartitionVertexManager (PVM) for now till someone proposes a better name. Will be explaining using Pig examples, but the logic is the same for Hive as well. There are multiple cases to consider when storing 1) No partitions - Data is stored into a single directory using FileOutputFormat implementations 2) Partitions - Data is stored into multiple partitions. Case of static or dynamic partitioning with HCat 3) HBase I have kind of forgotten what exactly my thoughts were on this when storing to multiple regions. Will update once I remember. Let us consider the below script with pig.exec.bytes.per.reducer (this setting is usually translated to tez.shuffle-vertex-manager.desired-task-input-size with ShuffleVertexManager) set to 1G. {code} A = LOAD 'data' ; B = GROUP A BY $0 PARALLEL 1000; C = FOREACH B GENERATE group, COUNT(A.a), SUM(A.b), ..; D = STORE C into 'output' using SomeStoreFunc() PARALLEL 20; {code} The implementation will have 3 vertices. v1 - LOAD vertex v2 - GROUP BY vertex v3 - STORE vertex PVM will be used on v3. It is going to be similar to ShuffleVertexManager but with some differences. 
The main difference is that the source vertex does not care about the parallelism of the destination vertex, and the number of partitioned outputs it produces does not depend on that. 1) Case of no partitions Each task in vertex v2 will produce a single partition output (no Partitioner is required). The PVM will bucket this single partition data from 1000 source tasks into multiple destination tasks of v3, trying to keep 1G per task but a max of 20 tasks (auto parallelism). 2) Partitions Let us say the table has 2 partition keys (dt and region). Since there could be any number of regions for a given date, we will use store parallelism as the upper limit on the number of partitions. i.e. a HashPartitioner with numReduceTasks as 20 and (dt, region) as the partition key. If there are only 5 regions then each task of v2 will produce 5 partitions (with the remaining 15 being empty) if there is no hash collision. If there are 30 regions, then each task of v2 will produce 20 partitions. The PVM when it groups will try to group all Partition0 segments as much as possible into one v3 task. Based on skew it could end up in more tasks. i.e. there is no restriction on one partition going to the same reducer task. Doing this will avoid having to open multiple ORC files in one task when doing dynamic partitioning and will be very efficient, reducing namespace usage even further while keeping file sizes more uniform. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
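The bucketing described for the no-partitions case — pack 1000 single-partition source outputs into destination tasks of roughly the desired input size, capped at the store parallelism — can be sketched as a simple packing loop. This is illustrative logic under assumed inputs, not the actual vertex manager:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the PVM grouping for the no-partitions case: bucket the
// single-partition outputs of the source tasks into destination tasks,
// targeting desiredInputSize bytes per task but never exceeding maxTasks.
public class PartitionGrouper {
    static List<List<Integer>> group(long[] outputSizes, long desiredInputSize, int maxTasks) {
        List<List<Integer>> tasks = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentBytes = 0;
        for (int i = 0; i < outputSizes.length; i++) {
            // Start a new destination task when the size target is exceeded,
            // unless we are already on the last allowed task.
            if (!current.isEmpty() && currentBytes + outputSizes[i] > desiredInputSize
                    && tasks.size() < maxTasks - 1) {
                tasks.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(i);
            currentBytes += outputSizes[i];
        }
        if (!current.isEmpty()) tasks.add(current);
        return tasks;
    }

    public static void main(String[] args) {
        // 8 source outputs of 512MB each, 1G per task desired, max 3 tasks:
        long[] sizes = new long[8];
        Arrays.fill(sizes, 512L << 20);
        System.out.println(group(sizes, 1L << 30, 3).size()); // prints 3
    }
}
```

With the cap removed (say maxTasks of 20), the same inputs pack two 512MB outputs per task into 4 tasks of 1G each.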
[jira] [Commented] (TEZ-3865) A new vertex manager to partition data for STORE
[ https://issues.apache.org/jira/browse/TEZ-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252543#comment-16252543 ] Rohini Palaniswamy commented on TEZ-3865: - Just read through the jira description and design doc. Thanks for catching the similarity [~gopalv]. {code} The P10 belonging to the first 10 mappers will be processed by one reducer with 1GB input data. The P10 belonging to the second 10 mappers will be processed by another reducer, etc. ● Each task B only processes one partition’s data. We might want to consider having one task B to process several empty or small partitions together later, similar to how auto-parallelism works. But for now, the goal is to cut down the run time of longest running tasks, less about reducing the total number of tasks. ● Each task B only processes up to a max of input size defined by a config similar to tez.shuffle-vertex-manager.desired-task-input-size. For example, if it is defined to be 1G, and the output size of each of {A1P2, A2P2, A3P2, A4P2} is 0.5G, then it will have one task process {A1P2, A2P2} and another task process {A3P2, A4P2}. {code} The idea mentioned to distribute skewed partitions is exactly the same as what I have described for the case of partitions. So we can combine this into the FairVertexManager. But there are additional things required for this jira. Items mentioned below will have to be added as enhancements (configurable options) to FVM. 1) Will need the ability for each task to process more than one partition similar to auto-parallelism though. It is currently listed as an open issue and future work in FVM. 2) For the case of no partition, probably can just treat it as only P0 has data and other partitions are empty. But this will not be as efficient with the unordered writer as having no partition at all (no spill). So will need to add explicit support for the case of no partition. 
i.e. the ability to keep the number of output partitions of the source vertex separate from the number of destination tasks. This can be used even in the case of multiple partitions. For eg: If it is known that there will be only 3 partitions, we can configure that instead of the destination parallelism of 20 (example script in jira description), which will avoid empty partitions. 3) FVM has a restriction of range bucketing, i.e. the output of maps is consumed in order in the tasks. This is not required here. Just need best fit based on desired input size. Features not relevant: 1) Do not care about support for MultipleOutputs, as the output format implementation of HCatStorer takes care of writing to different ORC files in different partition folders in the case of dynamic partitioning. Same with MultiStorage. 2) There is no over partitioning in this scenario. So do not care about Volume based partitioning. As long as the ability to configure different partitioners is there. That is good. > A new vertex manager to partition data for STORE > > > Key: TEZ-3865 > URL: https://issues.apache.org/jira/browse/TEZ-3865 > Project: Apache Tez > Issue Type: New Feature >Reporter: Rohini Palaniswamy > > Restricting number of files in output is a very common use case. In Pig, > currently users add a ORDER BY, GROUP BY or DISTINCT with the required > parallelism before STORE to achieve it. All of the above operations create > unnecessary overhead in processing. It would be ideal if STORE clause > supported the PARALLEL statement and the partitioning of data was handled in > a more simple and efficient manner. > Partitioning of the data can be achieved using a very efficient vertex > manager as described below. Going to call it PartitionVertexManager (PVM) for > now till someone proposes a better name. Will be explaining using Pig > examples, but the logic is same for hive as well. 
> There are multiple cases to consider when storing > 1) No partitions >- Data is stored into a single directory using FileOutputFormat > implementations > 2) Partitions > - Data is stored into multiple partitions. Case of static or dynamic > partitioning with HCat > 3) HBase > I have kind of forgotten what exactly my thoughts were on this when > storing to multiple regions. Will update once I remember. > Let us consider below script with pig.exec.bytes.per.reducer (this setting is > usually translated to tez.shuffle-vertex-manager.desired-task-input-size with > ShuffleVertexManager) set to 1G. > {code} > A = LOAD 'data' ; > B = GROUP A BY $0 PARALLEL 1000; > C = FOREACH B GENERATE group, COUNT(A.a), SUM(A.b), ..; > D = STORE C into 'output' using SomeStoreFunc() PARALLEL 20; > {code} > The implementation will have 3 vertices. > v1 - LOAD vertex > v2 - GROUP BY vertex > v3 - STORE vertex > PVM will be used on v3. It is going to be simil
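The design-doc example quoted in the comment above ({A1P2, A2P2, A3P2, A4P2} at 0.5G each split across two tasks under a 1G cap) amounts to splitting one skewed partition's segments across destination tasks by desired input size. A hedged sketch, not the FairVertexManager implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the per-partition splitting quoted from the design doc: the
// segments of one skewed partition (one segment per source task) are split
// across several destination tasks, each capped at desiredInputSize bytes.
public class SkewSplitter {
    static List<List<String>> splitPartition(String[] segmentNames, long[] segmentSizes,
                                             long desiredInputSize) {
        List<List<String>> tasks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long bytes = 0;
        for (int i = 0; i < segmentNames.length; i++) {
            if (!current.isEmpty() && bytes + segmentSizes[i] > desiredInputSize) {
                tasks.add(current);          // close out the current task
                current = new ArrayList<>();
                bytes = 0;
            }
            current.add(segmentNames[i]);
            bytes += segmentSizes[i];
        }
        if (!current.isEmpty()) tasks.add(current);
        return tasks;
    }

    public static void main(String[] args) {
        // Four 0.5G segments of partition P2 with a 1G cap, as in the doc:
        String[] names = {"A1P2", "A2P2", "A3P2", "A4P2"};
        long half = 512L << 20;
        System.out.println(splitPartition(names, new long[]{half, half, half, half}, 1L << 30));
        // prints [[A1P2, A2P2], [A3P2, A4P2]]
    }
}
```

Note the loop consumes segments in source order; the "no range restriction" point in the comment means a real implementation could instead best-fit segments from any source task.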
[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter
[ https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290330#comment-16290330 ] Rohini Palaniswamy commented on TEZ-2950: - Here is a simpler suggestion to try to speed it up a bit. Can probably be addressed in a separate jira as a short term solution and leave this one for the long term solution. https://github.com/apache/tez/blob/master/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/writers/UnorderedPartitionedKVWriter.java#L1010-L1022 - For each partition, each spill file is opened once. For parallelism of 1000 and 8500 spills, it will be making 8,500,000 file open calls. This can be cut down by batching of spill file reads and partition writes. Let's say for a batch size of 10, 10 writers (partitions) and 10 spill file readers are kept open in parallel and merging is done. It will cut down file opens by 10x to 850,000. > Poor performance of UnorderedPartitionedKVWriter > > > Key: TEZ-2950 > URL: https://issues.apache.org/jira/browse/TEZ-2950 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Kuhu Shukla > Attachments: TEZ-2950.001_prelim.patch > > > Came across a job which was taking a long time in > UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data > from spill files (8500 spills) and then writing the final compressed merge > file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not > just buffer and keep directly writing to the final file which will save a lot > of time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
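The open-call counts in the batching suggestion work out as follows: without batching every spill file is opened once per partition; with a batch of B partitions merged together, each spill is opened once per batch. A quick arithmetic check (not Tez code):

```java
// Open-call arithmetic for the batched-merge suggestion: without batching,
// every spill file is opened once per partition; with a batch of B
// partitions merged together, each spill is opened once per batch instead.
public class MergeOpenCalls {
    static long opensWithoutBatching(long partitions, long spills) {
        return partitions * spills;
    }

    static long opensWithBatching(long partitions, long spills, long batchSize) {
        long batches = (partitions + batchSize - 1) / batchSize; // ceil division
        return batches * spills;
    }

    public static void main(String[] args) {
        // 1000 partitions, 8500 spills, batch of 10 -> 10x fewer opens.
        System.out.println(opensWithoutBatching(1000, 8500));  // prints 8500000
        System.out.println(opensWithBatching(1000, 8500, 10)); // prints 850000
    }
}
```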
[jira] [Comment Edited] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter
[ https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290330#comment-16290330 ] Rohini Palaniswamy edited comment on TEZ-2950 at 12/14/17 4:59 AM: --- Here is a simpler suggestion to try to speed it up a bit. Can probably be addressed in a separate jira as a short term solution and leave this one for the long term solution. https://github.com/apache/tez/blob/master/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/writers/UnorderedPartitionedKVWriter.java#L1010-L1022 - For each partition, each spill file is opened once. For parallelism of 1000 and 8500 spills, it will be making 8,500,000 file open calls. We can try keeping the first N file handles open always (will need a new IFile.Reader method that does not close the underlying input stream but does the rest of close(), like freeing up the decompressor and buffers). Let us say we keep the first 100 spill files always open; it will cut down the number of file open calls to 8,400,100. For parallelism of 1000 and 100 spills, it will cut down file open calls from 100,000 to 100. was (Author: rohini): Here is a simpler suggestion to try to speed it up a bit. Can probably be addressed in a separate jira as a short term solution and leave this one for the long term solution. https://github.com/apache/tez/blob/master/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/writers/UnorderedPartitionedKVWriter.java#L1010-L1022 - For each partition, each spill file is opened once. For parallelism of 1000 and 8500 spills, it will be making 8,500,000 file open calls. This can be cut down by batching of spill file reads and partition writes. Let's say for a batch size of 10, 10 writers (partitions) and 10 spill file readers are kept open in parallel and merging is done. It will cut down file opens by 10x to 850,000. 
> Poor performance of UnorderedPartitionedKVWriter > > > Key: TEZ-2950 > URL: https://issues.apache.org/jira/browse/TEZ-2950 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Kuhu Shukla > Attachments: TEZ-2950.001_prelim.patch > > > Came across a job which was taking a long time in > UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data > from spill files (8500 spills) and then writing the final compressed merge > file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not > just buffer and keep directly writing to the final file which will save a lot > of time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
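The figures in the keep-first-N-handles-open variant also check out arithmetically: the first N spill files are opened once and held, and only the remaining (spills − N) files are opened once per partition. A quick check (not Tez code):

```java
// Open-call arithmetic for the keep-first-N-handles-open variant: the
// first N spill files are opened once and held across all partitions;
// the remaining (spills - N) files are opened once per partition.
public class KeepOpenHandles {
    static long openCalls(long partitions, long spills, long keepOpen) {
        long held = Math.min(keepOpen, spills);
        return held + partitions * (spills - held);
    }

    public static void main(String[] args) {
        // 1000 partitions, 8500 spills, first 100 held open:
        System.out.println(openCalls(1000, 8500, 100)); // prints 8400100
        // 1000 partitions, 100 spills: every handle is held, so just 100 opens.
        System.out.println(openCalls(1000, 100, 100));  // prints 100
    }
}
```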
[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter
[ https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290362#comment-16290362 ] Rohini Palaniswamy commented on TEZ-2950: - And another small optimization would be to always write the partition 0 to file.out directly instead of buffer. > Poor performance of UnorderedPartitionedKVWriter > > > Key: TEZ-2950 > URL: https://issues.apache.org/jira/browse/TEZ-2950 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Kuhu Shukla > Attachments: TEZ-2950.001_prelim.patch > > > Came across a job which was taking a long time in > UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data > from spill files (8500 spills) and then writing the final compressed merge > file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not > just buffer and keep directly writing to the final file which will save a lot > of time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (TEZ-3877) Delete spill files once merge is done
Rohini Palaniswamy created TEZ-3877: --- Summary: Delete spill files once merge is done Key: TEZ-3877 URL: https://issues.apache.org/jira/browse/TEZ-3877 Project: Apache Tez Issue Type: Bug Reporter: Rohini Palaniswamy I see that spill files are not deleted right after merge completes. We should do that as it takes up a lot of space and we can't afford that wastage when Tez takes up a lot of shuffle space with complex DAGs. [~jlowe] told me they are only cleaned up after application completes as they are written in app directory and not container directory. That also has to be done so that they are cleaned up by node manager during task failures or container crashes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
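The cleanup this jira asks for is conceptually simple: once the merged file.out is written, delete the per-spill files immediately rather than letting them linger until the application finishes. A minimal sketch using local java.nio for illustration — the real code would go through the Hadoop FileSystem API on the task's local dirs:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Sketch of deleting spill files right after the merge completes, to
// reclaim shuffle space immediately. Local java.nio stands in for the
// Hadoop FileSystem API used by the actual writer.
public class SpillCleanup {
    static void deleteSpills(List<Path> spillFiles) throws IOException {
        for (Path spill : spillFiles) {
            Files.deleteIfExists(spill); // safe even if a spill was never created
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("spills");
        Path s0 = Files.createFile(dir.resolve("spill_0.out"));
        Path s1 = Files.createFile(dir.resolve("spill_1.out"));
        deleteSpills(List.of(s0, s1));
        System.out.println(Files.exists(s0) || Files.exists(s1)); // prints false
    }
}
```

Writing spills under the container directory instead of the app directory, as the jira also suggests, additionally lets the node manager clean them up on task failure.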
[jira] [Commented] (TEZ-160) Remove 5 second sleep at the end of AM completion.
[ https://issues.apache.org/jira/browse/TEZ-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313799#comment-16313799 ] Rohini Palaniswamy commented on TEZ-160: Recently noticed that about 5% of Pig jobs launched from Oozie in a cluster had their application status as KILLED even though the DAG succeeded and Pig scripts completed successfully. This was because Pig calls TezClient.stop() on shutdown. If it does not shut down within 10 seconds, it calls frameworkClient.killApplication(sessionAppId); which kills the AM. Because of the sleep time of 5 seconds after shutdown is issued, an application finishing as SUCCEEDED or KILLED depended on whether the shutdown completed within the next 5 seconds. Can we skip this check if it is a user-initiated shutdown, or at least lower it to 1 or 2 seconds? In the case of Pig, it is a Tez session and the Pig client is calling shutdown. I think we can skip it in general if it was a Tez session. The only time it will go down automatically is if the session timeout expires. Adding another 5 seconds in that case is also wasteful. > Remove 5 second sleep at the end of AM completion. > -- > > Key: TEZ-160 > URL: https://issues.apache.org/jira/browse/TEZ-160 > Project: Apache Tez > Issue Type: Bug >Reporter: Siddharth Seth > Labels: TEZ-0.2.0 > Attachments: test.timeouts.txt > > > ClientServiceDelegate/DAGClient doesn't seem to be getting job completion > status from the AM after job completion. It, instead, always relies on the RM > for this information. The information returned by the AM should be used while > it's available. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
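The behavior change suggested in the comment — skip (or sharply shorten) the post-shutdown sleep when the shutdown of a Tez session was initiated by the client — could be shaped like this. Names and structure are hypothetical, not the actual DAGAppMaster code:

```java
// Hypothetical sketch of the suggested change: no post-shutdown sleep when
// the client of a Tez session initiated the shutdown, since the client
// already has the DAG result and killApplication may otherwise race it.
public class ShutdownDelay {
    static long shutdownSleepMillis(boolean isSession, boolean userInitiated) {
        if (isSession && userInitiated) {
            return 0; // client already has the result; no need to linger
        }
        return 5000; // existing behavior for other shutdown paths
    }

    public static void main(String[] args) {
        System.out.println(shutdownSleepMillis(true, true));   // prints 0
        System.out.println(shutdownSleepMillis(false, false)); // prints 5000
    }
}
```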