Hello Robert,
Some of the questions may be better answered on the Hive list, but I will take a
first crack at some of them.
From a Tez perspective, let's use vertices and ignore Maps and Reducers for now.
Hive uses this as a convention to indicate that a vertex is either reading data
from HDFS
For links to a container's logs on the NodeManager, the NM's HTTP address is
obtained from YARN APIs. Is this the only page in which the ":" is missing, or
is it missing in other rows' links within the task attempts table? Can you
confirm that the links to the NodeManagers work correctly from the ResourceManager
Hello Premal,
This is likely a combination of a lag in publishing the history events to YARN
Timeline, which is consumed by the UI, and the UI relying more on YARN Timeline
for data as compared to reading the information directly from the Tez AM. The
Hive client is directly gettin
Hello Dharmesh,
The tez staging dir is where scratch data is kept for the lifetime of the Tez
session, i.e. data which can be deleted once the application completes.
Staging data includes the following:
- recovery logs used by the Tez AM for checkpointing state
- Configs and/or dag plan pa
I suggest writing a custom InputFormat or modifying your existing InputFormat
to generate more splits and at the same time, disable splits grouping for the
vertex in question to ensure that you get the high level of parallelism that
you want to achieve.
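If you go the route of keeping grouping enabled but want finer control over it, the grouping thresholds can be tuned in tez-site.xml. The property names below are the standard split-grouping knobs; the values are illustrative assumptions only, not recommendations:

```xml
<!-- Illustrative values only: smaller grouping bounds make Tez create more
     grouped splits, i.e. more tasks; tune to your data sizes. -->
<property>
  <name>tez.grouping.min-size</name>
  <value>16777216</value>   <!-- 16 MB -->
</property>
<property>
  <name>tez.grouping.max-size</name>
  <value>134217728</value>  <!-- 128 MB -->
</property>
```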
The log snippet is just indicating that v
> thanks,
> Madhu
>
>
> On Thursday, October 27, 2016 11:19 PM, Hitesh Shah wrote:
>
>
> Hello Madhusudan,
>
> I will start with how container allocations work and make my way back to
> explaining splits.
>
> At the lowest level, each vertex wi
Hello Madhusudan,
I will start with how container allocations work and make my way back to
explaining splits.
At the lowest level, each vertex will have decided to run a number of tasks. At
a high level, when a task is ready to run, it tells the global DAG scheduler
about its requirements ( i
, we were thinking this could save on the cost of AM, and
> container initialization.
>
> We haven't looked into tez recovery as well. Durability is one of our big
> concerns as well.
>
>
> On Thursday, October 20, 2016 12:44 PM, Hitesh Shah wrote:
>
>
Not supported as of now. There are multiple aspects to supporting this
properly. One of the most important issues to address would be to do proper QoS
across various DAGs i.e. what kind of policies would need to be built out to
run multiple DAGs to completion within a limited amount of resources
Hello Madhu,
If you are using Tez via Hive, then this would need a fix in Hive. I don’t
believe Hive supports different settings for each vertex in a given query today.
However, for native jobs, Tez already supports different specs for each vertex:
Vertex::setTaskResource() ( configuring yarn r
jersey jars look to be in sync but the jackson ones look a step behind.
>
> what do you think? should i force the 1.9's into the ATS CLASSPATH? can't
> hurt would be my guess. lemme try.
>
> Cheers,
> Stephen.
>
>
> On Mon, Oct 17, 2016 at 2:44 PM, Steph
oryLoggingService.java:53)
>> at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService$1.run(ATSHistoryLoggingService.java:190)
>> at java.lang.Thread.run(Thread.java:745)
>>
>>
>> i'm running the hive cl
LISTEN
> 31168/java
> tcp        0      0 172.19.103.136:8188     0.0.0.0:*       LISTEN
> 31168/java
>
>
> might there be a debug log level i can set on impl.TimelineClientImpl to see
> what is happening on the connection event?
>
>
Hello Stephen,
yarn-site.xml needs to be updated wherever the Tez client is used, i.e. if you
are using Hive, then wherever you launch the Hive CLI and also where the
HiveServer2 is installed ( HS2 will need a restart ).
To see if the connection to timeline is/was an issue, please check the yar
If you have the logs for the application master, you can try the following:
grep -F "[HISTORY]" | grep "TASK_ATTEMPT_FINISHED"
This will give you info on any failed task attempts.
The AM logs have history events being published to them. You can do grep -F
"[HISTORY]" | grep "<entity type>_FINISHED" where the entity type is
dependent on
YARN implementation behavior.
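A minimal sketch of the pipeline against a stand-in log file ( the file name and log lines are invented for illustration; real AM logs come via "bin/yarn logs -applicationId <appId>" ):

```shell
# Fabricated AM log excerpt, just to exercise the grep pipeline.
cat > am.log <<'EOF'
2016-10-05 10:00:01 [HISTORY] [Event:TASK_ATTEMPT_FINISHED] status=FAILED
2016-10-05 10:00:02 [HISTORY] [Event:DAG_FINISHED] status=SUCCEEDED
2016-10-05 10:00:03 INFO unrelated log line
EOF
# -F makes grep treat "[HISTORY]" as a literal string, not a character class.
grep -F '[HISTORY]' am.log | grep 'TASK_ATTEMPT_FINISHED'
```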
— Hitesh
> On Oct 5, 2016, at 10:06 AM, Madhusudan Ramanna wrote:
>
> Seems like with this approach, there is no need to have information on
> current dir.
>
> thanks,
> Madhu
>
>
> On Tuesday, October 4, 2016 4:44 PM, H
these jobs. Would it be possible to
> get some support to set up my workstation to achieve this?
>
> Brgds
>
> Manuel
>
> On Wed, Sep 28, 2016 at 8:37 PM, Hitesh Shah wrote:
> Thanks for the context, Manuel.
>
> Full compat with MR is something that has not really bee
The env is one approach for augmenting the classpath. The other approach, which
modifies the classpath for both the AM and the task containers, is to use
"tez.cluster.additional.classpath.prefix" by setting it to something like
"./<archive name>/*".
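As a sketch in tez-site.xml ( "mydeps.tgz" is a hypothetical archive name; the value is relative to the container's working directory ):

```xml
<property>
  <name>tez.cluster.additional.classpath.prefix</name>
  <value>./mydeps.tgz/*</value>
</property>
```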
— Hitesh
> On Oct 4, 2016, at 4:38 PM, Madhusudan Ramann
gs\/container_1475091857089_0015_01_02\/apxqueue","completedLogsURL":"http:\/\/ip-10-1-3-71.us-west-2.compute.internal:19888\/jobhistory\/logs\/\/ip-10-1-2-173.us-west-2.compute.internal:8041\/container_1475091857089_0015_01_02\/v_pager_attempt_14
}},{"timestamp":1475094062692,"eventtype":"DAG_STARTED","eventinfo":{}},{"timestamp":1475094062688,"eventtype":"DAG_INITIALIZED","eventinfo":{}},{"timestamp":1475094062055,"eventtype":"DAG_SUBMITT
ue we got is that we used to package our
> applicative jars with nested dependencies in /lib and these are ignored by
> Tez. We could easily work around this expanding these and adapting our
> classpath.
>
> Regards
>
> On Wed, Sep 28, 2016 at 5:46 PM, Hitesh Shah wrote:
> He
Hello Manuel,
Thanks for reporting the issue. Let me try and reproduce this locally to see
what is going on.
A quick question in general though - are you hitting issues when running in
non-local mode too? Would you mind sharing the details on the issues you hit?
thanks
— Hitesh
> On Sep 2
ne server is up and running. Tez UI is however not able to display DAG
> and other details
>
> thanks,
> Madhu
>
>
>
> On Saturday, September 24, 2016 12:01 PM, Hitesh Shah
> wrote:
>
>
> tez-dist tar balls are not published to maven today - only the module
nks,
> Madhu
>
>
> On Friday, September 23, 2016 5:19 PM, Hitesh Shah wrote:
>
>
> Hello Madhusudan,
>
> If you look at the MANIFEST.MF inside any of the tez jars, it will provide
> the commit hash via the SCM-Revision field.
>
> The tez client and the DAGAp
Hello Madhusudan,
If you look at the MANIFEST.MF inside any of the tez jars, it will provide the
commit hash via the SCM-Revision field.
The tez client and the DAGAppMaster also log this info at runtime.
— Hitesh
> On Sep 23, 2016, at 4:08 PM, Madhusudan Ramanna wrote:
>
> Zhiyuan,
>
> We
Hello Madhusudan,
Thanks for reporting the issue. Would you mind filing a bug at
https://issues.apache.org/jira/browse/tez with the application logs and tez
configs attached? If you have a simple dag/job example that reproduces the
behavior that would be great too.
thanks
— Hitesh
> On Sep 23
ally? Interestingly this is not a problem with HDP deployment which
> obviously has a 'fuller' setup. Local mode really helps to test.
>
> Thank you,
> Uday
> From: Hitesh Shah
> Sent: 25 August 2016 20:06:30
> To: user@tez.apache.org
> Subject: Re: Parallel
Created https://cwiki.apache.org/confluence/display/TEZ/FAQ which might be a
better fit for such content and other related questions down the line.
> On Aug 25, 2016, at 1:16 PM, Hitesh Shah wrote:
>
> +1. Would you like to contribute the content? You should be able to add an
> a
:59 PM, Madhusudan Ramanna wrote:
>
> Thanks, #2 worked !
>
> Might be a good idea to add to confluence ?
>
> Madhu
>
>
> On Thursday, August 25, 2016 12:00 PM, Hitesh Shah wrote:
>
>
> Hello Madhu,
>
> There are 2 approaches for this:
>
>
Hello Uday,
I don’t believe anyone has tried running 2 dags in parallel in local mode
within the same TezClient ( and definitely not for HiveServer2 ). If this is
with 2 instances of Tez client, this could likely be a bug in terms of either
how Hive is setting up the TezClient for local mode wi
Hello Madhu,
There are 2 approaches for this:
1) Programmatically, for user code running in tasks, you would need to use
either DAG::addTaskLocalFiles() or Vertex::addTaskLocalFiles() - former if the
same jars are needed in all tasks of the DAG.
TezClient::addAppMasterLocalFiles only impacts
Hello Nathaniel,
You are probably right that they should not be as long as the cluster classpath
used contains the MR jars. I believe these jars were retained as a result of
using yarn.application.classpath for augmenting the runtime classpath when
using the classpath from the cluster instead
When comparing just a simple MR job to a Tez dag with 2 vertices, the perf
improvements are limited (as the plan is pretty much the same and data is
transferred via a shuffle edge):
- container re-use
- pipelined sorter vs the MR sorter ( your mileage may vary here depending
on the kind of
ion ?
> On 2016-08-10 01:02:31, "Hitesh Shah" wrote:
>> The following 2 links should help you get started. Might be best to start
>> with the sigmod paper and one of the earlier videos.
>>
>> https://cwiki.apache.org/confluence/display/TEZ/How+to+Contribu
The following 2 links should help you get started. Might be best to start with
the SIGMOD paper and one of the earlier videos.
https://cwiki.apache.org/confluence/display/TEZ/How+to+Contribute+to+Tez
https://cwiki.apache.org/confluence/display/TEZ/Presentations%2C+publications%2C+and+articles+ab
Hello
I am assuming that this is the same issue as the one reported in TEZ-3396?
Based on the logs in the jira:
2016-08-03 10:55:33,856 [INFO] [Thread-2] |app.DAGAppMaster|:
DAGAppMasterShutdownHook invoked
2016-08-03 10:55:33,856 [INFO] [Thread-2] |app.DAGAppMaster|: DAGAppMaster
received a
TSService, eventQueueBacklog=17553
> I'll look into lowering tez.yarn.ats.event.flush.timeout.millis while trying
> to look into the timelineserver.
>
> Thanks for your help,
> Slava
>
> On Wed, Aug 3, 2016 at 2:45 PM, Hitesh Shah wrote:
> Hello Slava,
>
Hello Slava,
Can you check for a log line along the lines of "Stopping ATSService,
eventQueueBacklog=" to see how backed up the event queue to YARN Timeline is?
I have noticed this in quite a few installs with YARN Timeline where YARN
Timeline is using the simple LevelDB impl and not the Rol
Please check Step 7 on http://tez.apache.org/install.html
thanks
— Hitesh
> On Aug 1, 2016, at 10:25 AM, zhiyuan yang wrote:
>
> The nice thing of Tez is it’s compatible with MapReduce API. So if you just
> want to run MapReduce on Tez, you just learn how to write standard MapReduce
> and cha
r.com/display/MapR41/Installing+and+Configuring+Tez+0.5.3 .But
> hive job gave some NumberFormatError and found out by googling that there is
> version mismatch between tez and hadoop libs.
>
> On Sat, Jul 30, 2016 at 10:22 PM, Sandeep Khurana
> wrote:
> Hitesh
>
> Both o
Hello Sandeep,
2 things to check:
-   When compiling Tez, is the hadoop.version in the top-level pom ( and the
addition of MapR's maven repo ) being used to compile against MapR's hadoop
distribution and not the standard Apache release? The Tez AM cannot seem to do a
handshake with the YARN RM. If Ma
That is highly unlikely to work as Hive-2.x requires APIs introduced in Tez
0.8.x.
thanks
— Hitesh
> On Jul 28, 2016, at 8:56 PM, darion.yaphet wrote:
>
> Hi team :
>
> We are using hadoop 2.5.0 and hive 1.2.1 tez 0.5.4 . Now we want to upgrade
> to hive 2.X . Could Tez 0.5.4 support Hive
Either emails to the dev list or specific JIRAs on any usability issues that
you come across - be it missing/unclear docs, APIs that could require cleaning
up, bugs or potential helper libraries to make things easier. Pretty much any
feedback ( and/or patches ) are welcome :)
thanks
— Hitesh
Thanks for the update, Scott.
Given that the APIs have mostly been used by other framework developers, there
are probably quite a few things which may not be easily surfaced in javadocs,
usage examples ( or the lack thereof ), etc. It would be great if you can provide
feedback ( and patches ) to
Hello Muhammad,
Did you try any of the calls to YARN timeline as described by Rajesh in his
earlier reply?
thanks
— Hitesh
> On Jun 28, 2016, at 1:20 PM, Muhammad Haris
> wrote:
>
> Hi,
> Could anybody please guide me how to get all task level counters? Thanks a lot
>
>
>
> Regards
>
>
ontainers=2 clusterResource= type=OFF_SWITCH
> 2016-06-17 19:04:50,407 INFO security.NMTokenSecretManagerInRM
> (NMTokenSecretManagerInRM.java:createAndGetNMToken(200)) - Sending NMToken
> for nodeId : usw2stdpwo12.glassdoor.local:45454 for container
>
> On Fri, Jun 17, 2016 at
-dev@tez for now.
Hello Anandha,
The usual issue with this is a lack of resources. e.g. no cluster capacity to
launch the AM, queue configs not allowing another AM to launch, the memory size
configured for the AM is too large such that it cannot be scheduled on any
existing node, etc.
Can y
yarn/apps/hadoop/logs/application_1465996511770_0001 does not
> exist.
> Log aggregation has not completed or is not enabled.
>
> I think we are missing some configuration that would help us get more insight?
>
> Thanks!
>
> Joze.
>
> 2016-06-15 12:03 GMT-03:00 H
Hello Joze,
Would it be possible for you to provide the YARN application logs obtained via
"bin/yarn logs -applicationId <appId>" for both of the cases you have seen?
Feel free to file JIRAs and attach logs to each of them.
thanks
— Hitesh
> On Jun 15, 2016, at 7:38 AM, Jose Rozanec
> wrote:
>
>
1
>
>
> <property>
>   <name>tez.lib.uris</name>
>   <value>/usr/lib/apache-tez-0.7.1-bin/share</value>
> </property>
>
>
> hduser@rhes564: /home/hduser/hadoop-2.6.0/etc/Hadoop>
>
>
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPC
k 1.3.1 engine and I compiled it spark
> from source code. so hopefully I can use TEZ as Spark engine as well.
>
>
> thanks
>
>
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
are using your local build ).
thanks
— Hitesh
> On May 20, 2016, at 4:39 PM, Mich Talebzadeh
> wrote:
>
> This is the instruction?
>
> Created by Hitesh Shah, last modified on May 02, 2016 Go to start of metadata
> Making use of the Tez Binary Release tarball
>
apache-tez-0.7.1-bin/lib
>
>
> -Original Message-
> From: Hitesh Shah [mailto:hit...@apache.org]
> Sent: Friday, May 20, 2016 4:18 PM
> To: user@tez.apache.org
> Subject: Re: My first TEZ job fails
>
> Can you try the instructions mentioned at
> https://cwiki.ap
$Handler.run(Server.java:2033)
>
> at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>
> at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>
> at com.sun.proxy.$Prox
Logs from `bin/yarn logs -applicationId application_1463758195355_0002` would
be more useful to debug your setup issue. The RM logs usually do not shed much
light on why an application failed.
Can you confirm that you configured tez.lib.uris correctly to point to the tez
tarball on HDFS (tez tar
>
>
> On 5/5/16, 11:00 AM, "Hitesh Shah" wrote:
>
>> What version are you running with?
>>
>> thanks
>> — Hitesh
What version are you running with?
thanks
— Hitesh
> On May 5, 2016, at 10:31 AM, Kurt Muehlner wrote:
>
> Hello,
>
> We have a Pig/Tez application which is exhibiting a strange problem. This
> application was recently migrated from Pig/MR to Pig/Tez. We carefully
> vetted during QA that
Bikas’ comment ( and mine below ) is relevant only for task specific settings.
Hive does not override any settings for the Tez AM so the tez configs for the
AM memory/vcores will reflect at runtime.
I believe Hive has a proxy config - hive.tez.cpu.vcores - for (3) which may be
why your setting
Take a look at TaskCounter and DAGCounter under
https://git-wip-us.apache.org/repos/asf?p=tez.git;a=tree;f=tez-api/src/main/java/org/apache/tez/common/counters;h=df3784e54d1fa6075dcbbca8d1405e309bce1460;hb=HEAD
and let us know if that is insufficient.
thanks
— Hitesh
On Apr 11, 2016, at 4:42
URE [8.107s]
> [INFO] Tez ... SUCCESS [0.063s]
>
> For the npm error I see a open JIRA :
> https://issues.apache.org/jira/browse/BIGTOP-1826
>
> Do you have any suggestion?
>
> Thanks.
>
>
> On Wed, Apr 6, 2016 at 4
0.7.0
> version).
>
> Please help.
>
>
> Thanks,
> Joel
>
> On Wed, Apr 6, 2016 at 3:01 PM Hitesh Shah wrote:
> Hello Sam,
>
> Couple of things to confirm:
> - I assume you are building branch-0.7 of Tez for 0.7.1-SNAPSHOT as there
> has not yet
Hello Sam,
Couple of things to confirm:
- I assume you are building branch-0.7 of Tez for 0.7.1-SNAPSHOT as there
has not yet been a release of 0.7.1?
- For hadoop, are you using hadoop-2.7.0 or hadoop-2.7.1 ( though this
really should not be too relevant here )?
I took branch-0.7 of
do I apply that .patch file to my existing setup of jars?
>
> Appreciate your help and time.
>
> Thanks,
> Joel
>
> On Wed, Apr 6, 2016 at 2:28 PM, Hitesh Shah wrote:
> Every component has a different approach to how it is deployed/upgraded.
>
> I can cover ho
Every component has a different approach to how it is deployed/upgraded.
I can cover how you can go about patching Tez on an existing production system.
The steps should be similar to that described in INSTALL.md in the source tree
with a few minor gotchas to be aware of:
- Deploying Tez ha
Hi Kurt,
The Tez UI as documented should work with any version beyond 0.5.2 if the
history logging is configured to use YARN timeline. As for scopes, some bits of
the vertex description are currently not displayed in the UI though I am not
sure if Pig has integrated with that API yet. Dependin
00ms before retrying getTask again. Got null now. Next getTask sleep
> message after 3ms
> . . . etc.
>
>
> Is there anything else I can provide for now?
>
> Thanks,
> Kurt
>
>
>
> On 3/23/16, 11:43 AM, "Hitesh Shah" wrote:
>
>> Hel
Hello Kurt,
Can you file a jira with a stack dump for the ApplicationMaster process when it
is in this hung state and also include all the application master logs. Also
please mention what version of Pig and Tez you are running.
The main question would be whether the AM is really hung or does
t this.
>
>
>
> Thanks
>
> Bikas
>
>
>
> From: Stephen Sprague [mailto:sprag...@gmail.com]
> Sent: Monday, February 22, 2016 6:59 AM
> To: user@tez.apache.org
> Subject: Re: tez and beeline and hs2
>
>
>
> just an update. i haven't
Not exactly. I think the UI bits might be a red herring. Bouncing YARN and HS2
is also unlikely to be needed unless you are modifying configs.
There is likely a bug ( the NPE being logged ) in the shutdown code for
the org.apache.tez.dag.history.logging.impl.SimpleHistoryLoggingService (
Hello Stephen,
This question should ideally be posted to user@hive as this mainly relates to
HS2 functionality and not really Tez.
That said, a couple of things to look at/try out:
1) Unrelated point - "set mapreduce.framework.name=yarn-tez;” - this is not
needed. What this setting does is
One option may be to try using HADOOP_USER_CLASSPATH_FIRST with it set to true
and adding the hive-exec.jar to the front of HADOOP_CLASSPATH. Using this ( and
verifying by running "hadoop classpath" ), you could try to get hive-exec.jar to
the front of the classpath and see if that makes a difference.
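A sketch of that approach ( the jar path below is a hypothetical; substitute the actual location of your hive-exec.jar ):

```shell
# Prefer user-supplied jars over the bundled Hadoop ones (assumed path below).
export HADOOP_USER_CLASSPATH_FIRST=true
export HADOOP_CLASSPATH="/opt/hive/lib/hive-exec.jar:$HADOOP_CLASSPATH"
# hadoop classpath   # run on a real client host to confirm the ordering
echo "${HADOOP_CLASSPATH%%:*}"   # first classpath entry: /opt/hive/lib/hive-exec.jar
```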
-rw-rw-r-- 1 105112 2014-01-30 07:08 servlet-api-2.5.jar
> -rw-rw-r-- 1 1251514 2014-01-30 07:08 snappy-java-1.0.5.jar
> -rw-r--r-- 1 162976273 2015-09-10 20:16
> spark-assembly-1.4.1-hadoop2.6.0.jar
> -rw-rw-r-- 1 26514 2014-01-30 07:08 stax-api-1.0.1.jar
> -rw-rw
hadoop-common?
thanks
— Hitesh
On Feb 12, 2016, at 12:16 AM, no jihun wrote:
> Thanks Hitesh Shah.
>
> It claims
>
> 2016-02-12 14:59:07,388 [ERROR] [main] |app.DAGAppMaster|: Error starting
> DAGAppMaster
>
> jav
Run the following command: "bin/yarn logs -applicationId
application_1452243782005_0292". This should give you the logs for
container_1452243782005_0292_02_01, which may shed more light on why the Tez
ApplicationMaster is failing to launch when triggered via Hive.
thanks
— Hitesh
On Feb
There are 3 types defined as you have noticed:
persisted_reliable: assumes a vertex output is stored in a reliable store like
HDFS. This states that if the node on which the task ran disappears, the output
is still available.
persisted: vertex output stored on local disk where the task ran.
ep
Assuming you have the guava jar available on all nodes, you can set
"tez.cluster.additional.classpath.prefix" to point to it and this classpath
value will be prepended to the classpath of the tez runtime layers. However,
please note that this is not guaranteed to work if the guava jar from your
Couple of other points to add to Bikas’s email:
Regarding your question on small data: No - Tez is geared to work in both small
data and extremely large data cases. Hive should likely perform better with Tez
regardless of data size unless there is a bad query plan created that is
non-optimal f
e",why?
> (2)As mentioned earlier,I also packaged this "conf/hbasetable" to conf.jar,
> and it was downloaded to the AM container path, why it can not be parsed or
> decompressed ?
>
> Is there any configuration options can do this?
>
> best wishes &
Hello
You are right that when hive.compute.splits.in.am is true, the splits are
computed in the cluster in the Tez AM container.
Now, there are a bunch of options to consider but the general gist is that if
you are familiar with MapReduce Distributed Cache or YARN local resources, you
need t
This is probably something that was missed for Tez. Would you mind filing a bug
for this? The fix is to always use {{VAR}} ( instead of say $VAR ) which is
then automatically changed by YARN to $VAR or %VAR% based on the env where the
container is being launched.
— Hitesh
On Jan 12, 2016, at
from. This information handshake is part of the Input/Output pair
> implementation."
>
> If the edges had type PERSISTED_RELIABLE, the information handshake is
> probably not needed. Is that right ?
>
> - Raajay
>
> On Tue, Dec 8, 2015 at 6:17 PM, Hitesh Shah wrote:
&
The other way to look at this problem is that for a given edge between 2
vertices, the data format and transfer mechanism is governed by the Output of
the upstream vertex and the Input of the downstream vertex. You can potentially
write your own Input and Output pair that work with HDFS or tachy
I don’t believe I have seen this error reported before. The error mainly seems
to be coming from somewhere in the Hive codebase, so the hive mailing list might
provide a more relevant answer. If you don’t get one, would you mind setting
"tez.am.log.level" to DEBUG in your tez-site.xml, re-run the
g.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.(FileOutputCommitter.java:80)
> at
> org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.getOutputCommitter(FileOutputFormat.java:309)
> at
> org.apache.tez.mapreduce.committer.MROutputCommitter.getOutputCommitter
The general approach for add-on jars requires 2 steps:
1) On the client host, where the job is submitted, you need to ensure that the
add-on jars are in the local classpath. This is usually done by adding them to
HADOOP_CLASSPATH. Please do pay attention to adding the jars via "<dir>/*"
instead of j
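Step 1 can be sketched as follows ( "/opt/myapp/lib" is a hypothetical directory holding the add-on jars ):

```shell
# The wildcard form puts every jar in the directory on the classpath, rather
# than listing jars one by one (directory path is an assumption).
export HADOOP_CLASSPATH="/opt/myapp/lib/*:${HADOOP_CLASSPATH:-}"
echo "${HADOOP_CLASSPATH%%:*}"   # first entry: /opt/myapp/lib/*
```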
Hello Juho,
As you are probably aware, each hive query will largely have different memory
requirements depending on what kind of plan it ends up executing. For the most
part, a common container size and general settings work well for most queries.
In this case, this might need additional tuning
Hello Dale,
I think I can guess what is happening. Hue keeps connections open between
itself and the HiveServer2. Now what happens is that the Tez session times
itself out if queries are not submitted to it within a certain time window ( to
stop wasting resources on a YARN cluster ).
This ca
In previous releases of the Tez view, it did not have the integration in place
for the Hive query info.
There is one approach that you can try:
Ambari has something called standalone mode. You can deploy a new ambari-server
version 2.1 ( on a different host ) and just instantiate a Tez view o
I don’t believe the binary should need changing at all unless you need
enhancements from recent commits. It should just be setting up the UI and
configuring Tez for using YARN Timeline.
The instructions that you can follow:
http://tez.apache.org/tez-ui.html
http://tez.apache.org/tez_yarn_ti
wrote:
> Maybe it would be a good idea to send the dot file to the ATS along with the
> other information you are sending. I too wanted to look at a dot file the
> other day and had problem finding it back.
>
> - André
>
> On Thu, Oct 1, 2015 at 4:00 AM, Hitesh Shah wrot
The .dot file is generated into the Tez Application Master’s container log dir.
Firstly, you need to figure out the yarn application in which the query/Tez DAG
ran. Once you have the applicationId, you can use one of these 2 approaches:
1) Go to the YARN ResourceManager UI, find the application
This is a question that is probably meant for the Hive mailing list. I believe
either the Hive query Id or the information from the query plan ( as used in
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/ATSHook.java
) should probably be able to give you th
The Hive query id maps to the Tez dag name. You can try the following call
against timeline:
/ws/v1/timeline/TEZ_DAG_ID?primaryFilter=dagName:{tezDagName}
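A sketch of the call ( the timeline host/port and DAG name are assumptions; substitute your cluster's Timeline address and the actual Hive query id ):

```shell
TIMELINE="http://timeline-host:8188"        # assumed YARN Timeline address
DAG_NAME="my_hive_query_id"                 # the Hive query id / Tez DAG name
URL="$TIMELINE/ws/v1/timeline/TEZ_DAG_ID?primaryFilter=dagName:$DAG_NAME"
echo "$URL"
# curl -s "$URL"   # run against a live timeline server to fetch the DAG entity
```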
thanks
— Hitesh
On Sep 14, 2015, at 10:45 PM, Dharmesh Kakadia
wrote:
> Hi,
>
> I am running Hive on Tez, with timeline server. We have
This is probably a question for the Hive dev mailing list on how the
staging/output directory name is determined. i.e.
".hive-staging_hive_2015-09-11_00-07-40_043_6365145769624003668-1”. You may
need to change this value in the config being used to configure the output of
the vertex that is doi
"tez.aux.uris" supports a comma-separated list of files and directories on HDFS
or any distributed filesystem ( no tarballs, archives, etc. and no support for
file:// ). When Hive submits a query to Tez, it adds its hive-exec.jar to the
tez runtime ( similar to the MR distributed cache ). If you are
ded? How are Vertex
> Location Hints handled ? What if YARN is not able to provide containers in
> requested locations ?
>
> Raajay
>
> On Thu, Sep 10, 2015 at 10:19 AM, Hitesh Shah wrote:
> In almost all cases, this is usually hostnames. The general flow is find the
> b
Most of the parallelism ( no. of containers ) is controlled by the upper layer
application. There is some minor control that Tez does when it groups splits
together but for the most part, the upper layer decides how many containers to
run.
You should look at Hive configs to see how to control
In almost all cases, this is usually hostnames. The general flow is find the
block locations for the data source, extract the hostname from there and
provide it to YARN so that it can provide a container on the same host as the
datanode having the data. As long as YARN is using hostnames, the co
There are 2 aspects to using Vertex Location Hints and parallelism. All of this
depends on how you define the work that needs to be done by a particular task.
I will take the MR approach and compare it to the more dynamic approach that
Jeff has been explaining.
For MR, all the work was decide
st for curiosity. Can i use tez-0.7.0 with latest PIG-0.14.0? Is it tested
> earlier?
>
> Regards,
> Sandeep
>
> On Wed, Sep 2, 2015 at 8:44 PM, Hitesh Shah wrote:
> Based on the stack trace, the following issue seems to be the cause: