I'm also wondering which settings I can adjust to affect this.
Say I want my jobs to hold on to their warm containers and cached data longer.
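
One way to see what a setting changes is to time the same query repeatedly with a controlled idle gap between runs. The sketch below is only an illustration: `time_runs`, `summarize`, and the `run_query` callable are hypothetical names, and `run_query` is assumed to submit the query through one already-open Hive session, since a fresh client per run would presumably not reuse warm containers.

```python
import time
from statistics import median
from typing import Callable, Dict, List


def time_runs(
    run_query: Callable[[], None],
    repeats: int,
    idle_seconds: float = 0.0,
    clock: Callable[[], float] = time.perf_counter,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Call run_query `repeats` times and return each wall-clock duration.

    Between consecutive runs the session is left idle for `idle_seconds`,
    which is the knob to vary when checking how long resources stay warm.
    `clock` and `sleep` are injectable so the function can be tested.
    """
    durations: List[float] = []
    for i in range(repeats):
        if i > 0 and idle_seconds > 0:
            sleep(idle_seconds)  # leave the session idle before the next run
        start = clock()
        run_query()  # hypothetical: submits the query in the same session
        durations.append(clock() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Separate the first (cold) run from the median of the later runs."""
    if not durations:
        raise ValueError("no runs recorded")
    rest = durations[1:] or durations
    return {"first": durations[0], "median_rest": median(rest)}
```

Running this once with a short `idle_seconds` and once with a long one, before and after changing a setting, should show whether the change actually moves the point where the slow run comes back.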

Thanks,
Lars


On Fri, Jun 20, 2014 at 11:08 AM, Lars Selsaas <
[email protected]> wrote:

> Thanks!
>
> Hopefully I'm getting the correct logs here:
>
> It seems the same application manager keeps on taking the requests.
>
> They both get the same application ID: application_1403285786962_0002
> <http://127.0.0.1:8088/cluster/app/application_1403285786962_0002>
>
> dag_1403285786962_0004_1.dot : Total file length is 2179 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/dag_1403285786962_0004_1.dot/?start=-4096>
>
> dag_1403285786962_0004_2.dot : Total file length is 2179 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/dag_1403285786962_0004_2.dot/?start=-4096>
>
> dag_1403285786962_0004_3.dot : Total file length is 2179 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/dag_1403285786962_0004_3.dot/?start=-4096>
>
> dag_1403285786962_0004_4.dot : Total file length is 2179 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/dag_1403285786962_0004_4.dot/?start=-4096>
>
> stderr : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stderr/?start=-4096>
>
> stderr_dag_1403285786962_0004_1 : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stderr_dag_1403285786962_0004_1/?start=-4096>
>
> stderr_dag_1403285786962_0004_1_post : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stderr_dag_1403285786962_0004_1_post/?start=-4096>
>
> stderr_dag_1403285786962_0004_2 : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stderr_dag_1403285786962_0004_2/?start=-4096>
>
> stderr_dag_1403285786962_0004_2_post : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stderr_dag_1403285786962_0004_2_post/?start=-4096>
>
> stderr_dag_1403285786962_0004_3 : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stderr_dag_1403285786962_0004_3/?start=-4096>
>
> stderr_dag_1403285786962_0004_3_post : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stderr_dag_1403285786962_0004_3_post/?start=-4096>
>
> stderr_dag_1403285786962_0004_4 : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stderr_dag_1403285786962_0004_4/?start=-4096>
>
> stderr_dag_1403285786962_0004_4_post : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stderr_dag_1403285786962_0004_4_post/?start=-4096>
>
> stdout : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stdout/?start=-4096>
>
> stdout_dag_1403285786962_0004_1 : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stdout_dag_1403285786962_0004_1/?start=-4096>
>
> stdout_dag_1403285786962_0004_1_post : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stdout_dag_1403285786962_0004_1_post/?start=-4096>
>
> stdout_dag_1403285786962_0004_2 : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stdout_dag_1403285786962_0004_2/?start=-4096>
>
> stdout_dag_1403285786962_0004_2_post : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stdout_dag_1403285786962_0004_2_post/?start=-4096>
>
> stdout_dag_1403285786962_0004_3 : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stdout_dag_1403285786962_0004_3/?start=-4096>
>
> stdout_dag_1403285786962_0004_3_post : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stdout_dag_1403285786962_0004_3_post/?start=-4096>
>
> stdout_dag_1403285786962_0004_4 : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stdout_dag_1403285786962_0004_4/?start=-4096>
>
> stdout_dag_1403285786962_0004_4_post : Total file length is 0 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/stdout_dag_1403285786962_0004_4_post/?start=-4096>
>
> syslog : Total file length is 7577 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/syslog/?start=-4096>
>
> syslog_dag_1403285786962_0004_1 : Total file length is 57034 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/syslog_dag_1403285786962_0004_1/?start=-4096>
>
> syslog_dag_1403285786962_0004_1_post : Total file length is 4775 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/syslog_dag_1403285786962_0004_1_post/?start=-4096>
>
> syslog_dag_1403285786962_0004_2 : Total file length is 56104 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/syslog_dag_1403285786962_0004_2/?start=-4096>
>
> syslog_dag_1403285786962_0004_2_post : Total file length is 707 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/syslog_dag_1403285786962_0004_2_post/?start=-4096>
>
> syslog_dag_1403285786962_0004_3 : Total file length is 53187 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/syslog_dag_1403285786962_0004_3/?start=-4096>
>
> syslog_dag_1403285786962_0004_3_post : Total file length is 5003 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/syslog_dag_1403285786962_0004_3_post/?start=-4096>
>
> syslog_dag_1403285786962_0004_4 : Total file length is 56111 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/syslog_dag_1403285786962_0004_4/?start=-4096>
>
> syslog_dag_1403285786962_0004_4_post : Total file length is 4204 bytes.
> <http://localhost:8042/node/containerlogs/container_1403285786962_0004_01_000001/root/syslog_dag_1403285786962_0004_4_post/?start=-4096>
>
> fast run
>
>  Map 1      1   734 Bytes   438 Bytes   639 ms
>  Map 2      1   245 KB      478 Bytes   1.34 secs
>  Reducer 3  1   446 Bytes   557 Bytes   3.63 secs
>
>
> slow run
>
>  Map 1      1   734 Bytes   438 Bytes   12.62 secs
>  Map 2      1   245 KB      478 Bytes   14.37 secs
>  Reducer 3  1   446 Bytes   557 Bytes   15.67 secs
>
>
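
The fast and slow runs above can be compared vertex by vertex. This is a rough sketch, assuming the last column is the time reported for each vertex; the helper names `to_seconds` and `extra_seconds` are made up for the example, and the numbers are copied from the two runs.

```python
import re

# Per-vertex times copied from the fast and slow runs quoted above.
FAST = {"Map 1": "639 ms", "Map 2": "1.34 secs", "Reducer 3": "3.63 secs"}
SLOW = {"Map 1": "12.62 secs", "Map 2": "14.37 secs", "Reducer 3": "15.67 secs"}

# Multipliers that convert each unit seen in the listing to seconds.
_UNITS = {"ms": 0.001, "sec": 1.0, "secs": 1.0}


def to_seconds(text):
    """Convert strings such as '639 ms' or '1.34 secs' to seconds."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(ms|secs?)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def extra_seconds(fast, slow):
    """Return how many seconds longer each vertex took in the slow run."""
    return {
        vertex: round(to_seconds(slow[vertex]) - to_seconds(fast[vertex]), 2)
        for vertex in fast
    }


if __name__ == "__main__":
    for vertex, extra in extra_seconds(FAST, SLOW).items():
        print(f"{vertex}: +{extra:.2f} s")
```

The added time comes out at roughly 12 to 13 seconds for every vertex while the byte counts are identical, which would be consistent with a fixed startup cost rather than anything proportional to the data.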
>
> On Fri, Jun 20, 2014 at 10:31 AM, Hitesh Shah <[email protected]> wrote:
>
>> Hello Lars,
>>
>> Just to be very clear - there is no caching of results/data across
>> queries except for some minimal meta-data caching for ORC. If you can send
>> across the logs generated by “yarn logs -applicationId <appId>”, we can try
>> and help you get a better understanding of where the speed difference is
>> stemming from.
>>
>> -- Hitesh
>>
>> On Jun 20, 2014, at 10:13 AM, Bikas Saha <[email protected]> wrote:
>>
>> > Hi,
>> >
>> > Thanks for your interest in trying out Hive on Tez. There are multiple
>> reasons for the observations you see below.
>> > 1)      Containers warm up the longer they are used. If you
>> repeatedly run queries, the JVM has all classes loaded and ready and
>> may have JIT-compiled the frequently run code paths. As it learns more
>> about your execution pattern, the JIT can do a better job. This helps
>> across different queries.
>> > 2)      As you frequently access the same data, the chances increase
>> of finding that data in the OS buffer cache, so you get the benefits of
>> in-memory data. This helps repeated runs of queries on the same data.
>> > 3)      Hive is smart about explicitly caching deserialized data (Java
>> objects) within a query to avoid recomputing work that has already been
>> done. This helps within a query.
>> > 4)      If you are using ORC files, Hive will try to cache ORC file
>> metadata such as locations and sizes, which helps different queries
>> that access the same data.
>> > 5)      If your Tez query session has been idle for some time, the
>> system proactively starts releasing resources back to the cluster so that
>> other applications can use them (good for multi-tenancy). If you fire a
>> query after some delay, you will see a slowdown if some of the released
>> resources need to be reclaimed. This delay is configurable.
>> >
>> > Hope this helps and you have a positive experience experimenting with
>> Hive on Tez.
>> > Please let us know how we can help!
>> > Bikas
>> >
>> > From: Lars Selsaas [mailto:[email protected]]
>> > Sent: Friday, June 20, 2014 8:50 AM
>> > To: user
>> > Subject: Tez performance on Hive
>> >
>> > Hi,
>> >
>> > So when you set Tez as the execution engine for Hive, a query takes
>> about half the time to finish the second time you run it, going from say
>> 24 seconds to 12 seconds. But if I keep rerunning it, it gets down to about
>> 2 seconds for that same query. The time goes back up to 12 seconds if I wait
>> too long before the next rerun or if I make large enough changes to the query.
>> >
>> >
>> > So I'm working on a blog post about Tez and need to find out why this is
>> happening. The first speedup seems to come mainly from hot containers
>> that store the information about where to find your data, while the
>> further drop to about 2 seconds seems to come from some in-memory storage
>> of the data. Does it store the results in memory and keep them ready for
>> next time?
>> >
>> >
>> >
>> > --
>> > Lars Selsaas
>> > Data Engineer
>> > Think Big Analytics
>> > [email protected]
>> > 650-537-5321
>> >
>> >
>>
>>
>
>


-- 

Lars Selsaas

Data Engineer

Think Big Analytics <http://thinkbiganalytics.com>

[email protected]

650-537-5321
