Prasanth Jayachandran created HIVE-18160:
--------------------------------------------

             Summary: Jar localization during session initialization is slow
                 Key: HIVE-18160
                 URL: https://issues.apache.org/jira/browse/HIVE-18160
             Project: Hive
          Issue Type: Bug
    Affects Versions: 3.0.0
            Reporter: Prasanth Jayachandran
            Assignee: Prasanth Jayachandran


The same jar is localized multiple times during session initialization, so its 
SHA-256 is computed several times, which slows session initialization. Also, 
the default sha256 implementation from commons-codec reads the jar file 
through a 1KB buffer, which is slow (the buffer size is not configurable).
{code}
2017-11-28T00:40:55,795 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:createJarLocalResource(716)) - Computed sha: 
55aa783d2dda0599fb89a37daae2a2efebf0eed0d4f6e99e3ce140d2fa2f0c30 for file: 
file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-exec-3.0.0-SNAPSHOT.jar
 of length: 35.68MB in 241 ms
2017-11-28T00:40:56,105 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:createJarLocalResource(716)) - Computed sha: 
e20986f3a422f8fa5eb61c5a2756cd6f7d2b779dbcab49eae6f2c8dfff7ad2a2 for file: 
file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-llap-tez-3.0.0-SNAPSHOT.jar
 of length: 109.53KB in 1 ms
2017-11-28T00:40:56,353 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:createJarLocalResource(716)) - Computed sha: 
55aa783d2dda0599fb89a37daae2a2efebf0eed0d4f6e99e3ce140d2fa2f0c30 for file: 
file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-exec-3.0.0-SNAPSHOT.jar
 of length: 35.68MB in 231 ms
2017-11-28T00:40:56,602 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:createJarLocalResource(716)) - Computed sha: 
55aa783d2dda0599fb89a37daae2a2efebf0eed0d4f6e99e3ce140d2fa2f0c30 for file: 
file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-exec-3.0.0-SNAPSHOT.jar
 of length: 35.68MB in 241 ms
2017-11-28T00:40:56,612 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:createJarLocalResource(716)) - Computed sha: 
686d66b825fdc4fc241e0591e7646a1bbca1c7114a7224c41da7f4795cf9477a for file: 
file:/work/hadoop/hadoop/hadoop-dist/target/hadoop-2.9.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-registry-2.9.0-SNAPSHOT.jar
 of length: 122.72KB in 2 ms
{code} 

From the logs above, sha256 is computed 3 times for the hive-exec jar, and each 
invocation takes around 240 ms. 
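One possible direction, sketched below under assumptions rather than taken from any actual patch: hash the file with java.security.MessageDigest through a larger read buffer, and remember the digest per jar so that repeated localization of the same file does not rehash it. The class name JarShaCache, the 1 MiB buffer size, and the cache key (path, size, modification time) are all hypothetical choices made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper for illustration; not part of Hive or Tez. */
final class JarShaCache {
    // Assumed buffer size (1 MiB); the right value would need measuring.
    private static final int BUFFER_SIZE = 1 << 20;

    // Digest per (path, size, mtime), so a modified jar is hashed again.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger computations = new AtomicInteger();

    /** Returns the hex SHA-256 of the file, hashing it only on a cache miss. */
    String sha256(Path jar) throws IOException {
        String key = jar.toAbsolutePath().normalize() + "|" + Files.size(jar)
            + "|" + Files.getLastModifiedTime(jar).toMillis();
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        // Two threads racing here may both hash the file; the result is
        // identical, so the duplicate work is harmless in this sketch.
        String digest = computeSha256(jar);
        cache.put(key, digest);
        return digest;
    }

    /** Number of times a file was actually hashed (cache misses). */
    int computations() {
        return computations.get();
    }

    private String computeSha256(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
        computations.incrementAndGet();
        byte[] buffer = new byte[BUFFER_SIZE];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

With a cache like this, the three hive-exec computations in the logs above would collapse to one hash plus two lookups. Whether a larger buffer alone gives a meaningful speedup would have to be benchmarked.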



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
