Joe McDonnell created IMPALA-7871:
-------------------------------------

             Summary: Don't load Hive builtin jars for dataload
                 Key: IMPALA-7871
                 URL: https://issues.apache.org/jira/browse/IMPALA-7871
             Project: IMPALA
          Issue Type: Improvement
          Components: Infrastructure
    Affects Versions: Impala 3.1.0
            Reporter: Joe McDonnell
            Assignee: Joe McDonnell


One step in dataload is "Loading Hive Builtins", which copies a large number of 
jars into HDFS (or whichever filesystem is in use). This step takes a couple of 
minutes for an HDFS dataload and 8 minutes on S3. Despite its name, I can't 
find any indication that Hive or anything else uses these jars: dataload and 
core tests run fine without this step, and S3 dataload also succeeds without 
it.

Unless we find something using these jars, we should remove this step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

