Srimanth Gunturi created AMBARI-5628:
----------------------------------------

             Summary: Explicitly disabling datanucleus l2 cache for hive
                 Key: AMBARI-5628
                 URL: https://issues.apache.org/jira/browse/AMBARI-5628
             Project: Ambari
          Issue Type: Bug
          Components: client
    Affects Versions: 1.5.1
            Reporter: Srimanth Gunturi
            Assignee: Srimanth Gunturi
             Fix For: 1.6.1


Ambari installations of Hive currently do not set any DataNucleus-related 
properties. DataNucleus has a level 2 (L2) cache which, if enabled, is harmful 
to Hive in a distributed environment. (With a lone embedded Hive instance and 
no other code paths to the database it is fine, but that never happens in a 
distributed environment.)

If no setting is present, DataNucleus defaults the L2 cache to on, so Hive 
overrides that by turning the cache off unless some other setting is 
configured.

Now, in a war of "defaults", the Hive default should win, but this is an area 
where we have had recurring support issues from clients that turn the cache on 
expecting improved performance. Thus, I'd like the Ambari-installed 
hive-site.xml to explicitly have this config parameter turned off, with a 
comment asking users not to switch it on because it impacts Hive negatively.

The parameter in question is "datanucleus.cache.level2.type", and its value 
should be "none". (Note that I've seen some older configs that set things like 
datanucleus.cache.level2 = false. That is a bogus setting: it does nothing and 
should not be assumed to be a catch-all switch for the cache.)

For the comment, I'd like something to the effect of: "Disables datanucleus l2 
cache. This must be set to 'none' for hive to work properly".
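
For illustration, the resulting hive-site.xml entry might look roughly like the 
fragment below. This is only a sketch: the property name and value are the ones 
given above, the description text is the suggested comment, and the exact 
wording and placement in the Ambari-managed file are assumptions.

```xml
<!-- Sketch of the proposed hive-site.xml entry; wording of the description is a suggestion -->
<property>
  <name>datanucleus.cache.level2.type</name>
  <value>none</value>
  <description>Disables datanucleus l2 cache. This must be set to 'none' for hive to work properly.</description>
</property>
```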



--
This message was sent by Atlassian JIRA
(v6.2#6252)
