Hi Gautam

Where you wrote

"Have you tried putting multiple metastores behind a load balancer"

it should read

"Have you tried putting multiple *metastore services* ..."

Basically there is only one backend database, AKA the metastore. Unless you
have set up bi-directional replication on the database, all these services
can only write to that one database.
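
To illustrate, here is a minimal sketch of such a setup. It assumes two
metastore services on hypothetical hosts ms1 and ms2 and a MySQL database on
a hypothetical host dbhost. The host names, the MySQL port and the database
name are placeholders and not taken from this thread. 9083 is the usual
default metastore port.

```xml
<!-- hive-site.xml on each metastore service host (ms1 and ms2 are
     hypothetical). Both services point at the same backend database. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://dbhost:3306/hive_metastore</value>
</property>

<!-- hive-site.xml on the clients (for example HiveServer2). List both
     metastore services here, or put a single load-balancer address in
     front of them instead. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://ms1:9083,thrift://ms2:9083</value>
</property>
```

Either way, both services read and write the same database, so this spreads
the Thrift load but does not remove a bottleneck on the database side.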

Hi Udit,

Have you tried looking at wait events or the equivalent in MySQL to see what
these threads are waiting for? I don't know MySQL well; we use Oracle DB.
You may have a bottleneck on your database or concurrency issues.


HTH


Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 30 March 2016 at 23:20, Gautam <gautamkows...@gmail.com> wrote:

> The metastore service is a Java process that is a Thrift server, so you
> can run multiple such Hive metastore instances with
> "javax.jdo.option.ConnectionURL" pointing to the same MySQL db.
>
> On Wed, Mar 30, 2016 at 3:11 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>>
>>
>> Can you clarify this, please?
>>
>> "Have you tried putting multiple metastores behind a load balancer"
>>
>> Are you implying that the metastore and the backend DB are different
>> entities here?
>>
>> As far as I know, $HIVE_HOME/bin/hive --service metastore & starts Hive
>> threads to the backend database/metastore, and HiveServer2 acts as a
>> gateway for remote access to the Hive metastore through beeline or other
>> clients.
>>
>> There is only one metastore here, namely MySQL, Oracle or others.
>>
>> Thanks
>>
>>
>> Dr Mich Talebzadeh
>>
>> On 30 March 2016 at 22:53, Gautam <gautamkows...@gmail.com> wrote:
>>
>>> Can you elaborate on where you see the bottleneck? A general overview
>>> of your access path would be useful. For instance, are you accessing the
>>> Hive metastore via HiveServer2, from WebHCat using the embedded CLI, or
>>> something else?
>>>
>>> Have you tried putting multiple metastores behind a load balancer? It's
>>> just a Thrift service over MySQL, so you can have multiple instances
>>> pointing to the same backend db.
>>>
>>> On Wed, Mar 30, 2016 at 2:28 PM, Udit Mehta <ume...@groupon.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> We are currently running Hive in production and staging, with the
>>>> metastore connecting to a MySQL database in the backend. Production
>>>> sees more metastore traffic than staging, which is expected.
>>>> We have had a sudden increase in traffic, which has led to metastore
>>>> operations taking a lot longer than before. The same query on staging
>>>> takes a lot less time because of the lighter traffic on the staging
>>>> cluster.
>>>>
>>>> We tried increasing the heap space for the metastore process as well as
>>>> bumping up the memory for the MySQL database. Neither change seemed to
>>>> help much and we still see delays. Is there any other config we can
>>>> increase to counter this increased traffic? I am looking at the config
>>>> for max threads as well, but I'm not sure if this is the right path ahead.
>>>>
>>>> I'm wondering if the metastore is a bottleneck here or if I'm missing
>>>> something.
>>>>
>>>> Looking forward to your reply,
>>>> Udit
>>>>
>>>
>>>
>>>
>>> --
>>> "If you really want something in this life, you have to work for it.
>>> Now, quiet! They're about to announce the lottery numbers..."
>>>
>>
>>
