Re: Hive metastore

Kwanghee Park Tue, 14 Jul 2020 23:18:20 -0700

OMG, it's my mistake!

I apologize to all who read my ugly email.


First of all, I thought about why you want to check the health of the Hive
Metastore process.

As far as I know, Hive Metastore (HMS) has a stateless architecture (it reads
and writes metadata in a DB such as MySQL).

In production, most people register HMS with a service manager such as Linux
systemd.

If HMS is killed by a crash or an exception, it will be restarted
automatically by systemd.

Rather, we should focus more on the health of the DB, the instance running
HMS, and the network.
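
To make the DB part concrete, here is a rough sketch of a latency probe. It
is only an illustration: probe_db and the slow_after threshold are names I
made up, not Hive settings, and sqlite3 stands in for the real metastore DB
so that the example runs anywhere. With MySQL you would pass a connection
from whichever DB-API driver you use.

```python
import sqlite3
import time


def probe_db(conn, query="SELECT 1", slow_after=1.0):
    """Run a trivial query and return (status, elapsed_seconds).

    conn is any DB-API 2.0 connection. slow_after is a hypothetical
    threshold in seconds, chosen only for illustration.
    """
    start = time.monotonic()
    try:
        cur = conn.cursor()
        cur.execute(query)
        cur.fetchone()
        cur.close()
    except Exception:
        # Any failure to run the query is reported as "down".
        return "down", time.monotonic() - start
    elapsed = time.monotonic() - start
    return ("slow" if elapsed > slow_after else "ok"), elapsed


if __name__ == "__main__":
    # sqlite3 is only a stand-in for the metastore's backing database.
    demo = sqlite3.connect(":memory:")
    print(probe_db(demo))
```

A "slow" result from a probe like this is an early hint that the connection
pools in front of the DB may start to fill up.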

In my experience, HMS hangs under the following conditions:
- The DB hangs, or the connection pools between client, HMS, and DB are
exhausted because the DB's query responses are very slow.
- Other processes on the HMS instance compete with it for resources (e.g. CPU,
memory).
- Timeouts occur frequently because the network is unavailable.
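
For the network case, a small sketch like the one below can tell a timeout
apart from a refused connection. check_tcp is a hypothetical helper of mine,
and 9083 is simply the Metastore port used in the probes quoted later in this
thread. An "open" result only means the listener accepts connections; it does
not show that HMS can actually reach its DB.

```python
import socket


def check_tcp(host, port, timeout=3.0):
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within the timeout: typical of an unreachable network.
        return "timeout"
    except ConnectionRefusedError:
        # Host is reachable but nothing is listening on the port.
        return "refused"
    except OSError:
        # Anything else, e.g. name resolution failure or no route to host.
        return "error"


if __name__ == "__main__":
    print(check_tcp("localhost", 9083))
```

If this reports "refused", the process is probably down and systemd should
bring it back; "timeout" points at the network or an overloaded instance.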

If you are considering HA for Hive Metastore, use Hive 4.0.0 or later.
Starting with Hive 4.0, Hive Metastore supports dynamic service discovery
with ZooKeeper.
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration

=======================================================================================================
Korean (한글)

먼저 왜 당신이 하이브 메타스토어 프로세스의 헬스체크를 하려고 하는지 생각해봤습니다

내가 알기론 하이브 메타스토어는 Stateless 구조입니다 (HMS는 MySQL과 같은 DB에서 메타 정보를 읽고 쓰지)

대부분 엔지니어들은 프로덕션 상태의 Linux SystemD와 같은 서비스 스케쥴러에 HMS를 등록합니다

HMS가 크래시나 Exception에 의해서 kill 당해도 Systemd에 의해서 자동으로 재시작 되구요.

차라리 DB, HMS가 설치되어 있는 인스턴스, 그리고 네트워크 상태에 좀더 집중해야 한다고 생각합니다.

하이브 메타스토어가 정상적인 동작을 하지 않는 조건은 아래와 같다고 생각합니다
- DB가 제대로 응답 안 할 때 또는 쿼리 응답이 너무 느려서 Client - HMS - DB 간의 Connection Pool이
가득차서 응답을 못 할 경우
- HMS Instance 안에 다른 프로세스에 의해서 리소스(CPU, Memory) 경쟁 상태가 발생 하는 경우
- 네트워크가 가능하지 않은 상태에서 Timeout이 빈번하게 발생하는 경우

만약 HMS가 불안하다고 느껴져서 HA를 구성해야 한다고 생각하면 Hive 4.0.0 이상 버전을 사용하는게 좋을것 같습니다
Hive 4.0부터는 HMS가 Dynamic Service Discovery 기능을 Zookeeper와 함께 지원합니다.
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration

On Wed, Jul 15, 2020 at 3:00 PM Kwanghee Park <beb...@gmail.com> wrote:

> First of all, I examined that why you want to check the health of Hive
> Meta Store Process
> 먼저 왜 하이브 메타스토어 프로세스의 헬스체크를 하려고 하는지 생각해봤음
>
> As far I know Hive Meta Store(HMS) is stateless architecture(it writes and
> reads meta information from database like mysql)
> 내가 알기론 하이브메타스토어는 Stateless 구조야 (HMS는 MySQL과 같은 DB에서 메타 정보를 읽고 쓰지)
>
> Almost guys will register HMS to the service scheduler like Linux Systemd.
> 대부분 사람들이 Linux SystemD와 같은 서비스 스케쥴러에 HMS를 등록하지
>
> If HMS is killed by crash or exceptions, it will be restarted
> automatically by Systemd.
> HMS가 크래시나 Exception에 의해서 kill 당해도 Systemd에 의해서 자동으로 재시작 될거야
>
> Rather we have to forcus on the healths of database, instance of HMS, and
> Network.
> 차라리 우리는 DB, HMS가 설치되어 있는 인스턴스, 그리고 네트워크 상태에 좀더 집중해야 한다고 생각해
>
> Hive Metastore hang conditions are below
> - Hang the DB or DB connection pool will be exhausted since the query
> responses of DB are very slow.
> - HM Instance is resource race condition by another processes
> - Network unavailable
>
>
> Except these situations, w
>
>
> So I think we don't need to know HMS health status.
>
> so that you don't need to check  you can make restart it automatically by
> operating system (e.g. linux systemd)
>
> If you consider HA of Hive metastore, use Hive 4.0.0 onwards.
> From Hive 4.0, Hive metastore supports Dynamic service discovery with
> Zookeeper.
>
> On Wed, Jul 15, 2020 at 2:50 AM Sungwoo Park <glap...@gmail.com> wrote:
>
>> Hello,
>>
>> We use just TCP readiness/liveness probes checking the Metastore listener
>> port (specified by hive.metastore.port or metastore.thrift.port). I don't
>> know if an HTTP endpoint is available for Metastore.
>>
>>         readinessProbe:
>>           tcpSocket:
>>             port: 9083
>>           initialDelaySeconds: 10
>>           periodSeconds: 20
>>         livenessProbe:
>>           tcpSocket:
>>             port: 9083
>>           initialDelaySeconds: 20
>>           periodSeconds: 20
>>
>> Best,
>>
>> --- Sungwoo
>>
>> On Wed, Jul 15, 2020 at 2:23 AM Eric Pogash <epog...@salesforce.com>
>> wrote:
>>
>>> Ping on this, does anyone know of a health endpoint?
>>>
>>> Eric
>>>
>>>
>>> On Wed, Jul 8, 2020 at 3:04 PM Eric Pogash <epog...@salesforce.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm looking to establish a readiness and liveness probe in kubernetes
>>>> where we are hosting a hive standalone metastore. Is there an http health
>>>> endpoint available for the standalone hive metastore that I can use? If
>>>> not, what is the recommended approach here?
>>>>
>>>> Best,
>>>> Eric Pogash
>>>>
>>>
