Hi,

You have correctly identified most of the important issues with the current
metadata design :). Most of these issues are tracked in various JIRAs
(search for catalog-server label) and we do plan to work on them sometime
in the future. For instance, we're actively working on a per-partition
REFRESH statement that will significantly reduce the cost of running
REFRESH on a partitioned table when only a few partitions have been modified
by external ingestion processes. That said, we do accept patches from
external contributors, so if you'd like to help improve Impala's metadata
management story, you're more than welcome to work on some of these JIRAs.
I would recommend starting with something small or medium-sized rather than
something that overhauls the entire metadata design.
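
For context, a per-partition REFRESH along these lines might look something
like the following (table and partition names are made up, and the exact
syntax was still being worked out at the time):

```sql
-- Reload file metadata for just the partition touched by external ingestion
REFRESH mydb.events PARTITION (ds='2016-07-13');

-- versus reloading file metadata for every partition of the table
REFRESH mydb.events;
```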

Can you also share some more information on what kind of performance
analysis you're looking for? What are the key dimensions you're interested
in? E.g. cluster size, number of tables/files/blocks/partitions, concurrency.

Dimitris

On Wed, Jul 13, 2016 at 6:57 PM, 何天一 <[email protected]> wrote:

> Hi all.
> Thanks for the kind replies.
>
> I can bring up a partial list of problems that we faced in production:
> * metadata refreshing becomes more expensive when HDFS centralized caching is
> enabled (cached locations are also stored, so file metadata tends to change
> from time to time).
> * memory consumption of impalad, statestored, and catalogd grows as the
> number of partitions and files increases, which makes scaling out expensive
> (total reserved memory keeps increasing). There is also the metadata size
> issue <https://issues.cloudera.org/browse/IMPALA-3499>.
> * we took the simplest approach: when users finish an 'INSERT' with Hive or
> other tools, they should then execute 'REFRESH' on Impala. This may randomly
> fail (we haven't investigated yet), leaving metadata stale and causing
> correctness issues in subsequent queries (empty results, for example).
> * a rolling restart of impalad may exhaust the bandwidth of the node running
> statestored (given that the metadata is large). On the other hand,
> increasing the sleep period between restart operations brings down
> availability (running queries fail whenever an impalad restarts).
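
The INSERT-then-REFRESH workflow described above can be sketched as a small
retry wrapper; since a single REFRESH may randomly fail, retrying with backoff
at least narrows the window of stale metadata. This is a hypothetical sketch,
not an existing Impala API: the `execute` callback stands in for whatever
client actually issues statements (impala-shell, or a client library), and the
table name is made up.

```python
import time

def refresh_with_retry(execute, stmt, attempts=3, backoff_s=1.0):
    """Run `stmt` via `execute`, retrying with exponential backoff.

    `execute` is any callable that issues a statement to Impala and
    raises on failure. Returns the number of attempts used; re-raises
    the last error if every attempt fails.
    """
    for i in range(1, attempts + 1):
        try:
            execute(stmt)
            return i
        except Exception:
            if i == attempts:
                raise  # out of attempts; surface the failure to the caller
            time.sleep(backoff_s * (2 ** (i - 1)))  # 1s, 2s, 4s, ...
```

An ingestion pipeline would call this right after the Hive INSERT commits,
e.g. `refresh_with_retry(client.execute, "REFRESH mydb.events")`, and alert on
the final failure instead of silently serving stale metadata.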
>
> I don't have a solution to these problems yet, either.
>
> But I wonder if it's worth making the system simpler as a tradeoff for
> availability.
> (As you know, it's too early to talk about performance when availability
> doesn't meet the needs.)
>
> Would someone share any analysis of query performance regarding the Hive
> metadata cache and the HDFS metadata cache?
>
>
> On Thu, Jul 14, 2016 at 9:13 AM, Henry Robinson <[email protected]>
> wrote:
>
>> Hi Tianyi -
>>
>> The idea behind the metadata cache is to try to ensure that no RPCs to
>> the metastore are on the critical path of query planning. However, the way
>> we populate and invalidate the cache does cause us some problems; the
>> statestore was never designed to carry this kind of payload and this can
>> cause difficulties in large clusters.
>>
>> We are aware of the problems, and hope to do something about them, but
>> don't have concrete plans just yet. If you have thoughts, please bring them
>> up on this dev@ mailing list and we can discuss them!
>>
>> Thanks,
>> Henry
>>
>> On 12 July 2016 at 13:34, Huaisi Xu <[email protected]> wrote:
>>
>>> Hi Tianyi, thanks for contacting us!
>>>
>>>
>>>
>>> Could you elaborate on the biggest problems you are facing with this design?
>>>
>>>
>>>
>>> As we are moving to the ASF, you can ask questions at
>>> [email protected].
>>>
>>>
>>>
>>> I think for questions regarding design decisions and future improvements,
>>> +Dimitris and +Henry know better.
>>>
>>>
>>>
>>>
>>>
>>> Huaisi
>>>
>>>
>>>
>>> *From: *何天一 <[email protected]>
>>> *Date: *Monday, July 11, 2016 at 10:42 PM
>>> *To: *Huaisi Xu <[email protected]>
>>> *Subject: *Looking for OLAP suggestion
>>>
>>>
>>>
>>> Hi, Huaisi.
>>>
>>>
>>>
>>> We communicated before in the Cloudera JIRA (IMPALA-3499
>>> <https://issues.cloudera.org/browse/IMPALA-3499>). I am currently
>>> working on distributed storage and computing, including OLAP engines, at
>>> 今日头条 (Toutiao).
>>>
>>>
>>>
>>> I am looking for technical suggestions and hope you could help.
>>>
>>>
>>>
>>> I see that Impala Catalogd caches metadata from Hive Metastore and HDFS
>>> (or other storage).
>>>
>>> IMHO this can be considered a good performance optimization.
>>>
>>> However, in our production environment, this mechanism tends to cause
>>> problems.
>>>
>>> Could you help explain the design choice behind this? Why did Impala
>>> cache metadata in the first place? And is there any optimization in progress
>>> to make the mechanism better?
>>>
>>>
>>>
>>> Thanks.
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Cheers,
>>>
>>> Tianyi HE
>>>
>>> (+86) 185 0042 4096
>>>
>>
>>
>>
>> --
>> Henry Robinson
>> Software Engineer
>> Cloudera
>> 415-994-6679
>>
>
>
>
> --
> Cheers,
> Tianyi HE
> (+86) 185 0042 4096
>
