[ 
https://issues.apache.org/jira/browse/IMPALA-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463587#comment-16463587
 ] 

Balazs Jeszenszky commented on IMPALA-6729:
-------------------------------------------

[~stiga-huang] just commenting on your experiment. I don't think your 
assumption about small files and block count is safe. Averages based on your 
data are:

{code}
1479550865896345 (total size) / 9098905 (file count) bytes ~= 155MB per file
1479550865896345 (total size) / 1799131 (partition count) bytes ~= 784MB per partition
{code}
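
For anyone who wants to re-check the averages, here is a minimal Python sketch. The totals are the ones quoted above; the variable names are only illustrative, and it assumes "MB" here means 1024*1024 bytes, which is the reading that reproduces the 155 and 784 figures.

```python
# Totals quoted in the comment above.
TOTAL_BYTES = 1479550865896345
FILE_COUNT = 9098905
PARTITION_COUNT = 1799131

MIB = 1024 ** 2  # assumption: "MB" in the comment is binary (1024*1024 bytes)

avg_file_mib = TOTAL_BYTES / FILE_COUNT / MIB
avg_partition_mib = TOTAL_BYTES / PARTITION_COUNT / MIB

print(f"average file size:      {avg_file_mib:.0f} MiB")
print(f"average partition size: {avg_partition_mib:.0f} MiB")
```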

Both of these averages are too low IMO.
After bumping the average file size to 512MB (with the block count scaled by 
0.69 as an assumed equivalent reduction) and the average partition size to 4GB, 
and applying the 
[estimations|https://github.com/apache/impala/blob/branch-2.12.0/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L140-L146],
the estimated catalog size drops from 9.57GB to 3.35GB.

{code:java}
9098905*500+1799131*2048+13621520*150
4549452500 bytes  = 4.24GB from files
3684620288 bytes  = 3.43GB from partitions
2043228000 bytes  = 1.90GB from blocks
10277300788 bytes = 9.57GB sum

2889748*500+361218*2048+(13621520*0.69)*150
1444874000 bytes  = 1.35GB from files
739774464 bytes   = 0.69GB from partitions
1409827200 bytes  = 1.31GB from blocks
3594475664 bytes  = 3.35GB sum
{code}
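
The same estimate can be reproduced with a short Python sketch. The per-file, per-partition and per-block costs (500, 2048 and 150 bytes) are the ones used in the arithmetic above; the function and constant names are hypothetical. The compacted file and partition counts appear to correspond to the total size divided by 512*10^6 and 4096*10^6 bytes; that is an inference from the numbers, not something stated in the comment.

```python
# Per-object memory costs, as used in the arithmetic above.
BYTES_PER_FILE = 500
BYTES_PER_PARTITION = 2048
BYTES_PER_BLOCK = 150
GIB = 1024 ** 3

TOTAL_BYTES = 1479550865896345
BLOCK_COUNT = 13621520


def estimate_catalog_bytes(files, partitions, blocks):
    """Rough catalog memory estimate: a linear sum over files, partitions, blocks."""
    return (files * BYTES_PER_FILE
            + partitions * BYTES_PER_PARTITION
            + blocks * BYTES_PER_BLOCK)


# Current layout, using the counts quoted in the comment.
current = estimate_catalog_bytes(9098905, 1799131, BLOCK_COUNT)

# Compacted layout. Assumption: 512MB and 4GB are taken as 512*10^6 and
# 4096*10^6 bytes, which reproduces the counts 2889748 and 361218.
compacted_files = round(TOTAL_BYTES / 512_000_000)
compacted_partitions = round(TOTAL_BYTES / 4_096_000_000)
compacted_blocks = int(BLOCK_COUNT * 0.69)  # truncated, as in the comment
compacted = estimate_catalog_bytes(
    compacted_files, compacted_partitions, compacted_blocks)

print(f"current:   {current} bytes = {current / GIB:.2f} GiB")
print(f"compacted: {compacted} bytes = {compacted / GIB:.2f} GiB")
```

Changing the 0.69 factor or the target file size in this sketch shows how sensitive the estimate is to those assumptions.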

Not saying this invalidates your idea, but there is a lot to be gained by 
compaction and by reducing partition count in this case.

> Provide startup option to disable file and block location cache
> ---------------------------------------------------------------
>
>                 Key: IMPALA-6729
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6729
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Priority: Major
>         Attachments: Screen Shot 2018-05-04 at 12.12.21 PM.png
>
>
> In HDFS, scheduling PlanFragments according to block locations can improve 
> the locality of queries. However, there are scenarios in which loading and 
> keeping the block locations brings no benefit and sometimes even becomes a 
> burden.
> {panel:title=Scenario 1}
> In a Hadoop cluster with ~1000 nodes, the Impala cluster is deployed on only 
> tens of computation nodes (i.e. nodes with small disks but more memory and 
> powerful CPUs). Data locality is poor since most of the blocks have no 
> replicas on the Impala nodes. Network bandwidth is 1Gbit/s, so remote reads 
> are acceptable. Queries are only required to finish within 5 mins.
>  
> Block location info is useless since the scheduler always comes up with the 
> same plan.
> {panel}
> {panel:title=Scenario 2}
> load_catalog_in_background is set to false since there are several PB of data 
> in the Hive warehouse. If it is set to true, the Impala cluster cannot start 
> up (it waits for the block locations to load, which eventually fills up the 
> memory of catalogd and crashes it).
> The first access to a Hive table containing >10,000 partitions is stuck for a 
> long time. Sometimes it can’t even finish for some large tables. Users are 
> annoyed when they only want to describe the table or select a few partitions 
> of this table.
>  
> Block location info is a burden here since loading it dominates the query 
> time. In the end, only a small portion of the block location info is used.
> {panel}
> {panel:title=Scenario 3}
> There are many ETL pipelines ingesting data into the Hive warehouse. Some 
> tables are updated by replacing the whole data set. Some partitioned tables 
> are updated by inserting new partitions.
> Ad hoc queries are currently served by Presto. To replace Presto with Impala, 
> we would have to add a REFRESH table step at the end of each pipeline, which 
> takes great effort (many code changes in the existing warehouse).
> IMPALA-4272 could solve this but has made no progress. If the file and block 
> location metadata cache could be disabled, things would be simple.
> {panel}
> IMPALA-3127 is related. But we hope it's possible to not keep the block 
> locations.


