[jira] [Commented] (HUDI-53) Implement Record level Index to map a record key to a <partition path, FileID> pair #90

Prashant Wason (Jira) Fri, 13 May 2022 17:51:05 -0700


    [ 
https://issues.apache.org/jira/browse/HUDI-53?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536945#comment-17536945
 ]


Prashant Wason commented on HUDI-53:
------------------------------------

[Schema for the record 
index|https://github.com/apache/hudi/pull/5581/files#diff-66abb79a1d28adea3315c48a6fc334247c7fb9a795bd59f093f9c8bc2da1a91a]
 optimized to save the record_key -> fileID mapping. It currently achieves 
about 48 to 50 bytes per mapping stored in the record index with GZIP 
compression. I think with ZSTD we may reduce it by a few more bytes. 
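
To put the per-mapping figure in perspective, here is a back-of-the-envelope sizing sketch. It is not code from the PR; the class and method names are made up for illustration, and only the 48 to 50 bytes per mapping figure comes from the measurement above.

```java
// Illustrative sizing helper (hypothetical, not part of Hudi).
final class RecordIndexSizing {
    private RecordIndexSizing() {}

    /** Estimated compressed bytes needed to store the given number of mappings. */
    static long estimatedBytes(long mappings, int bytesPerMapping) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping silently.
        return Math.multiplyExact(mappings, (long) bytesPerMapping);
    }
}
```

For example, 1 billion mappings at 48 to 50 bytes each would need roughly 48 to 50 GB (decimal) in the record index.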

 

[Various 
configs|https://github.com/apache/hudi/pull/5581/files#diff-11e9ef6bd53ef1001b669a1dc68dde2aba9b33c9eb72cc1e4198750336d79772]
 for the MT record index

 

[A record key 
iterator|https://github.com/apache/hudi/pull/5581/files#diff-afa34d95ad0690283d7a741ccfe1d3fc7df9e2f561bc9cc9c5ba25fa3b57a30b]
 for the various base file readers 
([HFileReader|https://github.com/apache/hudi/pull/5581/files#diff-0abe0627b252c5eef221374b5e91f34d09f457f52e5d9798aee5ef79111c5adb],
[ORCReader|https://github.com/apache/hudi/pull/5581/files#diff-3abdb5ba0f56065ad767e0b5690a80493fcefccdfbc8e3500a3e68f0f8f6ca8b],
[ParquetReader|https://github.com/apache/hudi/pull/5581/files#diff-d7264f7fc03aefba56a28e84cf897ad88f1e99e79a107df8ba27b546514ce1e4]).
 This is used for reading the keys while initializing the record index.

 

1. Changed enabling of metadata table partitions
In the current code, we enable and check for metadata table partitions through 
the WriteConfig. This does not bode well for synchronous updates to the table, 
as a faulty config will render the MT inconsistent.

[Changed this to save the enable state of a MT partition in the 
hoodie.properties file post 
initialization|https://github.com/apache/hudi/pull/5581/files#diff-53ae78ff1f1bd5d8b0f87cb69853299e5228b44f30b770e27f60c0c3c27d4185].
 The checks would now be 
{{table.getMetaClient().getTableConfig().isMetadataTableEnabled()}} instead of 
{{config.isMetadataTableEnabled()}}.
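
The idea of persisting the enabled state with the table, rather than trusting each writer's config, can be sketched roughly as below. This is only an illustration using java.util.Properties as a stand-in for hoodie.properties; the class name and the property key are hypothetical and are not the ones used in the PR.

```java
import java.util.Arrays;
import java.util.Properties;

// Hypothetical sketch: table-level properties are the source of truth for enabled MT partitions.
final class MetadataPartitionState {
    // Illustrative key only; NOT the actual key written to hoodie.properties.
    static final String ENABLED_PARTITIONS_KEY = "example.metadata.partitions.enabled";

    private MetadataPartitionState() {}

    /** Records a partition as enabled once its initialization has completed. */
    static void markEnabled(Properties tableProps, String partition) {
        if (isEnabled(tableProps, partition)) {
            return; // already recorded, keep the list free of duplicates
        }
        String current = tableProps.getProperty(ENABLED_PARTITIONS_KEY, "");
        tableProps.setProperty(ENABLED_PARTITIONS_KEY,
            current.isEmpty() ? partition : current + "," + partition);
    }

    /** Checks the persisted table state; the writer's own config is not consulted. */
    static boolean isEnabled(Properties tableProps, String partition) {
        String current = tableProps.getProperty(ENABLED_PARTITIONS_KEY, "");
        return Arrays.asList(current.split(",")).contains(partition);
    }
}
```

With this shape, a writer started with a faulty config still sees the partition as enabled, because the answer comes from the table's saved state.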

A new Hoodie Index type RECORD_INDEX [and its 
implementation|https://github.com/apache/hudi/pull/5581/files#diff-b22610e17825aeccb587f64b3dd0fedfe428d4f33b0d2a25a8d258a23cd66323].
 This index uses the MT record_index partition to perform the update and tag 
operations.

By default, HoodieWriteHandle does not track the written records within 
WriteStatus for memory optimization. But with MT partitions like record_index, 
we need access to the information about records inserted/updated into the 
dataset. Hence, [we need to track written records within 
WriteStatus|https://github.com/apache/hudi/pull/5581/files#diff-63a77e05c924278c190061a1a18a992a7f9480af14f0f34f4328bf72ae673fe9]
 in two cases:
      1. When the HoodieIndex being used is not implicit with storage
      2. If any of the metadata table partitions (record index, etc) which 
require written record tracking are enabled

 

File groups in each partition are fixed at creation time, and we do not want 
them to ever be split into multiple files. [Hence we use a very large base file 
size|https://github.com/apache/hudi/pull/5581/files#diff-b20dd7a7d374928dc9936cc33789ba1839da3a10883fc65d62fcccf84b81ed4f]
 in the metadata table.

In the metadata table, the [log blocks should be as large as the log file max 
size|https://github.com/apache/hudi/pull/5581/files#diff-b20dd7a7d374928dc9936cc33789ba1839da3a10883fc65d62fcccf84b81ed4f].
 This reduces the overall number of log blocks and speeds up lookup of keys in 
HFileLogBlocks.

 

[Initialization of the record 
index|https://github.com/apache/hudi/pull/5581/files#diff-b20dd7a7d374928dc9936cc33789ba1839da3a10883fc65d62fcccf84b81ed4f]
 for all engines by reading keys from base files.

 

[Estimates the file group count to use for a MT 
partition|https://github.com/apache/hudi/pull/5581/files#diff-b20dd7a7d374928dc9936cc33789ba1839da3a10883fc65d62fcccf84b81ed4f].
 Different partitions save different amounts of information and hence need 
separate file group counts. This is hard to estimate manually when thousands of 
datasets are involved in a production rollout. This code estimates a suitable 
file group count for a partition by default. The WriteConfig can still be used 
to override it or provide a manual value.
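
One plausible way to arrive at such an estimate is to divide the expected data volume by a target file group size and clamp the result. The sketch below is an assumption about the general approach, not the logic in the PR; all names, parameters and bounds are hypothetical.

```java
// Hypothetical sketch: estimate how many file groups an MT partition needs.
final class FileGroupCountEstimator {
    private FileGroupCountEstimator() {}

    static int estimate(long recordCount, int avgBytesPerRecord, long maxFileGroupBytes,
                        int minCount, int maxCount) {
        if (recordCount < 0 || avgBytesPerRecord <= 0 || maxFileGroupBytes <= 0
                || minCount < 1 || maxCount < minCount) {
            throw new IllegalArgumentException("invalid sizing parameters");
        }
        long totalBytes = Math.multiplyExact(recordCount, (long) avgBytesPerRecord);
        // Ceiling division: a partially filled file group still counts as one.
        long needed = Math.addExact(totalBytes, maxFileGroupBytes - 1) / maxFileGroupBytes;
        // Clamp so tiny tables get at least minCount and huge ones stay bounded.
        return (int) Math.max(minCount, Math.min(maxCount, needed));
    }
}
```

For instance, 1 billion records at 48 bytes each with a 1 GB (decimal) target per file group gives 48 file groups under these assumptions.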

 

[BulkInsert for MT when a partition is being 
initialized|https://github.com/apache/hudi/pull/5581/files#diff-51e81a343e90f5c52e69c184b4eb6718542affde99dee3c85af9edb6425a5e19]
 for the first time. This has various benefits for scale:

 - Avoids the Workload Profile, which needs lots of memory and is slow

 - Is fast: 270 billion records indexed in 7.5 hrs 
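
As a quick sanity check on the throughput implied by that number, here is a small arithmetic sketch (the helper is hypothetical and only restates the figures quoted above):

```java
// Illustrative arithmetic only: average indexing rate from a record count and a duration.
final class IndexingThroughput {
    private IndexingThroughput() {}

    static double recordsPerSecond(long records, double hours) {
        return records / (hours * 3600.0); // 3600 seconds per hour
    }
}
```

270 billion records over 7.5 hours (27,000 seconds) works out to an average of about 10 million records per second.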

 

[A new BulkInsertPartitioner for 
MT|https://github.com/apache/hudi/pull/5581/files#diff-65089796097739c8b1a6b5be58cc4a9d15c8f754f55f25b03e7a2871cfe5e9d3]
 which is required for sharding the records into the correct file groups.
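
Sharding here means that a given record key must always land in the same file group. A minimal sketch of that idea follows; the CRC32-modulo hashing is an assumption chosen for illustration and is not necessarily the hash used by Hudi, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch: deterministic mapping of a record key to a file group index.
final class RecordKeySharding {
    private RecordKeySharding() {}

    static int fileGroupIndex(String recordKey, int fileGroupCount) {
        if (fileGroupCount <= 0) {
            throw new IllegalArgumentException("fileGroupCount must be positive");
        }
        CRC32 crc = new CRC32();
        crc.update(recordKey.getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative 32-bit value held in a long, so the modulo is non-negative.
        return (int) (crc.getValue() % fileGroupCount);
    }
}
```

Because the mapping depends only on the key and the file group count, it only stays stable while the file group count is fixed, which fits with file groups being fixed at creation time.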

[Metrics for HUDI Bloom 
Indexes|https://github.com/apache/hudi/pull/5581/files#diff-04cb169f456ebe056b91868bcacea9eb8e26a99816b777ca9d84ffe5eb8521a7]
  and 
[HBaseIndex|https://github.com/apache/hudi/pull/5581/files#diff-5fc348b9beea8b086a96808f76f9527a6076334d631d45c5cefb261e7155cad4].
 These are useful for comparison between the indexes as well as for debugging 
ingestion issues.

 

 

[Parallel reading of keys from 
MT|https://github.com/apache/hudi/pull/5581/files#diff-7c43aea81a02b4f135452b50eaa36d5868081e72b37d43101ca9de1f9ebb5195]
 with an interface optimized for a large number of reads (millions of 
tagLocations etc). The existing interface uses List<>, which is less performant 
than using a Map.
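
To illustrate why a Map-shaped result helps, here is a rough sketch of a lookup that returns key to location pairs. It is not the interface from the PR; the class name, the String location type and the in-memory index are all stand-ins for illustration.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: bulk key lookup that returns a Map instead of a List.
final class KeyLookupSketch {
    private KeyLookupSketch() {}

    /** Returns a location for every requested key that is present in the index. */
    static Map<String, String> lookup(Map<String, String> index, Collection<String> keys) {
        Map<String, String> found = new HashMap<>();
        for (String key : keys) {
            String location = index.get(key);
            if (location != null) {
                found.put(key, location); // missing keys are simply absent from the result
            }
        }
        return found;
    }
}
```

With a Map result, the caller can tag each incoming record with a single hash lookup rather than scanning or re-sorting a List of results.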

 

> Implement Record level Index to map a record key to a <partition path, 
> FileID> pair #90
> ---------------------------------------------------------------------------------------
>
>                 Key: HUDI-53
>                 URL: https://issues.apache.org/jira/browse/HUDI-53
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: metadata, writer-core
>    Affects Versions: 0.9.0
>            Reporter: Vinoth Chandar
>            Assignee: Prashant Wason
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.12.0
>
>
> [https://github.com/uber/hudi/issues/90] 
>  
> feature-enquiry
>  * [https://github.com/apache/hudi/issues/4058]
>  *  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
