[jira] [Comment Edited] (HBASE-18161) Incremental Load support for Multiple-Table HFileOutputFormat

Densel Santhmayor (JIRA) Wed, 21 Jun 2017 12:42:18 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-18161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057745#comment-16057745
 ]


Densel Santhmayor edited comment on HBASE-18161 at 6/21/17 7:41 PM:
--------------------------------------------------------------------

Edit: Comment got added before I had a chance to finish it. Updated below:
[~jerryhe], I initially answered your comment at the top, but I'll put the 
salient differences here for posterity:

1. The configuration object formerly stored only a single set of settings 
(compression, block encoding, etc.) per column family, and this does not seem 
to be done per table. In 
[HBASE-16261|https://issues.apache.org/jira/browse/HBASE-16261], I think 
there's a bug where these settings are stored in the conf object for the column 
families of only the *last* Table object that was created. In my patch, I have 
upgraded these APIs to store settings per tablename and column family. In the 
future, namespace support can be easily added as well. 

2. In my patch, I require the multiple Tables to be registered through 
MultiHFileOutputFormat.configureIncrementalLoad. This allows me to check if a 
valid table was specified in the write() function and prevent bad data from 
being inadvertently written to HFiles. 

3. I wrote a public interface for MultiHFileOutputFormat that mimics that of 
HFileOutputFormat2, which allows users to switch between the two with ease. It 
also lets us share common code with HFileOutputFormat2 and its tests. This 
means that features added later to HFileOutputFormat2 (such as locality 
sensitivity) will automatically benefit MultiHFileOutputFormat.
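
To illustrate point 1, here is a minimal sketch of indexing a setting by 
tablename and column family rather than by column family alone. It is not the 
code from the patch: the class name, the method names and the '|' separator are 
all hypothetical, and a plain Map stands in for the Hadoop Configuration object.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: settings indexed by table name AND column family. */
final class PerTableFamilySettings {
    // Assumed separator for the composite key; not taken from the patch.
    private static final String SEPARATOR = "|";

    // Stand-in for the values that would be serialized into the job configuration.
    private final Map<String, String> compressionByTableAndFamily = new HashMap<>();

    private static String compositeKey(String table, String family) {
        return table + SEPARATOR + family;
    }

    void setCompression(String table, String family, String codec) {
        compressionByTableAndFamily.put(compositeKey(table, family), codec);
    }

    /** Returns the codec for this table and family, or null if none was set. */
    String getCompression(String table, String family) {
        return compressionByTableAndFamily.get(compositeKey(table, family));
    }

    public static void main(String[] args) {
        PerTableFamilySettings settings = new PerTableFamilySettings();
        // Two tables share a family name but keep independent settings.
        settings.setCompression("table1", "family1", "SNAPPY");
        settings.setCompression("table2", "family1", "GZ");
        System.out.println(settings.getCompression("table1", "family1"));
        System.out.println(settings.getCompression("table2", "family1"));
    }
}
```

With a key made of the column family alone, the second call would overwrite 
the first, which is the "last table wins" behaviour described above.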

The major difference, as you mentioned, is the output key. A major reason I 
decided to write it this way is backwards compatibility: it lets us use the 
same TotalOrderPartitioner as the default use case. The 
SimpleTotalOrderPartitioner code is very simple and straightforward, and 
writing a new partitioner would mean a LOT of new code that hasn't been heavily 
vetted. My approach reduces code duplication and maintenance. Further, any 
upgrades to the shared code benefit multiple table support as well. I don’t 
know if there’s a performance gain by using TotalOrderPartitioner on the rowkey 
vs the one created in 
[HBASE-16261|https://issues.apache.org/jira/browse/HBASE-16261]. I’d assume 
that NOT requiring a copy of the row from the Cell for sorting is a positive.

I think that one challenge of my approach is the requirement to choose a valid 
separator. I chose ‘:’ for now since it cannot exist in a tablename. I do not 
make any assumption about the rowkey. I search for the first separator from the 
left and then check whether the tablename I extracted matches one of the 
registered tables. In rare, inadvertent or malicious cases (for example, if the 
beginning of the rowkey contains a registered tablename and the separator, and 
this rowkey was sent as the output key without a prefix), a bad key could be 
ingested as a row. If the provided api is used to create the key, this 
shouldn’t be an issue, especially since the approach in HBASE-16261 also 
deviates from just sending the rowkey as output key. This approach also allows 
us to incorporate namespaces in the future in a clean way, since we only have 
to modify the createKey api or add a new one, which preserves backwards 
compatibility as well. We’ve been using this approach at Bloomberg in 
production for over 2 years now without issues, though the patch I’m submitting 
is much cleaner and more cohesive. 
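
The key handling described above can be sketched roughly as follows. This is 
an illustration under stated assumptions, not the code from the patch: the 
class and method names are hypothetical, and only the ‘:’ separator, the 
"first separator from the left" rule and the check against registered tables 
come from the description.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Set;

/** Hypothetical sketch of a table-prefixed output key: tablename ':' rowkey. */
final class TablePrefixedKey {
    private static final byte SEPARATOR = (byte) ':';

    /** Builds the output key by prefixing the table name and separator to the row key. */
    static byte[] createKey(String table, byte[] rowKey) {
        byte[] tableBytes = table.getBytes(StandardCharsets.UTF_8);
        byte[] key = new byte[tableBytes.length + 1 + rowKey.length];
        System.arraycopy(tableBytes, 0, key, 0, tableBytes.length);
        key[tableBytes.length] = SEPARATOR;
        System.arraycopy(rowKey, 0, key, tableBytes.length + 1, rowKey.length);
        return key;
    }

    /** Index of the first separator from the left, or -1 if there is none. */
    private static int separatorIndex(byte[] key) {
        for (int i = 0; i < key.length; i++) {
            if (key[i] == SEPARATOR) {
                return i;
            }
        }
        return -1;
    }

    /** Extracts the table name and rejects keys whose table was never registered. */
    static String tableOf(byte[] key, Set<String> registeredTables) {
        int idx = separatorIndex(key);
        if (idx < 0) {
            throw new IllegalArgumentException("Key has no table separator");
        }
        String table = new String(key, 0, idx, StandardCharsets.UTF_8);
        if (!registeredTables.contains(table)) {
            throw new IllegalArgumentException("Table not registered: " + table);
        }
        return table;
    }

    /** Returns everything after the first separator; the row key may itself contain ':'. */
    static byte[] rowOf(byte[] key) {
        int idx = separatorIndex(key);
        if (idx < 0) {
            throw new IllegalArgumentException("Key has no table separator");
        }
        return Arrays.copyOfRange(key, idx + 1, key.length);
    }
}
```

Because only the first separator is significant, a rowkey such as "a:b" 
survives the round trip unchanged. The sketch also shows the caveat: a raw 
rowkey that happens to start with a registered tablename and ‘:’ would be 
accepted if it were emitted without going through createKey.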


was (Author: denselm):
[~jerryhe], I answered a few of your questions after you asked them at the top 
of this comment page, but I'll put them all together here for posterity:

1. The configuration object stores only a single set of settings 
(compression, block encoding, etc.) per column family, and this does not seem 
to be done per table. 

> Incremental Load support for Multiple-Table HFileOutputFormat
> -------------------------------------------------------------
>
>                 Key: HBASE-18161
>                 URL: https://issues.apache.org/jira/browse/HBASE-18161
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Densel Santhmayor
>            Priority: Minor
>         Attachments: MultiHFileOutputFormatSupport_HBASE_18161.patch, 
> MultiHFileOutputFormatSupport_HBASE_18161_v2.patch, 
> MultiHFileOutputFormatSupport_HBASE_18161_v3.patch, 
> MultiHFileOutputFormatSupport_HBASE_18161_v4.patch, 
> MultiHFileOutputFormatSupport_HBASE_18161_v5.patch, 
> MultiHFileOutputFormatSupport_HBASE_18161_v6.patch, 
> MultiHFileOutputFormatSupport_HBASE_18161_v7.patch, 
> MultiHFileOutputFormatSupport_HBASE_18161_v8.patch
>
>
> h2. Introduction
> MapReduce currently supports the ability to write HBase records in bulk to 
> HFiles for a single table. The file(s) can then be uploaded to the relevant 
> RegionServers with reasonable latency. This feature is useful for making a 
> large set of data available for queries at the same time, and it provides a 
> way to efficiently load very large input into HBase without affecting query 
> latencies.
> There is, however, no support to write variations of the same record key to 
> HFiles belonging to multiple HBase tables from within the same MapReduce job. 
>  
> h2. Goal
> The goal of this JIRA is to extend HFileOutputFormat2 to support writing to 
> HFiles for different tables within the same MapReduce job while keeping 
> single-table HFile features backwards-compatible. 
> For our use case, we needed to write a record key to a smaller HBase table 
> for quicker access, and the same record key with a date appended to a larger 
> table for longer term storage with chronological access. Each of these tables 
> would have different TTL and other settings to support their respective 
> access patterns. We also needed to be able to bulk write records to multiple 
> tables with different subsets of very large input as efficiently as possible. 
> Rather than run the MapReduce job multiple times (one for each table or 
> record structure), it would be useful to be able to parse the input a single 
> time and write to multiple tables simultaneously.
> Additionally, we'd like to maintain backwards compatibility with the existing 
> heavily-used HFileOutputFormat2 interface so that benefits such as locality 
> sensitivity (which was introduced long after we implemented support for 
> multiple tables) apply to both single-table and multi-table HFile writes. 
> h2. Proposal
> * Backwards compatibility for existing single table support in 
> HFileOutputFormat2 will be maintained and in this case, mappers will need to 
> emit the table rowkey as before. However, a new class - 
> MultiHFileOutputFormat - will provide a helper function for mappers that 
> generates a rowkey by prefixing the desired tablename to the existing rowkey, 
> and will provide configureIncrementalLoad support for multiple tables.
> * HFileOutputFormat2 will be updated in the following way:
> ** configureIncrementalLoad will now accept multiple table descriptor and 
> region locator pairs, analogous to the single pair currently accepted by 
> HFileOutputFormat2. 
> ** Compression, Block Size, Bloom Type and Datablock settings PER column 
> family that are set in the Configuration object are now indexed and retrieved 
> by tablename AND column family
> ** getRegionStartKeys will now support multiple regionlocators and calculate 
> split points and therefore partitions collectively for all tables. Similarly, 
> now the eventual number of Reducers will be equal to the total number of 
> partitions across all tables. 
> ** The RecordWriter class will be able to process rowkeys either with or 
> without the tablename prepended, depending on whether 
> configureIncrementalLoad was called through MultiHFileOutputFormat or 
> HFileOutputFormat2.
> * The use of MultiHFileOutputFormat will write the output into HFiles which 
> will match the output format of HFileOutputFormat2. However, while the 
> default use case will keep the existing directory structure with column 
> family name as the directory and HFiles within that directory, in the case of 
> MultiHFileOutputFormat, it will output HFiles in the output directory with 
> the following relative paths: 
> {noformat}
>      --table1 
>        --family1 
>          --HFiles 
>      --table2 
>        --family1 
>        --family2 
>          --HFiles
> {noformat}
> This aims to be a comprehensive solution to the original tickets - HBASE-3727 
> and HBASE-16261. Thanks to [~clayb] for his support.
> The patch will be attached shortly.
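
As an illustration of the getRegionStartKeys bullet in the proposal above, the 
following sketch combines the region start keys of several tables into one 
sorted list of table-prefixed keys. The class and method names are 
hypothetical, the ‘:’ separator follows the comment above, and Strings are 
used for brevity on the assumption that keys are plain ASCII (real HBase keys 
are byte arrays compared lexicographically).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch: collective start keys across all registered tables. */
final class CombinedStartKeys {
    // Prefixes each region start key with its table name and sorts the union.
    static List<String> combine(Map<String, List<String>> startKeysByTable) {
        TreeSet<String> sorted = new TreeSet<>();
        for (Map.Entry<String, List<String>> entry : startKeysByTable.entrySet()) {
            for (String startKey : entry.getValue()) {
                sorted.add(entry.getKey() + ":" + startKey);
            }
        }
        return new ArrayList<>(sorted);
    }
}
```

For a table "t1" with region start keys "" and "m", and a table "t2" with 
"", "g" and "t", this yields five prefixed keys grouped by table, matching the 
statement that the number of partitions is the total across all tables.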



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
