[ 
https://issues.apache.org/jira/browse/PHOENIX-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715275#comment-14715275
 ] 

Keren Gu commented on PHOENIX-2209:
-----------------------------------

This is from a different run, also creating local indexes via IndexTool. The 
job appears to load the HFiles just fine, but when I look into the index table 
(LC_INDEX_SOJU_PROD_FN in this case), it is empty: 
15/08/25 03:10:35 INFO mapreduce.LoadIncrementalHFiles: Trying to load 
hfile=hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_PROD_FN/_LOCAL_IDX_PH_SOJU_PROD/0/cf1b91d0c0b44de39abebfa1dc5762fb
 first=\x00\x01Transaction_payment_type\x00\xB9X$: 
last=\x00\x01extras_req_email_domainDomainMX_list_item_5\x00\xA0\xFD\x005
15/08/25 03:10:35 INFO mapreduce.LoadIncrementalHFiles: Trying to load 
hfile=hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_PROD_FN/_LOCAL_IDX_PH_SOJU_PROD/0/ff09590406c94a2f9f1952db44d7dc60
 first=\x00\x01PerMinute_market_item_click\x00\xF1\x00\x89\x85 
last=\x00\x01Transaction_payment_type\x00\xB9X"5
15/08/25 03:10:35 INFO mapreduce.LoadIncrementalHFiles: Trying to load 
hfile=hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_PROD_FN/_LOCAL_IDX_PH_SOJU_PROD/0/a7baca724668401b9cc592271ec4c241
 first=\x00\x01EmailBillingNameMatch\x00\xDE\x9A&\x08 
last=\x00\x01HasTxButNoPageView\x00\x9B.\xCB\xC9
15/08/25 03:10:35 INFO index.IndexTool: Removing output directory 
LC_INDEX_SOJU_PROD_FN/_LOCAL_IDX_PH_SOJU_PROD
15/08/25 03:10:36 INFO index.IndexTool:  Updated the status of the index 
LC_INDEX_SOJU_PROD_FN to ACTIVE
15/08/25 03:10:36 INFO client.ConnectionManager$HConnectionImplementation: 
Closing zookeeper sessionid=0x54f4da9c1bcfebf
15/08/25 03:10:36 INFO zookeeper.ZooKeeper: Session: 0x54f4da9c1bcfebf closed
15/08/25 03:10:36 INFO zookeeper.ClientCnxn: EventThread shut down
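
For what it's worth, here is a minimal sketch of how I'm checking the index 
from sqlline (illustrative only; the table name matches the run above, and the 
SYSTEM.CATALOG lookup is just one way to see the index state): 
{quote}
0: jdbc:phoenix:localhost> SELECT COUNT(*) FROM LC_INDEX_SOJU_PROD_FN;
-- returns 0, even though the HFiles above appear to load without errors
0: jdbc:phoenix:localhost> SELECT TABLE_NAME, INDEX_STATE FROM SYSTEM.CATALOG 
    WHERE TABLE_NAME = 'LC_INDEX_SOJU_PROD_FN' AND INDEX_STATE IS NOT NULL;
-- INDEX_STATE shows ACTIVE ('a'), consistent with the IndexTool log line above
{quote}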

> Building Local Index Asynchronously via IndexTool fails to populate index 
> table
> -------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2209
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2209
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.5.0
>         Environment: CDH: 5.4.4
> HBase: 1.0.0
> Phoenix: 4.5.0 (https://github.com/SiftScience/phoenix/tree/4.5-HBase-1.0) 
> with hacks for CDH compatibility. 
>            Reporter: Keren Gu
>              Labels: IndexTool, LocalIndex, index
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Using the asynchronous index population tool to create a local index (on 1 
> column) on tables with 10 columns and 65M, 250M, 340M, and 1.3B rows 
> respectively. 
> Table Schema as follows (with generic column names): 
> {quote}
> CREATE TABLE PH_SOJU_SHORT (
> id INT PRIMARY KEY,
> c2 VARCHAR NULL,
> c3 VARCHAR NULL,
> c4 VARCHAR NULL,
> c5 VARCHAR NULL,
> c6 VARCHAR NULL,
> c7 DOUBLE NULL,
> c8 VARCHAR NULL,
> c9 VARCHAR NULL,
> c10 BIGINT NULL
> )
> {quote}
> Example command used (for 65M row table): 
> {quote}
> 0: jdbc:phoenix:localhost> create index LC_INDEX_SOJU_EVAL_FN on 
> PH_SOJU_SHORT(C4) async;
> {quote}
> And MR job started with command: 
> {quote}
> $ hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table 
> PH_SOJU_SHORT --index-table LC_INDEX_SOJU_EVAL_FN --output-path 
> LC_INDEX_SOJU_EVAL_FN_HFILE
> {quote}
> The IndexTool MR jobs finished in 18min, 77min, 77min, and 2hr 34min 
> respectively, but all index tables were empty. 
> For the table with 65M rows, IndexTool had 12 mappers and reducers. MR 
> Counters show Map input and output records = 65M, Reduce Input and output 
> records = 65M. PhoenixJobCounters input and output records are all 65M. 
> IndexTool Reducer Log tail: 
> {quote}
> ...
> 2015-08-25 00:26:44,687 INFO [main] org.apache.hadoop.mapred.Merger: Down to 
> the last merge-pass, with 32 segments left of total size: 22805636866 bytes
> 2015-08-25 00:26:44,693 INFO [main] 
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output 
> Committer Algorithm version is 1
> 2015-08-25 00:26:44,765 INFO [main] 
> org.apache.hadoop.conf.Configuration.deprecation: hadoop.native.lib is 
> deprecated. Instead, use io.native.lib.available
> 2015-08-25 00:26:44,908 INFO [main] 
> org.apache.hadoop.conf.Configuration.deprecation: mapred.skip.on is 
> deprecated. Instead, use mapreduce.job.skiprecords
> 2015-08-25 00:26:45,060 INFO [main] 
> org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:36:43,880 INFO [main] 
> org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2: 
> Writer=hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/_temporary/attempt_1440094483400_5974_r_000000_0/0/496b926ad624438fa08626ac213d0f92,
>  wrote=10737418236
> 2015-08-25 00:36:45,967 INFO [main] 
> org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:38:43,095 INFO [main] org.apache.hadoop.mapred.Task: 
> Task:attempt_1440094483400_5974_r_000000_0 is done. And is in the process of 
> committing
> 2015-08-25 00:38:43,123 INFO [main] org.apache.hadoop.mapred.Task: Task 
> attempt_1440094483400_5974_r_000000_0 is allowed to commit now
> 2015-08-25 00:38:43,132 INFO [main] 
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of 
> task 'attempt_1440094483400_5974_r_000000_0' to 
> hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/task_1440094483400_5974_r_000000
> 2015-08-25 00:38:43,158 INFO [main] org.apache.hadoop.mapred.Task: Task 
> 'attempt_1440094483400_5974_r_000000_0' done.
> {quote}



