Re: monitoring status of CREATE INDEX operation

James Taylor Fri, 12 Aug 2016 07:59:32 -0700

In your IndexTool invocation, try using all caps for your table and index
names. Phoenix normalizes names by upper-casing them (unless they're in
double quotes).
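
As a rough sketch of what that looks like for the command quoted below: the
helper name phoenix_normalize is made up for illustration and is not part of
Phoenix. It only mimics the rule described above (upper-case a name unless it
is wrapped in double quotes). The schema name is assumed to be normalized the
same way as the table and index names. The script prints the command instead
of running it, so it can be checked first.

```shell
#!/bin/sh
# Hypothetical helper (not a Phoenix tool): upper-case a name unless it is
# wrapped in double quotes, in which case it is passed through unchanged.
phoenix_normalize() {
  case "$1" in
    \"*\") printf '%s\n' "$1" ;;
    *)     printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' ;;
  esac
}

# Names taken from the CREATE INDEX statement in the quoted message.
SCHEMA=$(phoenix_normalize trans)
DATA_TABLE=$(phoenix_normalize event)
INDEX_TABLE=$(phoenix_normalize event_object_id_idx_b)

# Print the IndexTool command for review; drop the "echo" to actually run it.
echo hbase org.apache.phoenix.mapreduce.index.IndexTool \
  --schema "$SCHEMA" --data-table "$DATA_TABLE" \
  --index-table "$INDEX_TABLE" \
  --output-path EVENT_OBJECT_ID_IDX_B_HFILES
```

With the names from this thread, the arguments come out as --schema TRANS
--data-table EVENT --index-table EVENT_OBJECT_ID_IDX_B.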


One other unrelated question: did you declare your event table with
IMMUTABLE_ROWS=true (assuming it's a write-once table)? If not, you can use
the ALTER TABLE trans.event SET IMMUTABLE_ROWS=true command to change it.
This will give you some performance benefit.

Thanks,
James

On Fri, Aug 12, 2016 at 7:39 AM, Nathan Davis <[email protected]>
wrote:

> Thanks for the detailed info. I took the advice of using the ASYNC method.
> The CREATE statement executes fine and I end up with an index table showing
> in state BUILDING. When I kick off the MR job with `hbase
> org.apache.phoenix.mapreduce.index.IndexTool --schema trans --data-table
> event --index-table event_object_id_idx_b --output-path
> EVENT_OBJECT_ID_IDX_B_HFILES` I get this odd error:
>
> 2016-08-12 14:29:40,073 ERROR [main] index.IndexTool:  An exception
>> occured while performing the indexing job : 
>> java.lang.IllegalArgumentException:
>>  trans.event_object_id_idx_b is not an index table for trans.event
>> at org.apache.phoenix.mapreduce.index.IndexTool.run(IndexTool.java:187)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>> at org.apache.phoenix.mapreduce.index.IndexTool.main(IndexTool.java:378)
>
>
>
>
> My CREATE INDEX was as follows:
>
> create index if not exists event_object_id_idx_b on trans.event (
>>     object_id
>> ) ASYNC UPDATE_CACHE_FREQUENCY=60000;
>
>
>
> On Thu, Aug 11, 2016 at 9:40 PM, James Taylor <[email protected]>
> wrote:
>
>> Hi Nathan,
>> If your table is large, I'd recommend creating your index asynchronously.
>> To do that, you'd add the ASYNC keyword to the end of your CREATE INDEX
>> call. In this case, the index will be built through MapReduce in a more
>> resilient manner (i.e. the client going up or down won't impact it and you
>> have the regular retries of a MR job). On AWS, you'll need to manually
>> start the MR job, but at SFDC we have a CRON job that'll start it for you
>> automatically (this is open source too, so it'd be good to get that up and
>> running in AWS as well). See
>> https://phoenix.apache.org/secondary_indexing.html#Index_Population for
>> details.
>>
>> If you don't run it asynchronously, then you'll need to increase the
>> query timeout (i.e. phoenix.query.timeoutMs config property) to be larger
>> than the time it'll take to build the index. If the client goes down before
>> the CREATE INDEX call finishes (or the query times out), then the index
>> build will stop (and unfortunately will need to be run again).
>>
>> To monitor the index build, there are a few ways - if running through MR,
>> then you can monitor the MR job in the standard way. If running
>> synchronously, you can monitor the index table - you'll see new regions
>> created as splits occur, or you could query the SYSTEM.STATS table (which
>> gets populated as splits and compactions happen), or you could run a
>> count(*) query directly against the index table (though that'll put more
>> load on your system because it'll require a full table scan).
>>
>> HTH. Thanks,
>> James
>>
>> On Thu, Aug 11, 2016 at 5:51 PM, Nathan Davis <
>> [email protected]> wrote:
>>
>>> Hi All,
>>> I executed a CREATE INDEX against a fairly large table. And I received a
>>> timeout error after a minute or two, which I understand is expected.
>>> `!tables` in sqlline shows the index is still in BUILDING state after 2
>>> hours, which may be accurate since it is a pretty large table and my
>>> cluster is just a smallish EMR throwaway.
>>>
>>> My question is: Is there some way to verify that the index is in fact
>>> still being built? Perhaps some HBase logs or the UI or some hbase shell
>>> command? Unfortunately I am just as new to HBase as I am to Phoenix itself.
>>>
>>> Thanks,
>>>  -nathan
>>>
>>
>>
>
