In your IndexTool invocation, try use all caps for your table and index name. Phoenix normalizes names by upper casing them (unless they're in double quotes).
One other unrelated question: did you declare your event table with IMMUTABLE_ROWS=true (assuming it's a write-once table)? If not, you can use the ALTER TABLE trans.event SET IMMUTABLE_ROWS=true command to change it. This will give you some performance benefit. Thanks, James On Fri, Aug 12, 2016 at 7:39 AM, Nathan Davis <[email protected]> wrote: > Thanks for the detailed info. I took the advice of using the ASYNC method. > The CREATE statement executes fine and I end up with an index table showing > in state BUILDING. When I kick off the MR job with `hbase > org.apache.phoenix.mapreduce.index.IndexTool --schema trans --data-table > event --index-table event_object_id_idx_b --output-path > EVENT_OBJECT_ID_IDX_B_HFILES` I get this odd error: > > 2016-08-12 14:29:40,073 ERROR [main] index.IndexTool: An exception >> occured while performing the indexing job : >> java.lang.IllegalArgumentException: >> trans.event_object_id_idx_b is not an index table for trans.event >> at org.apache.phoenix.mapreduce.index.IndexTool.run(IndexTool.java:187) >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) >> at org.apache.phoenix.mapreduce.index.IndexTool.main(IndexTool.java:378) > > > > > My CREATE INDEX was as follows: > > create index if not exists event_object_id_idx_b on trans.event ( >> object_id >> ) ASYNC UPDATE_CACHE_FREQUENCY=60000; > > > > On Thu, Aug 11, 2016 at 9:40 PM, James Taylor <[email protected]> > wrote: > >> Hi Nathan, >> If your table is large, I'd recommend creating your index asynchronously. >> To do that, you'd add the ASYNC keyword to the end of your CREATE INDEX >> call. In this case, the index will be built through Map Reduce in a more >> resilient manner (i.e. the client going up or down won't impact it and you >> have the regular retries of a MR job). On AWS, you'll need to manually >> start the MR job, but at SFDC we have a CRON job that'll start it for you >> automatically (this is open source too, so it'd be good to get that up and >> running in AWS as well). See https://phoenix.apache.org >> /secondary_indexing.html#Index_Population for details. >> >> If you don't run it asynchronously, then you'll need to increase the >> query timeout (i.e. phoenix.query.timeoutMs config property) to be larger >> than the time it'll take to build the index. If the client goes down before >> the CREATE INDEX call finished (or the query times out), then the index >> build will stop (and unfortunately will need to be run again). >> >> To monitor the index build, there are a few ways - if running through MR, >> then you can monitor the MR job in the standard way. If running >> synchronously, you can monitor the index table - you'll see new regions >> created as splits occur, or you could query the SYSTEM.STATS table (which >> gets populated as splits and compactions happen), or you could run a >> count(*) query directly against the index table (though that'll put more >> load on your system because it'll require a full table scan). >> >> HTH. Thanks, >> James >> >> On Thu, Aug 11, 2016 at 5:51 PM, Nathan Davis < >> [email protected]> wrote: >> >>> Hi All, >>> I executed a CREATE INDEX against a fairly large table. And I received a >>> timeout error after a minute or two, which I understand is expected. >>> `!tables` in sqlline shows the index is still in BUILDING state after 2 >>> hours, which may be accurate since it is a pretty large table and my >>> cluster is just a smallish EMR throwaway. >>> >>> My question is: Is there some way to verify that the index is in fact >>> still being built? Perhaps some HBase logs or the UI or some hbase shell >>> command? Unfortunately I am just as new to HBase as I am to Phoenix itself. >>> >>> Thanks, >>> -nathan >>> >> >> >
