Yes, Ted is right: "Error 1102 (XCL02): Cannot get all table regions" is raised when Phoenix cannot get the locations of all of the table's regions. Assigning the offline region should help.
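
In case it helps with Ted's suggestion to search the master log, here is a rough Python sketch (my own illustration, not part of HBase or Phoenix) that pulls the encoded name out of a full region name and filters log lines for it. It assumes the usual region name layout of <table>,<start key>,<region id>.<encoded name>. with a 32-character hex encoded name. The function names and the sample region name are made up.

```python
import re

# Assumption: region names follow the usual HBase layout
#   <table>,<start key>,<region id>.<encoded name>.
# where the encoded name is a 32-character lowercase hex digest.
_ENCODED_SUFFIX = re.compile(r"\.([0-9a-f]{32})\.$")


def encoded_region_name(region_name):
    """Return the encoded name from a full region name, or None if absent."""
    match = _ENCODED_SUFFIX.search(region_name.strip())
    return match.group(1) if match else None


def grep_lines(lines, encoded_name):
    """Yield the log lines that mention the encoded region name."""
    for line in lines:
        if encoded_name in line:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Hypothetical region name, for illustration only.
    sample = "MY_TABLE,\\x00\\x01,1472480000000.0123456789abcdef0123456789abcdef."
    print(encoded_region_name(sample))
```

The value it returns is what would go into the second of Ted's commands, assign 'ENCODED_REGIONNAME', and grep_lines can be fed an open master log file to see what happened to that region.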

On Mon, Aug 29, 2016 at 10:22 PM, Ted Yu <[email protected]> wrote:

> I searched for "Cannot get all table regions" in the hbase repo - no hit.
> Seems to be a Phoenix error.
>
> Anyway, the cause could be the 1 offline region for this table.
> Can you retrieve the encoded region name and search for it in the master
> log?
>
> Feel free to pastebin snippets of master / region server logs if needed
> (with proper redaction).
>
> See if the following shell command works:
>
> hbase> assign 'REGIONNAME'
> hbase> assign 'ENCODED_REGIONNAME'
>
> Cheers
>
> On Mon, Aug 29, 2016 at 9:41 AM, Riesland, Zack <[email protected]>
> wrote:
>
>> Our cluster recently had some issues related to network outages*.
>>
>> When all the dust settled, HBase eventually "healed" itself, and almost
>> everything is back to working well, with a couple of exceptions.
>>
>> In particular, we have one table where almost every (Phoenix) query times
>> out - which was never the case before. It's very small compared to most of
>> our other tables at around 400 million rows.
>>
>> I have tried with a raw JDBC connection in Java code as well as with Aqua
>> Data Studio, both of which usually work fine.
>>
>> The specific failure is that after 15 minutes (the set timeout), I get a
>> one-line error that says: “Error 1102 (XCL02): Cannot get all table regions”
>>
>> When I look at the GUI tools (like
>> http://<my server>:16010/master-status#storeStats) it shows '1' under
>> "offline regions" for that table (it has 33 total regions). Almost all the
>> other tables show '0'.
>>
>> Can anyone help me troubleshoot this?
>>
>> Are there Phoenix tables I can clear out that may be confused?
>>
>> This isn’t an issue with the schema or skew or anything. The same table
>> with the same data was lightning fast before these HBase issues.
>>
>> I know there is a CLI tool for fixing HBase issues. I'm wondering whether
>> that "offline region" is the cause of these timeouts.
>>
>> If not, how can I figure it out?
>>
>> Thanks!
>>
>> * FWIW, what happened was that DNS stopped working for a while, so HBase
>> started referring to all the region servers by IP address, which somewhat
>> worked, until the region servers restarted. Then they were hosed until a
>> bit of manual intervention.
