RE: How to count table rows from Java?

Riesland, Zack Mon, 29 Jun 2015 09:47:26 -0700

Perfect.

Thank you so much!


-----Original Message-----
From: James Taylor [mailto:[email protected]] 
Sent: Monday, June 29, 2015 11:51 AM
To: user
Subject: Re: How to count table rows from Java?

Server-side HBase properties such as hbase.rpc.timeout need to be set in your 
hbase-site.xml. There's no API to set these.

The phoenix.query.timeoutMs can be set per statement by calling
http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#setQueryTimeout(int)
prior to executing the query. Note that the units in this case are seconds, 
not milliseconds.
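
For the per-statement route, something along these lines should work. This is 
only a minimal sketch, assuming you already have an open Phoenix JDBC 
connection. The class name CountWithTimeout, the helper toTimeoutSeconds and 
the tableName parameter are made up for illustration; only the java.sql calls 
are standard JDK API.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class CountWithTimeout {
    // Statement.setQueryTimeout takes seconds, so convert from
    // milliseconds and round up to the next whole second.
    static int toTimeoutSeconds(long millis) {
        return (int) ((millis + 999) / 1000);
    }

    // Runs SELECT COUNT(*) with a per-statement timeout.
    // tableName is a hypothetical, trusted identifier (never user input).
    static long countRows(Connection conn, String tableName, int timeoutSeconds)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setQueryTimeout(timeoutSeconds);
            try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + tableName)) {
                if (!rs.next()) {
                    throw new SQLException("COUNT(*) returned no row");
                }
                return rs.getLong(1);
            }
        }
    }
}
```

The conversion helper is there only to make the seconds-versus-milliseconds 
difference explicit.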

The phoenix.query.timeoutMs can also be set at connection-time in the 
properties like this:

        Properties props = new Properties();
        props.setProperty("phoenix.query.timeoutMs", Long.toString(600000L));
        Connection conn = DriverManager.getConnection(url, props);

This would set the query timeout for any statement executed through the 
connection to 600000 milliseconds (10 minutes).
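
Put together for the original row-count question, the connection-level setting 
might be used roughly as follows. Again this is a sketch under assumptions: 
the class name ConnectionTimeout, the helpers timeoutProps and countRows, and 
the url and tableName arguments are hypothetical, and the Phoenix JDBC driver 
is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

class ConnectionTimeout {
    // Builds connection properties carrying the Phoenix query timeout.
    // Here the value is in milliseconds, unlike Statement.setQueryTimeout.
    static Properties timeoutProps(long timeoutMs) {
        Properties props = new Properties();
        props.setProperty("phoenix.query.timeoutMs", Long.toString(timeoutMs));
        return props;
    }

    // Opens a connection with the timeout applied and counts the rows.
    // url and tableName are placeholders supplied by the caller;
    // tableName must be a trusted identifier.
    static long countRows(String url, String tableName, long timeoutMs)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, timeoutProps(timeoutMs));
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + tableName)) {
            if (!rs.next()) {
                throw new SQLException("COUNT(*) returned no row");
            }
            return rs.getLong(1);
        }
    }
}
```

A nightly metrics job could then call 
ConnectionTimeout.countRows(url, "MY_TABLE", 600000L), where MY_TABLE stands 
in for your own table name.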

You can also set the phoenix.query.timeoutMs in your client-side 
hbase-site.xml and it'll be used as the query timeout for all connections.

Thanks,
James


On Mon, Jun 29, 2015 at 2:44 AM, Riesland, Zack <[email protected]> 
wrote:
> Thanks, James!
>
>
>
> Can you point me to some instructions or some syntax for setting those 
> timeout values in Java code?
>
>
>
> I’ve seen lots of information about setting them on the classpath when 
> launching different query clients. But how do I set a high timeout so 
> that I can do a large query via Java/JDBC?
>
>
>
> Thanks!
>
>
>
> From: James Taylor [mailto:[email protected]]
> Sent: Friday, June 26, 2015 2:14 PM
> To: [email protected]
> Subject: Re: How to count table rows from Java?
>
>
>
> Zack,
>
> I wouldn't at all say that doing a count(*) is not recommended. It's 
> important to know that 1) this requires a full table scan and 2) this 
> is done by Phoenix synchronously. You'll need to set the timeouts 
> high enough for this to complete. Phoenix will be much faster than 
> running an MR job, but the MR job runs asynchronously.
>
>
>
> To ensure your stats are up to date, run a major compaction on your 
> table as that updates the stats. Also, if you have wide rows, consider 
> using multiple column families so that the count(*) doesn't have to 
> traverse all of your data.
>
>
>
> Thanks,
>
> James
>
>
> On Friday, June 26, 2015, Nick Dimiduk <[email protected]> wrote:
>
> RowCounter is a MapReduce program. After the program completes 
> execution of the job, it returns information about that job, including job 
> counters.
> RowCounter includes its counts in the job counters, so they're easily 
> accessed programmatically from the returned object. It's not a 
> ResultSet, but it should work nonetheless.
>
> On Friday, June 26, 2015, Riesland, Zack <[email protected]> wrote:
>
> I wrote a Java program that runs nightly and collects metrics about 
> our Hive tables.
>
>
>
> I would like to include HBase tables in this as well.
>
>
>
> Since select count(*) is slow and not recommended on Phoenix, what are 
> my alternatives from Java?
>
>
>
> Is there a way to call org.apache.hadoop.hbase.mapreduce.RowCounter 
> from Java and get results in some kind of result set?
>
>
>
> Thanks for any info!
>
>
>
>
