Hi Ted,
6 minutes is too long :(
Will this decrease to seconds if more nodes are added to the cluster?
I finally got this exception (I vaguely recall a timeout parameter that can be
increased for queries, but I didn't want to set it to a high value):
Apr 19, 2013 1:05:43 PM
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
processExecs
WARNING: Error executing for row
java.util.concurrent.ExecutionException:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829
remote=cldx-1140-1034/172.25.6.71:60020]
at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
at java.util.concurrent.FutureTask.get(Unknown Source)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processExecs(HConnectionManager.java:1475)
at
org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1236)
at
org.apache.hadoop.hbase.client.coprocessor.AggregationClient.rowCount(AggregationClient.java:216)
at client.hbase.HBaseCRUD.getTableCount(HBaseCRUD.java:307)
at client.hbase.HBaseCRUD.main(HBaseCRUD.java:117)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed
after attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829
remote=cldx-1140-1034/172.25.6.71:60020]
at
org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
at
org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
at $Proxy6.getRowNum(Unknown Source)
at
org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
at
org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Apr 19, 2013 1:05:43 PM
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
internalClose
INFO: Closed zookeeper sessionid=0x13e185b8ee8003a
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812
remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013,
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77,
java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020
failed on socket timeout exception: java.net.SocketTimeoutException: 60000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829
remote=cldx-1140-1034/172.25.6.71:60020]
at
org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
at
org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
at $Proxy6.getRowNum(Unknown Source)
at
org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
at
org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
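Reading the timestamps in the exception list above: the ten failures are spaced a little over 60 seconds apart, which is consistent with each attempt waiting out the 60000 ms socket timeout and then pausing briefly before the next retry. Below is a small sketch of that arithmetic. The class and method names are made up for illustration, the timestamps are copied from the log, and the assumption that the first attempt also took one full timeout is mine, not something the log states.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

public class RetryGaps {
    // Failure timestamps copied from the exception list above (IST, 19 Apr 2013).
    public static final String[] FAILURES = {
        "12:56:01", "12:57:02", "12:58:04", "12:59:05", "13:00:08",
        "13:01:10", "13:02:14", "13:03:19", "13:04:27", "13:05:43"
    };

    // Socket timeout reported in every failure message (60000 millis).
    public static final long TIMEOUT_SECONDS = 60;

    /** Seconds between consecutive failures. */
    public static List<Long> gapsSeconds(String[] times) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < times.length; i++) {
            LocalTime prev = LocalTime.parse(times[i - 1]);
            LocalTime cur = LocalTime.parse(times[i]);
            gaps.add(Duration.between(prev, cur).getSeconds());
        }
        return gaps;
    }

    /** Approximate seconds from the start of the first attempt to the last failure. */
    public static long totalSeconds(String[] times) {
        LocalTime first = LocalTime.parse(times[0]);
        LocalTime last = LocalTime.parse(times[times.length - 1]);
        // Assumption: the first attempt itself took one full timeout.
        return TIMEOUT_SECONDS + Duration.between(first, last).getSeconds();
    }

    public static void main(String[] args) {
        for (long gap : gapsSeconds(FAILURES)) {
            System.out.println("gap=" + gap + "s, pause beyond timeout="
                    + (gap - TIMEOUT_SECONDS) + "s");
        }
        System.out.println("total ~" + totalSeconds(FAILURES) + "s");
    }
}
```

Under that assumption the call spent about 642 seconds (roughly ten and a half minutes) before giving up, so raising the timeout would only help if a single pass over the region can actually finish within the new limit.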
Regards,
Omkar Joshi
-----Original Message-----
From: Ted Yu [mailto:[email protected]]
Sent: Friday, April 19, 2013 3:00 PM
To: [email protected]
Cc: [email protected]
Subject: Re: Speeding up the row count
Since there is only one region in your table, using the aggregation coprocessor
has no advantage.
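To illustrate the point about a single region: the aggregation coprocessor is invoked once per region, so the number of region-local counts that can run in parallel is bounded by the number of regions. The following is a toy model only, not HBase code. `RegionCountSketch`, `countRows` and `largestRegion` are hypothetical names, rows are assumed to be spread evenly across regions, and a plain loop stands in for the region-local scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RegionCountSketch {

    /** Rows held by the largest region when totalRows are spread evenly (assumption). */
    public static long largestRegion(long totalRows, int regionCount) {
        return (totalRows + regionCount - 1) / regionCount; // ceiling division
    }

    /** Counts rows with one task per region and sums the partial counts. */
    public static long countRows(long totalRows, int regionCount) {
        ExecutorService pool = Executors.newFixedThreadPool(regionCount);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int r = 0; r < regionCount; r++) {
                final long start = totalRows * r / regionCount;
                final long end = totalRows * (r + 1) / regionCount;
                tasks.add(() -> {
                    long n = 0;
                    for (long row = start; row < end; row++) {
                        n++; // stand-in for visiting one row in a region-local scan
                    }
                    return n;
                });
            }
            long total = 0;
            for (Future<Long> partial : pool.invokeAll(tasks)) {
                total += partial.get();
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long rows = 100_000;
        for (int regions : new int[] {1, 4}) {
            System.out.println(regions + " region(s): total=" + countRows(rows, regions)
                    + ", rows scanned by the slowest task=" + largestRegion(rows, regions));
        }
    }
}
```

With one region the slowest task has to walk every row, which is the same work a plain client-side scan would trigger; only after the table splits (or is pre-split) does the per-task share shrink.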
I think there may be some issue with your cluster - row count should finish
within 6 minutes.
Have you checked the server logs?
Thanks
On Apr 19, 2013, at 12:33 AM, Omkar Joshi <[email protected]> wrote:
> Hi,
>
> I have a 2-node (VM) Hadoop cluster, on top of which HBase is running in
> distributed mode.
>
> I have a table named ORDERS with more than 100000 rows.
>
> NOTE : Since my cluster is ultra-small, I didn't pre-split the table.
>
> ORDERS
> rowkey : ORDER_ID
>
> column family : ORDER_DETAILS
> columns : CUSTOMER_ID
> PRODUCT_ID
> REQUEST_DATE
> PRODUCT_QUANTITY
> PRICE
> PAYMENT_MODE
>
> The java client code to simply check the count of the records is :
>
> public long getTableCount(String tableName, String columnFamilyName) {
>
>     AggregationClient aggregationClient = new AggregationClient(config);
>     Scan scan = new Scan();
>     scan.addFamily(Bytes.toBytes(columnFamilyName));
>     scan.setFilter(new FirstKeyOnlyFilter());
>
>     long rowCount = 0;
>
>     try {
>         rowCount = aggregationClient.rowCount(Bytes.toBytes(tableName),
>                 null, scan);
>         System.out.println("No. of rows in " + tableName + " is "
>                 + rowCount);
>     } catch (Throwable e) {
>         // TODO Auto-generated catch block
>         e.printStackTrace();
>     }
>
>     return rowCount;
> }
>
> It has been running for more than 6 minutes now :(
>
> What can I do to bring the execution time down to milliseconds (or at least
> a couple of seconds)?
>
> Regards,
> Omkar Joshi
>
>
> -----Original Message-----
> From: Vedad Kirlic [mailto:[email protected]]
> Sent: Thursday, April 18, 2013 12:22 AM
> To: [email protected]
> Subject: Re: Speeding up the row count
>
> Hi Omkar,
>
> If you are not interested in occurrences of a specific column (e.g. name,
> email ...) and just want the total number of rows (regardless of their
> content, i.e. columns), you should avoid adding any columns to the Scan. In
> that case the coprocessor implementation for AggregationClient will add a
> FirstKeyOnlyFilter to the Scan to avoid loading unnecessary columns, which
> should result in some speed-up.
>
> This is a similar approach to what the hbase shell 'count' implementation does,
> although the reduction in overhead is bigger in that case, since data transfer
> from region server to client (shell) is minimized. With the coprocessor, data
> does not leave the region server, so most of the improvement should come from
> avoiding loading of unnecessary files. I'm not sure how this will apply to
> your particular case, given that the data set per row seems to be rather small.
> Also, with AggregationClient you will benefit if/when your tables span
> multiple regions. Essentially, performance of this approach will 'degrade' as
> your table gets bigger, but only to the point where it splits, after which it
> should be pretty constant. With this in mind, and given your type of data,
> you might consider pre-splitting your tables.
>
> DISCLAIMER: this is mostly theoretical, since I'm not an expert in hbase
> internals :), so your best bet is to try it - I'm too lazy to verify the
> impact myself ;)
>
> Finally, if your case can tolerate eventual consistency between the counters
> and the actual number of rows, you can, as already suggested, have the
> RowCounter MapReduce job run every once in a while, write the counter(s) back
> to hbase, and read those when you need to obtain the number of rows.
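The periodic-counter idea in the paragraph above could look roughly like the sketch below. It is not HBase code: `CachedRowCount` is a hypothetical class, the `LongSupplier` stands in for whatever expensive job produces the count (for example a RowCounter run), and an in-memory `AtomicLong` stands in for the cell the counter would be written back to.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class CachedRowCount {
    // Hypothetical stand-in for the expensive count (e.g. a RowCounter job).
    private final LongSupplier expensiveCount;

    // Last stored count; -1 means "never refreshed".
    private final AtomicLong cached = new AtomicLong(-1);

    public CachedRowCount(LongSupplier expensiveCount) {
        this.expensiveCount = expensiveCount;
    }

    /** Runs the expensive count and stores the result; meant to be called periodically. */
    public void refresh() {
        cached.set(expensiveCount.getAsLong());
    }

    /** Cheap read; may lag behind the real row count between refreshes. */
    public long get() {
        return cached.get();
    }

    public static void main(String[] args) {
        long[] actualRows = {100_000};
        CachedRowCount counter = new CachedRowCount(() -> actualRows[0]);

        counter.refresh();
        actualRows[0] += 5; // rows added after the last refresh
        System.out.println("cached=" + counter.get() + ", actual=" + actualRows[0]);

        counter.refresh();
        System.out.println("cached=" + counter.get() + ", actual=" + actualRows[0]);
    }
}
```

In practice `refresh()` would be driven by a scheduler (cron, or a `ScheduledExecutorService`), and readers only ever call `get()`, so a read costs the same no matter how large the table is.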
>
> Regards,
> Vedad
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Speeding-up-the-row-count-tp4042378p4042415.html
> Sent from the HBase User mailing list archive at Nabble.com.
>