[jira] [Commented] (PHOENIX-998) SocketTimeoutException under high concurrent write access to phoenix indexed table
[ https://issues.apache.org/jira/browse/PHOENIX-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072996#comment-14072996 ] Vikas Vishwakarma commented on PHOENIX-998: --- So this is my complete analysis: In the scanner, I was setting the cache size to 6400 rows that comes to ~1.5 MB and I was just iterating over the results once as given in the code snapshot above and discarding them since the goal of my loader client was to create high scan load and I don't really care about the results. With hbase-0.94 build this was working fine which means it was able to complete within 60 seconds but with hbase-0.98 some of the scans are probably taking more than 60 seconds leading to lease expiry and resultant failures. I reduced the cache size to 3200 from 6400 and there were fewer failures and then I reduced it to 1/4th i.e 1600 rows (~0.4 - 0.5 MB) and this passed with 0 failures SocketTimeoutException under high concurrent write access to phoenix indexed table -- Key: PHOENIX-998 URL: https://issues.apache.org/jira/browse/PHOENIX-998 Project: Phoenix Issue Type: Bug Affects Versions: 4.0.0 Environment: HBase 0.98.1-SNAPSHOT, Hadoop 2.3.0-cdh5.0.0 Reporter: wangxianbin Priority: Critical we have a small hbase cluster, which has one master, six slaves, we test phoenix index concurrent write access performance with four write clients, each client has 100 threads, each thread has one phoenix jdbc connection, and we encounter SocketTimeoutException as follow, and it will retry for very long time, how can i deal with such issue? 2014-05-22 17:22:58,490 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t10] client.AsyncProcess: #16016, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,436 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t6] client.AsyncProcess: #16027, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,440 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t1] client.AsyncProcess: #16013, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,449 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t7] client.AsyncProcess: #16028, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,473 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t8] client.AsyncProcess: #16020, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,494 INFO [htable-pool20-t13] client.AsyncProcess: #16016, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:32 CST 2014, retrying after 20189 ms, replay 1 ops. 2014-05-22 17:23:02,439 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t4] client.AsyncProcess: #16022, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:02,496 INFO [htable-pool20-t3] client.AsyncProcess: #16013, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:32 CST 2014, retrying after 20001 ms, replay 1 ops. 2014-05-22 17:23:02,496 INFO [htable-pool20-t16] client.AsyncProcess: #16028, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch :
[jira] [Commented] (PHOENIX-998) SocketTimeoutException under high concurrent write access to phoenix indexed table
[ https://issues.apache.org/jira/browse/PHOENIX-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14069897#comment-14069897 ] Vikas Vishwakarma commented on PHOENIX-998: --- The client does scan all the records: ResultScanner scanner = table.getScanner(scan); IteratorResult iterator = scanner.iterator(); while (iterator.hasNext()) { Result next = iterator.next(); next.getRow(); next.getValue(Bytes.toBytes(historyColumnFamily), Bytes.toBytes(historyColumnQualifier)); scancounter++; } Also I don't see this issue when running the same against hbase-0.94 build. From service logs in DEBUG mode I am not getting any more info on this. I will try to put some trace logs in the client and check further. SocketTimeoutException under high concurrent write access to phoenix indexed table -- Key: PHOENIX-998 URL: https://issues.apache.org/jira/browse/PHOENIX-998 Project: Phoenix Issue Type: Bug Affects Versions: 4.0.0 Environment: HBase 0.98.1-SNAPSHOT, Hadoop 2.3.0-cdh5.0.0 Reporter: wangxianbin Priority: Critical we have a small hbase cluster, which has one master, six slaves, we test phoenix index concurrent write access performance with four write clients, each client has 100 threads, each thread has one phoenix jdbc connection, and we encounter SocketTimeoutException as follow, and it will retry for very long time, how can i deal with such issue? 2014-05-22 17:22:58,490 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t10] client.AsyncProcess: #16016, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,436 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t6] client.AsyncProcess: #16027, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,440 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t1] client.AsyncProcess: #16013, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,449 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t7] client.AsyncProcess: #16028, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,473 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t8] client.AsyncProcess: #16020, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,494 INFO [htable-pool20-t13] client.AsyncProcess: #16016, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:32 CST 2014, retrying after 20189 ms, replay 1 ops. 2014-05-22 17:23:02,439 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t4] client.AsyncProcess: #16022, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:02,496 INFO [htable-pool20-t3] client.AsyncProcess: #16013, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:32 CST 2014, retrying after 20001 ms, replay 1 ops. 2014-05-22 17:23:02,496 INFO [htable-pool20-t16] client.AsyncProcess: #16028, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017
[jira] [Commented] (PHOENIX-998) SocketTimeoutException under high concurrent write access to phoenix indexed table
[ https://issues.apache.org/jira/browse/PHOENIX-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068678#comment-14068678 ] Andrew Purtell commented on PHOENIX-998: The lease timeouts will be legitimate if the scanner doesn't iterate for that length of time. SocketTimeoutException under high concurrent write access to phoenix indexed table -- Key: PHOENIX-998 URL: https://issues.apache.org/jira/browse/PHOENIX-998 Project: Phoenix Issue Type: Bug Affects Versions: 4.0.0 Environment: HBase 0.98.1-SNAPSHOT, Hadoop 2.3.0-cdh5.0.0 Reporter: wangxianbin Priority: Critical we have a small hbase cluster, which has one master, six slaves, we test phoenix index concurrent write access performance with four write clients, each client has 100 threads, each thread has one phoenix jdbc connection, and we encounter SocketTimeoutException as follow, and it will retry for very long time, how can i deal with such issue? 2014-05-22 17:22:58,490 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t10] client.AsyncProcess: #16016, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,436 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t6] client.AsyncProcess: #16027, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,440 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t1] client.AsyncProcess: #16013, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,449 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t7] client.AsyncProcess: #16028, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,473 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t8] client.AsyncProcess: #16020, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,494 INFO [htable-pool20-t13] client.AsyncProcess: #16016, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:32 CST 2014, retrying after 20189 ms, replay 1 ops. 2014-05-22 17:23:02,439 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t4] client.AsyncProcess: #16022, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:02,496 INFO [htable-pool20-t3] client.AsyncProcess: #16013, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:32 CST 2014, retrying after 20001 ms, replay 1 ops. 2014-05-22 17:23:02,496 INFO [htable-pool20-t16] client.AsyncProcess: #16028, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:37 CST 2014, retrying after 20095 ms, replay 1 ops. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PHOENIX-998) SocketTimeoutException under high concurrent write access to phoenix indexed table
[ https://issues.apache.org/jira/browse/PHOENIX-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14067447#comment-14067447 ] Vikas Vishwakarma commented on PHOENIX-998: --- I was able to reproduce and identify this issue. It is related to LeaseExpiry in RegionServers. You will see something like this in the RegionServer logs when this issue occurs. There is some refactoring around leaseExpiry in hbase-0.98 and may have introduced some race condition (refer https://issues.apache.org/jira/browse/HBASE-8449) 2014-07-19 05:20:03,696 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 120 lease expired on region PB4_1,EXCHG71:TOPIC79:,1405745722171.631771d92412b9744e342aeffa61b880. 2014-07-19 05:20:03,907 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 117 lease expired on region PB4_1,EXCHG41:TOPIC79:,1405745722171.dd30b2115b87f2c1f8aadcefa52d28ee. 2014-07-19 05:20:04,008 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 122 lease expired on region ... 2014-07-19 05:20:04,008 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 119 lease expired on region PB4_1,EXCHG53:TOPIC79:,1405745722171.a55923b2b74ed6a61d4cad2c6afb4dd0. 2014-07-19 05:20:21,420 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 146 lease expired on region PB4_1,EXCHG27:TOPIC79:,1405745722170.89dcec9f29c6c36664dcca0b51bb477a. 2014-07-19 05:20:21,420 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 145 lease expired on region PB4_1,EXCHG25:TOPIC79:,1405745722170.cda36ed3c880f0cdc95b0729306fdd56. SocketTimeoutException under high concurrent write access to phoenix indexed table -- Key: PHOENIX-998 URL: https://issues.apache.org/jira/browse/PHOENIX-998 Project: Phoenix Issue Type: Bug Affects Versions: 4.0.0 Environment: HBase 0.98.1-SNAPSHOT, Hadoop 2.3.0-cdh5.0.0 Reporter: wangxianbin Priority: Critical we have a small hbase cluster, which has one master, six slaves, we test phoenix index concurrent write access performance with four write clients, each client has 100 threads, each thread has one phoenix jdbc connection, and we encounter SocketTimeoutException as follow, and it will retry for very long time, how can i deal with such issue? 2014-05-22 17:22:58,490 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t10] client.AsyncProcess: #16016, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,436 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t6] client.AsyncProcess: #16027, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,440 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t1] client.AsyncProcess: #16013, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,449 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t7] client.AsyncProcess: #16028, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,473 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t8] client.AsyncProcess: #16020, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:00,494 INFO [htable-pool20-t13] client.AsyncProcess: #16016, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:32 CST 2014, retrying after 20189 ms, replay 1 ops. 2014-05-22 17:23:02,439 INFO [storm4.org,60020,1400750242045-index-writer--pool3-t4] client.AsyncProcess: #16022, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M 2014-05-22 17:23:02,496 INFO [htable-pool20-t3] client.AsyncProcess: #16013, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to