Try setting hbase.regionserver.lease.period on the region servers to a higher value like 6000000.
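As a minimal sketch of one way to apply that suggestion (assuming you edit hbase-site.xml on each region server and restart; the value is just the example figure above, not a recommendation):

    <!-- hbase-site.xml on each region server, inside <configuration>;
         illustrative value taken from the suggestion above -->
    <property>
      <name>hbase.regionserver.lease.period</name>
      <value>6000000</value>
    </property>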
Also, PHOENIX-2357 (currently under review) will make this unnecessary, and James was spot on about drop column once we do the column name indirection (PHOENIX-1598 and PHOENIX-1940), though you'd likely still want to put delete markers on each column cell; that could be done asynchronously.

Thanks,
James

On Wednesday, November 11, 2015, James Heather <[email protected]> wrote:

> I don't know what the answer is to your question, but I have hit this
> before.
>
> It seems that adding a column is a lazy operation that only changes the
> metadata, so it returns almost immediately; dropping a column is not. In
> fact, if you add a column and then immediately drop it, the drop takes
> ages, presumably because Phoenix has to check each row to see if there's
> anything it needs to remove.
>
> I don't know whether it would be possible to implement a lazy drop, so
> that the data isn't really removed from a row until that row is accessed.
> Obviously some care would be needed if a column were added with the same
> name before the previous one had been completely removed.
>
> I suspect this will be much improved if the Phoenix crew manage to
> implement the level of indirection that @JT mentioned for column names.
> The columns in HBase would then have uniquely generated names, and the
> Phoenix names would be mapped onto these HBase names. Lazy dropping would
> be easier in that world, because a column could never be accessed after
> it had been dropped, and any write to the column could be set to remove
> data from columns that no longer exist.
>
> James
>
> On 11/11/15 15:08, Lukáš Lalinský wrote:
>
>> When running "ALTER TABLE xxx DROP COLUMN yyy" on a table with about
>> 6M rows (which I considered small enough), it always times out, and I
>> can't see how to get it to execute successfully even once.
>>
>> I was getting some internal Phoenix timeouts, but after setting the
>> following properties, that changed:
>>
>> hbase.client.scanner.timeout.period=6000000
>> phoenix.query.timeoutMs=6000000
>> hbase.rpc.timeout=6000000
>>
>> Now it fails with errors like this:
>>
>> Wed Nov 11 13:44:25 UTC 2015,
>> RpcRetryingCaller{globalStartTime=1447246894248, pause=100, retries=35},
>> java.io.IOException: Call to XXX/XXX:16020 failed on local exception:
>> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1303,
>> waitTime=60001, operationTimeout=60000 expired.
>> Wed Nov 11 13:45:45 UTC 2015,
>> RpcRetryingCaller{globalStartTime=1447246894248, pause=100, retries=35},
>> java.io.IOException: Call to XXX/XXX:16020 failed on local exception:
>> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1341,
>> waitTime=60001, operationTimeout=60000 expired.
>>
>> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
>> at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
>> ... 3 more
>> Caused by: java.io.IOException: Call to XXX/XXX:16020 failed on local
>> exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call
>> id=1341, waitTime=60001, operationTimeout=60000 expired.
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1232)
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200)
>> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
>> at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
>> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
>> at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:372)
>> at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:199)
>> at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
>> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:369)
>> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:343)
>> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>> ... 4 more
>> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call
>> id=1341, waitTime=60001, operationTimeout=60000 expired.
>> at org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:70)
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1174)
>> ... 14 more
>>
>> While it's still running, I see log entries like this on the region servers:
>>
>> 2015-11-11 15:00:49,059 WARN [B.defaultRpcServer.handler=9,queue=0,port=16020] coprocessor.UngroupedAggregateRegionObserver: Committing bactch of 1000 mutations for MEDIA
>> 2015-11-11 15:00:49,259 WARN [B.defaultRpcServer.handler=12,queue=0,port=16020] coprocessor.UngroupedAggregateRegionObserver: Committing bactch of 1000 mutations for MEDIA
>> 2015-11-11 15:00:49,537 WARN [B.defaultRpcServer.handler=9,queue=0,port=16020] coprocessor.UngroupedAggregateRegionObserver: Committing bactch of 1000 mutations for MEDIA
>> 2015-11-11 15:00:49,766 WARN [B.defaultRpcServer.handler=12,queue=0,port=16020] coprocessor.UngroupedAggregateRegionObserver: Committing bactch of 1000 mutations for MEDIA
>> 2015-11-11 15:00:49,960 WARN [B.defaultRpcServer.handler=9,queue=0,port=16020] coprocessor.UngroupedAggregateRegionObserver: Committing bactch of 1000 mutations for MEDIA
>> 2015-11-11 15:00:50,212 WARN [B.defaultRpcServer.handler=12,queue=0,port=16020] coprocessor.UngroupedAggregateRegionObserver: Committing bactch of 1000 mutations for MEDIA
>>
>> Any ideas how to solve this? I'd even be fine with just having a way
>> to remove the column from the Phoenix metadata and keep the values in
>> HBase, but I don't see how to do that except by running DROP COLUMN and
>> waiting for it to time out.
>>
>> Lukas
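For reference, a minimal sketch of where the client-side timeout properties quoted above would typically live, assuming an hbase-site.xml on the Phoenix client's classpath (the values are the ones from the thread, not recommendations):

    <!-- hbase-site.xml on the Phoenix/JDBC client classpath, inside <configuration>;
         values copied verbatim from the quoted message above -->
    <property>
      <name>phoenix.query.timeoutMs</name>
      <value>6000000</value>
    </property>
    <property>
      <name>hbase.client.scanner.timeout.period</name>
      <value>6000000</value>
    </property>
    <property>
      <name>hbase.rpc.timeout</name>
      <value>6000000</value>
    </property>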
