[jira] [Commented] (HBASE-8771) ensure replication_scope's value is either local(0) or global(1)
[ https://issues.apache.org/jira/browse/HBASE-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688904#comment-13688904 ] Chris Trezzo commented on HBASE-8771:

[~nidmhbase] forgive me if I am being slow, but I am still not quite sure I understand how it will still work. The setScope method is called in the HColumnDescriptor constructor, so any time you try to get a column descriptor for a column whose replication scope is 2, this seems like it will fail. Check out the REST interface for example: if I try to get an HTableDescriptor for an existing table that has a column with replication scope 2, the getTableDescriptor method will blow up with an IllegalArgumentException. Does that make sense, or am I missing something?

ensure replication_scope's value is either local(0) or global(1)

Key: HBASE-8771 URL: https://issues.apache.org/jira/browse/HBASE-8771 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.8 Reporter: Demai Ni Priority: Minor Fix For: 0.94.9 Attachments: HBASE-8771-0.94.8-v0.patch

For replication_scope, only two values are meaningful:
{code}
public static final int REPLICATION_SCOPE_LOCAL = 0;
public static final int REPLICATION_SCOPE_GLOBAL = 1;
{code}
However, there is no checking for that, so currently a user can set it to any integer value, and all non-zero values will be treated as 1 (GLOBAL). This jira is to add a check in HColumnDescriptor#setScope() so that only 0 and 1 will be accepted during create_table or alter_table. In the future, we can leverage replication_scope to store more info. For example: -1: a column family is replicated from another cluster in a MASTER_SLAVE setup (i.e. read-only); 2: a column family is set MASTER_MASTER. A major improvement JIRA is probably needed for that future usage. It will be good to ensure the scope value at this moment.
{code:title=Testing|borderStyle=solid}
hbase(main):002:0> create 't1_dn', {NAME => 'cf1', REPLICATION_SCOPE => 2}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
hbase(main):004:0> alter 't1_dn', {NAME => 'cf1', REPLICATION_SCOPE => -1}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
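The validation the patch proposes can be sketched as below. The class and method names here are illustrative, not the committed HBASE-8771 patch; only the two constants and the error message come from the issue text.

```java
// Hypothetical standalone sketch of the check added to HColumnDescriptor#setScope.
public class ScopeCheck {
    public static final int REPLICATION_SCOPE_LOCAL = 0;
    public static final int REPLICATION_SCOPE_GLOBAL = 1;

    // Reject any value other than 0 (local) or 1 (global).
    public static int checkScope(int scope) {
        if (scope != REPLICATION_SCOPE_LOCAL && scope != REPLICATION_SCOPE_GLOBAL) {
            throw new IllegalArgumentException(
                "Replication Scope must be either 0(local) or 1(global)");
        }
        return scope;
    }
}
```

This rejects exactly the shell inputs shown in the test transcript above (scope 2 and -1), which also illustrates Chris's concern: any code path that constructs a descriptor for a pre-existing column with scope 2 would now throw.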
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688933#comment-13688933 ] rajeshbabu commented on HBASE-8667:

[~stack] https://issues.apache.org/jira/secure/attachment/12587780/HBASE-8667_trunk.patch is the latest patch I have tested. I think you are reviewing https://issues.apache.org/jira/secure/attachment/12587092/HBASE-8667_Trunk-V2.patch. Sorry for the patch name, Stack; it should be something like HBASE-8667_trunk_v3.patch.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch

While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. I have two interfaces, 1) lo and 2) eth0, on my machine, and the default hostname interface is lo. I configured the master ipc address to the ip of the eth0 interface and started the master and regionserver on the same machine. 1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo. 2) Since the rpc client is not binding to any ip address, when the RS reports for duty at startup it gets registered with the eth0 ip address (but it should actually register localhost). Here are the RS logs: {code} 2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008 2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851 2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100 {code} Here are master logs: {code} 2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544 {code} Since master has wrong rpc server address of RS, META is not getting assigned. 
{code} 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false - org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627) at
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688936#comment-13688936 ] Feng Honghua commented on HBASE-8755:

[~zjushch] We ran the same tests as yours, and below are the results: 1). One YCSB client with 5/50/200 write threads respectively 2). One RS with 300 RPC handlers, 20 regions (5 data-nodes back-end HDFS running CDH 4.1.1) 3). row-size = 150 bytes

||threads ||row-count ||new-throughput ||new-latency ||old-throughput ||old-latency||
|5 |20 |3191 |1.551(ms) |3172 |1.561(ms)|
|50 |200 |23215 |2.131(ms) |7437 |6.693(ms)|
|200 |200 |35793 |5.450(ms) |10816 |18.312(ms)|

A). the difference is negligible with 5 YCSB client threads B). the new model still shows a 3X+ improvement over the old model with 50/200 threads. Can anybody else help run similar tests using the same test configuration as Chunhui?

A new write thread model for HLog to improve the overall HBase write throughput

Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch

In the current write model, each write handler thread (executing put()) will individually go through a full 'append (hlog local buffer) => HLog writer append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy contention on updateLock and flushLock. The only existing optimization, checking the current syncTillHere txid in the expectation that another thread has already written/synced its txid to hdfs so the write/sync can be omitted, actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X improvement in throughput (from 17000 to 7+).
I applied this new write thread model in HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row-size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput then was 31002). I can provide the detailed performance test results if anyone is interested. The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer (and notify the AsyncWriter thread that there are new edits in the local buffer)
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid
3. A single AsyncWriter thread is responsible for retrieving all the buffered edits from HLog's local pending buffer and writing them to hdfs (hlog.writer.append); it then notifies the AsyncFlusher thread that there are new writes to hdfs that need a sync
4. A single AsyncFlusher thread is responsible for issuing a sync to hdfs to persist the writes made by AsyncWriter (and notifying the AsyncNotifier thread that the sync watermark has increased)
5. A single AsyncNotifier thread is responsible for notifying all pending put handler threads waiting in the HLog.syncer() function
6. No LogSyncer thread any more (the AsyncWriter/AsyncFlusher threads already do the job it did)
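The handler/background-thread handoff described in steps 1-6 can be sketched as a small producer/consumer, collapsing the AsyncWriter/AsyncFlusher/AsyncNotifier trio into one background thread for brevity. All class and method names here are illustrative, not HBase's HLog code; the "write" and "sync" are stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: handlers append to a local buffer and block in an
// analogue of HLog.syncer(); one background thread drains the buffer
// ("append"), advances the synced watermark ("sync"), and wakes every
// waiter whose txid is covered ("notify").
public class MiniAsyncLog {
    private final List<String> pendingBuffer = new ArrayList<>();
    private long lastAssignedTxid = 0;
    private long syncedTxid = 0;

    public MiniAsyncLog() {
        Thread asyncWriter = new Thread(() -> {
            try {
                while (true) {
                    long txidToSync;
                    synchronized (this) {
                        while (pendingBuffer.isEmpty()) wait();
                        pendingBuffer.clear();          // stand-in for writer.append()
                        txidToSync = lastAssignedTxid;  // stand-in for writer.sync()
                        syncedTxid = txidToSync;        // advance the sync watermark
                        notifyAll();                    // the AsyncNotifier's job
                    }
                }
            } catch (InterruptedException e) { /* shut down */ }
        });
        asyncWriter.setDaemon(true);
        asyncWriter.start();
    }

    // Called by put handler threads: append an edit, then block until
    // the sync covering this edit's txid has completed.
    public void append(String edit) throws InterruptedException {
        long myTxid;
        synchronized (this) {
            pendingBuffer.add(edit);
            myTxid = ++lastAssignedTxid;
            notifyAll();                                // wake the writer thread
        }
        synchronized (this) {
            while (syncedTxid < myTxid) wait();         // HLog.syncer() analogue
        }
    }

    public synchronized long getSyncedTxid() { return syncedTxid; }
}
```

The key property the proposal relies on is visible even in this toy: many handlers can pile edits into the buffer while one sync is in flight, so the per-write lock contention of the old model becomes one batched write/sync per drain.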
[jira] [Commented] (HBASE-7667) Support stripe compaction
[ https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688945#comment-13688945 ] stack commented on HBASE-7667:

Rereading the design doc and how-to-use. They are very nice. Can go into the book. High-level, and I think you have suggested this yourself elsewhere, it'd be coolio if the user didn't have to choose between size and count -- if it'd just figure itself based off incoming load. I've seen a case where a compaction produces a zero-length file (all deletes), so would that mess w/ this invariant: "Compaction must produce at least one file (see HBASE-6059)." or "...No stripe can ever be left with 0 files..."? I almost asked a few questions you'd already answered above in my previous read of the doc (smile). How would region merge work? We'd just drop all files into L0? Sounds like we'd have to drop references if we are not to break snapshotting. You think this true? "stripe scheme uses a larger number of files than default to ensure all compactions are small, which can affect very wide scans." Any measure of how much? Should stripe be on by default? Or have it as experimental for now until we get more data? How-to-use doc is excellent (though too many configs). Will review patch again next.
Support stripe compaction - Key: HBASE-7667 URL: https://issues.apache.org/jira/browse/HBASE-7667 Project: HBase Issue Type: New Feature Components: Compaction Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: stripe-cdf.pdf, Stripe compaction perf evaluation.pdf, Stripe compaction perf evaluation.pdf, Stripe compaction perf evaluation.pdf, Stripe compactions.pdf, Stripe compactions.pdf, Stripe compactions.pdf, Stripe compactions.pdf, Using stripe compactions.pdf, Using stripe compactions.pdf, Using stripe compactions.pdf So I was thinking about having many regions as the way to make compactions more manageable, and writing the level db doc about how level db range overlap and data mixing breaks seqNum sorting, and discussing it with Jimmy, Matteo and Ted, and thinking about how to avoid Level DB I/O multiplication factor. And I suggest the following idea, let's call it stripe compactions. It's a mix between level db ideas and having many small regions. It allows us to have a subset of benefits of many regions (wrt reads and compactions) without many of the drawbacks (managing and current memstore/etc. limitation). It also doesn't break seqNum-based file sorting for any one key. It works like this. The region key space is separated into configurable number of fixed-boundary stripes (determined the first time we stripe the data, see below). All the data from memstores is written to normal files with all keys present (not striped), similar to L0 in LevelDb, or current files. Compaction policy does 3 types of compactions. First is L0 compaction, which takes all L0 files and breaks them down by stripe. It may be optimized by adding more small files from different stripes, but the main logical outcome is that there are no more L0 files and all data is striped. Second is exactly similar to current compaction, but compacting one single stripe. In future, nothing prevents us from applying compaction rules and compacting part of the stripe (e.g. 
similar to current policy with ratios and stuff, tiers, whatever), but for the first cut I'd argue let it major compact the entire stripe. Or just have the ratio and no more complexity. Finally, the third addresses the concern of the fixed boundaries causing stripes to be very unbalanced. It's exactly like the 2nd, except it takes 2+ adjacent stripes and writes the results out with different boundaries. There's a tradeoff here - if we always take 2 adjacent stripes, compactions will be smaller but rebalancing will take a ridiculous amount of I/O. If we take many stripes we are essentially getting into the epic-major-compaction problem again. Some heuristics will have to be in place. In general, if we initially let L0 grow before determining the stripes, we will get better boundaries. Also, unless unbalancing is really large we don't need to rebalance at all. Obviously this scheme (as well as level) is not applicable for all scenarios, e.g. if timestamp is your key it completely falls apart. The end result: - many small compactions that can be spread out in time. - reads still read from a small number of files (one stripe + L0). - region splits become marvelously simple (if we could move files between regions, no references would be needed). Main advantage over Level (for HBase)
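The core mapping behind the scheme, fixed key boundaries partitioning the row space so each striped file belongs to exactly one stripe and a read touches one stripe plus L0, can be sketched as follows. This is an illustration of the idea only, not HBase's stripe compaction code; String keys stand in for byte[] row keys.

```java
// Illustrative stripe lookup: N sorted boundaries define N+1 stripes.
// Each boundary is an exclusive upper bound; keys >= the last boundary
// fall into the final, unbounded stripe.
public class StripeLookup {
    private final String[] boundaries; // sorted, exclusive upper bounds

    public StripeLookup(String[] boundaries) {
        this.boundaries = boundaries;
    }

    // Returns the index of the stripe containing the given row key.
    public int stripeFor(String row) {
        for (int i = 0; i < boundaries.length; i++) {
            if (row.compareTo(boundaries[i]) < 0) return i;
        }
        return boundaries.length; // last stripe
    }
}
```

Because the boundaries are fixed once chosen, the three compaction types in the description map cleanly onto this: L0 compaction routes every key through stripeFor into its stripe, per-stripe compaction works within one index, and rebalancing rewrites the boundaries array for 2+ adjacent indices.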
[jira] [Commented] (HBASE-8701) distributedLogReplay need to apply wal edits in the receiving order of those edits
[ https://issues.apache.org/jira/browse/HBASE-8701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688954#comment-13688954 ] Jeffrey Zhong commented on HBASE-8701:

Thanks [~saint@gmail.com] for the comments. {quote} How will compactions deal with the -ve sequenceid {quote} The sequence ids of the hfile are intact, as before. {quote} Sometimes its a boolean and other times its a ts? {quote} decodeMemstoreTS is a boolean. It is used to tell the hfile reader whether to decode the memstoreTS (mvcc) number. There is an existing optimization to skip mvcc number decoding by using the following logic; since we use negative mvcc, the optimization may skip decoding mvcc numbers from an hfile.
{code}
Bytes.toLong(fileInfo.get(HFileWriterV2.MAX_MEMSTORE_TS_KEY)) > 0;
{code}
{quote} Regards 200M. {quote} This part will be updated later by 8741. I left the code there to let one of my new test cases pass, where we test a same-version update coming during recovery. {quote} Is that safe presumption to make in replay? Is this the least sequenceid of the batch? Again, what is the difference between these two sequenceids? Do we have to add it to WALEdit at all? {quote} I think we may not need the origSequenceNumber because mvcc is part of the KV and should already be written into the WAL. Let me try to see if I can cut the origSequenceNumber. {quote} Is this 'if it is present'? {quote} Yes. {quote} We only do this stuff for Puts and Deletes? Don't we have other types out in the WAL? {quote} Only puts and deletes are used for recovery purposes in the WAL.
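The skip-decoding optimization being discussed can be sketched as below. This is an illustration of the logic, not the HFile reader source; the method name is hypothetical, and the point of the comment is that negative mvcc values interact with this check.

```java
// Sketch: mvcc decoding is skipped when the file's recorded max
// memstore TS says no entry carries a positive one. If replayed edits
// store a negated sequence number in the mvcc slot, a file containing
// only such edits has max <= 0, so the reader would skip decoding them.
public class DecodeFlag {
    static boolean shouldDecodeMemstoreTS(long maxMemstoreTS) {
        return maxMemstoreTS > 0;
    }
}
```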
distributedLogReplay need to apply wal edits in the receiving order of those edits

Key: HBASE-8701 URL: https://issues.apache.org/jira/browse/HBASE-8701 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8701-v3.txt, hbase-8701-v4.patch, hbase-8701-v5.patch, hbase-8701-v6.patch, hbase-8701-v7.patch

This issue happens in distributedLogReplay mode when recovering multiple puts of the same key + version (timestamp). After replay, the value of the key is nondeterministic.

h5. The original concern situation, raised by [~eclark]: For all edits the rowkey is the same. There's a log with: [ A (ts = 0), B (ts = 0) ]. Replay the first half of the log. A user puts in C (ts = 0). Memstore has to flush. A new Hfile will be created with [ C, A ] and MaxSequenceId = C's seqid. Replay the rest of the log. Flush. The issue will happen in similar situations, like Put(key, t=T) in WAL1 and Put(key, t=T) in WAL2.

h5. Below is the option (proposed by Ted) I'd like to use: a) During replay, we pass the original wal sequence number of each edit to the receiving RS b) In the receiving RS, we store the negated original sequence number of wal edits in the mvcc field of the KVs of the wal edits c) Add handling of negative MVCC in KVScannerComparator and KVComparator d) In the receiving RS, write the original sequence number into an optional field of the wal file for the chained RS failure situation e) When opening a region, we add a safety bumper (a large number) so that the new sequence numbers of a newly opened region do not collide with old sequence numbers. In the future, when we store sequence numbers along with KVs, we can adjust the above solution a little by avoiding overloading the MVCC field.

h5. The other alternative options are listed below for reference: Option one a) disallow writes during recovery b) during replay, we pass original wal sequence ids c) hold flush till all wals of a recovering region are replayed.
Memstore should hold because we only recover unflushed wal edits. For edits with the same key + version, whichever has the larger sequence id wins. Option two a) During replay, we pass original wal sequence ids b) for each wal edit, we store the edit's original sequence id along with its key c) during scanning, we use the original sequence id if it's present, otherwise the store file's sequence id d) compaction can just keep the put with the max sequence id. Please let me know if you have better ideas.
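Steps (b) and (c) of the preferred option can be sketched as follows. This is a hedged illustration of the encoding idea only, not HBase's KVComparator: a negative mvcc value encodes a replayed edit's original wal sequence number, and comparisons must map it back before ordering. It assumes, per step (e), that fresh post-recovery sequence numbers are bumped above all old ones.

```java
// Illustrative handling of negative mvcc: -seqid encodes an original
// wal sequence number for a replayed edit; positive values are normal
// mvcc numbers. Ordering compares the effective (decoded) sequence.
public class MvccOrder {
    // Decode: negative values carry the negated original wal seqid.
    static long effectiveSeq(long mvcc) {
        return mvcc < 0 ? -mvcc : mvcc;
    }

    // Newest-first comparison between two entries for the same key:
    // negative result means 'a' is newer and sorts first.
    static int compareSeq(long a, long b) {
        return Long.compare(effectiveSeq(b), effectiveSeq(a));
    }
}
```

Under this encoding, two replayed edits for the same key + version keep their original receive order (larger original seqid wins), and a live write with a bumped positive sequence number sorts ahead of any replayed edit.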
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688953#comment-13688953 ] stack commented on HBASE-8667:

Whoops. My fault. Why not just pass this.isa rather than wrap it in a new InetSocketAddress (which will do a new resolve -- could do http://download.java.net/jdk7/archive/b123/docs/api/java/net/InetSocketAddress.html#createUnresolved(java.lang.String, int) I suppose)?
{code}
+rpcClient = new RpcClient(conf, clusterId, new InetSocketAddress(this.isa.getHostName(), 0));
{code}
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688957#comment-13688957 ] Feng Honghua commented on HBASE-8755:

[~zjushch]: We ran the same tests as yours, and below are the results: 1). One YCSB client with 5/50/200 write threads respectively 2). One RS with 300 RPC handlers, 20 regions (5 data-nodes back-end HDFS running CDH 4.1.1) 3). row-size = 150 bytes

||client-threads ||row-count ||new-model throughput ||new-model latency ||old-model throughput ||old-model latency||
|5 |20 |3191 |1.551(ms) |3172 |1.561(ms)|
|50 |200 |23215 |2.131(ms) |7437 |6.693(ms)|
|200 |200 |35793 |5.450(ms) |10816 |18.312(ms)|

A). the difference is negligible with 5 YCSB client threads B). the new model still shows a 3X+ improvement over the old model with 50/200 threads. Can anybody else help run the tests using the same configuration as Chunhui? Another guess is that the HDFS used by Chunhui has much better performance on HLog's write/sync, which makes the new model in HBase have less impact. Just a guess.
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688964#comment-13688964 ] rajeshbabu commented on HBASE-8667:

[~ram_krish] bq. So after this patch the RPC server and the rpc client on the RS connects using the same host? Yes Ram. If we don't pass a bind address in the connect call, presently it will pass null internally.
{code}
// connection time out is 20s
NetUtils.connect(this.socket, remoteId.getAddress(), getSocketTimeout(conf));
{code}
{code}
public static void connect(Socket socket, SocketAddress address, int timeout) throws IOException {
  connect(socket, address, null, timeout);
}
{code}
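The core idea of the fix, binding the client socket to a chosen local interface (port 0 = any free port) before connecting, so the remote side sees the connection arrive from that address, can be sketched with plain java.net. This is an illustration under that assumption, not the patch itself; the actual patch routes through Hadoop's NetUtils.connect(socket, remote, local, timeout) overload quoted above.

```java
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch: connect to 'remote' with the local end explicitly bound to
// 'localHost', instead of letting the OS pick the outgoing interface.
public class BoundClientSocket {
    public static Socket connectFrom(String localHost,
                                     InetSocketAddress remote,
                                     int timeoutMs) throws Exception {
        Socket socket = new Socket();
        socket.bind(new InetSocketAddress(localHost, 0)); // fix local interface
        socket.connect(remote, timeoutMs);
        return socket;
    }
}
```

In the bug scenario, the RS rpc client would bind to the same address its rpc server listens on (lo), so the master registers the RS under a reachable address rather than whichever interface the default route picked (eth0).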
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688962#comment-13688962 ] chunhui shen commented on HBASE-8755: - Regarding the above tests, let's try to find out why the old throughput is so low. Does your client run on the regionserver, or on a separate server? A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch In the current write model, each write handler thread (executing put()) individually goes through a full 'append (hlog local buffer) -> HLog writer append (write to hdfs) -> HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy contention on updateLock and flushLock. The only existing optimization, checking whether the current syncTillHere already covers a thread's txid (in the expectation that another thread has helped write/sync that txid to hdfs, so the write/sync can be omitted), actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X throughput improvement (from 17000 to 7+). I applied this new write thread model in HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput at the time was 31002). I can provide the detailed performance test results if anyone is interested.
The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer (and notify the AsyncWriter thread that there are new edits in the buffer);
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid;
3. A single AsyncWriter thread retrieves all the buffered edits from HLog's local pending buffer and writes them to hdfs (hlog.writer.append), then notifies the AsyncFlusher thread that there are new writes to hdfs that need a sync;
4. A single AsyncFlusher thread issues a sync to hdfs to persist the writes made by the AsyncWriter, then notifies the AsyncNotifier thread that the sync watermark has increased;
5. A single AsyncNotifier thread notifies all pending put handler threads waiting in the HLog.syncer() function;
6. There is no LogSyncer thread any more (the AsyncWriter/AsyncFlusher threads now do the same job).
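The six steps above amount to a staged producer/consumer pipeline. Below is a minimal self-contained sketch of the idea, with hypothetical class and field names and the AsyncWriter/AsyncFlusher stages collapsed into one background thread for brevity (the real implementation is the HBASE-8755 patch): handlers append under a lock, a single drainer thread batches and "syncs" the buffer, and waiters block until the sync watermark passes their txid.

```java
import java.util.ArrayList;
import java.util.List;

public class AsyncHLogSketch {
    private final List<String> pendingEdits = new ArrayList<>();
    private long nextTxid = 0;          // txid handed to each append
    private long syncedTillHere = -1;   // highest txid persisted so far

    // Step 1: put handler threads append to the local buffer and
    // wake the writer thread.
    public synchronized long append(String edit) {
        pendingEdits.add(edit);
        long txid = nextTxid++;
        notifyAll();
        return txid;
    }

    // Step 2: put handler threads block until the sync watermark
    // covers their txid.
    public synchronized void syncer(long txid) throws InterruptedException {
        while (syncedTillHere < txid) {
            wait();
        }
    }

    // Steps 3-5, collapsed: drain the buffer, "write+sync" it,
    // advance the watermark, notify all pending handlers.
    public void startWriter() {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    long highest;
                    List<String> batch;
                    synchronized (this) {
                        while (pendingEdits.isEmpty()) {
                            wait();
                        }
                        batch = new ArrayList<>(pendingEdits);
                        pendingEdits.clear();
                        highest = nextTxid - 1;
                    }
                    // hlog.writer.append(batch) + hdfs sync would happen here,
                    // outside the lock, so handlers can keep appending.
                    synchronized (this) {
                        syncedTillHere = highest;
                        notifyAll();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.setDaemon(true);
        writer.start();
    }

    public static void main(String[] args) throws Exception {
        AsyncHLogSketch log = new AsyncHLogSketch();
        log.startWriter();
        long txid = log.append("row1/cf1:edit");
        log.syncer(txid);  // returns once the batch containing txid is synced
        System.out.println("synced txid " + txid);
    }
}
```

Note how the append/sync work happens outside the buffer lock, which is the point of the model: handler threads never contend on the hdfs write path itself.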
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688967#comment-13688967 ] rajeshbabu commented on HBASE-8667: --- [~stack] bq. could do http://download.java.net/jdk7/archive/b123/docs/api/java/net/InetSocketAddress.html#createUnresolved(java.lang.String, int) I suppose)? This is good. I will change and update the patch. Thanks. Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch While testing the HBASE-8640 fix, I found that a master and a regionserver running on different interfaces do not communicate properly. I have two interfaces, 1) lo and 2) eth0, on my machine, and the default hostname resolves to the lo interface. I configured the master ipc address to the ip of the eth0 interface and started the master and regionserver on the same machine. 1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo. 2) Since the rpc client does not bind to any ip address, when the RS reports in at startup it gets registered with the eth0 ip address (but it should actually register as localhost). Here are the RS logs: {code} 2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008 2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851 2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100 {code} Here are master logs: {code} 2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544 {code} Since master has wrong rpc server address of RS, META is not getting assigned. 
{code} 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false - org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826) at
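For reference, the InetSocketAddress.createUnresolved approach discussed in the comment above differs from the resolving constructor in that it keeps the literal host string and performs no DNS lookup, which is why it preserves the address a peer reported instead of re-resolving it locally. A small stand-alone demo:

```java
import java.net.InetSocketAddress;

public class UnresolvedAddressDemo {
    public static void main(String[] args) {
        // Resolving constructor: performs the name lookup immediately.
        InetSocketAddress resolved = new InetSocketAddress("localhost", 60020);

        // createUnresolved: keeps the literal host string, no DNS lookup.
        InetSocketAddress unresolved =
            InetSocketAddress.createUnresolved("192.168.0.100", 60020);

        System.out.println("resolved? " + !resolved.isUnresolved());
        System.out.println("unresolved? " + unresolved.isUnresolved());
        System.out.println("host kept: " + unresolved.getHostString());
    }
}
```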
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688973#comment-13688973 ] Feng Honghua commented on HBASE-8755: - Our comparison tests differ only in the RS bits; everything else (client/HDFS/cluster/row-size...) remains the same. The client runs on a different machine from the RS; we don't run clients on the RS because almost all of our applications using HBase run on machines separate from the HBase cluster. Actually we have never seen a throughput as high as 18018/24691 for a single RS in our cluster. It's really weird :). A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch
[jira] [Commented] (HBASE-8759) Family Delete Markers not getting purged after major compaction
[ https://issues.apache.org/jira/browse/HBASE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688988#comment-13688988 ] Lars Hofhansl commented on HBASE-8759: -- NP. I should also be a bit more explicit about how family delete markers are actually deleted. The logic is this: each compaction registers the timestamp of the oldest put in the created hfile. The next (major) compaction then removes all family markers that are older than that oldest put. So as long as the client does not keep issuing puts, eventually all family markers drop out of the compacted files. Family Delete Markers not getting purged after major compaction --- Key: HBASE-8759 URL: https://issues.apache.org/jira/browse/HBASE-8759 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.94.7 Reporter: Mujtaba Chohan Priority: Minor On a table with VERSIONS = '1' and KEEP_DELETED_CELLS = 'true', Family Delete Markers do not get purged after a put/delete/major-compaction cycle (they keep accumulating after every put/delete/major-compaction cycle). Following is the raw scan output after 10 iterations of put/delete/major compaction:
ROW  COLUMN+CELL
A    column=CF:, timestamp=1371512706683, type=DeleteFamily
A    column=CF:, timestamp=1371512706394, type=DeleteFamily
A    column=CF:, timestamp=1371512706054, type=DeleteFamily
A    column=CF:, timestamp=1371512705763, type=DeleteFamily
A    column=CF:, timestamp=1371512705457, type=DeleteFamily
A    column=CF:, timestamp=1371512705149, type=DeleteFamily
A    column=CF:, timestamp=1371512704836, type=DeleteFamily
A    column=CF:, timestamp=1371512704518, type=DeleteFamily
A    column=CF:, timestamp=1371512704162, type=DeleteFamily
A    column=CF:, timestamp=1371512703779, type=DeleteFamily
A    column=CF:COL, timestamp=1371512706682, value=X
[~lhofhansl] Code to repro this issue: http://phoenix-bin.github.io/client/code/delete.java -- This message is automatically generated by JIRA.
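The purge rule Lars describes can be illustrated with a small sketch (hypothetical method names, not the actual compaction code): a family delete marker is dropped by the next major compaction if it is older than the oldest put timestamp the previous compaction recorded; otherwise it survives.

```java
import java.util.ArrayList;
import java.util.List;

public class FamilyMarkerPurgeSketch {
    // Returns the family delete marker timestamps that survive a major
    // compaction, given the oldest put timestamp recorded when the store
    // file was written (markers older than that put are purged).
    static List<Long> survivingMarkers(List<Long> markerTimestamps, long oldestPutTs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : markerTimestamps) {
            if (ts >= oldestPutTs) {  // older than the oldest put -> purged
                kept.add(ts);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> markers = List.of(100L, 200L, 300L);
        // The oldest put in the previously written hfile has timestamp 250,
        // so the two older markers drop out on the next major compaction.
        System.out.println(survivingMarkers(markers, 250L));
    }
}
```

This also shows why the reported bug keeps markers around: if every iteration writes a new put, the recorded "oldest put" never moves past the accumulated markers.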
[jira] [Created] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
jay wong created HBASE-8773: --- Summary: Can be setup the COMPRESSION base on HTable in meta or user set in Configuration Key: HBASE-8773 URL: https://issues.apache.org/jira/browse/HBASE-8773 Project: HBase Issue Type: New Feature Components: HFile Affects Versions: 0.94.8 Reporter: jay wong Fix For: 0.94.9 When I create HFiles with ImportTsv, I found that whether or not I set the compression in the Configuration, it is always ignored. That is because the method 'configureIncrementalLoad' in HFileOutputFormat sets the compression from the HTable in meta. So add a configuration switch to choose between the compression from the HTable and the one the user set in the Configuration.
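The switch the reporter proposes might look like the following sketch (the property names below are invented for illustration and a plain Map stands in for the Hadoop Configuration; the actual patch may differ): prefer the user's configured codec over the table's compression only when an explicit override flag is set.

```java
import java.util.HashMap;
import java.util.Map;

public class CompressionChoiceSketch {
    // Decide which compression codec an incremental-load job should use.
    // conf is a stand-in for the job Configuration; the two keys are
    // hypothetical names for this sketch.
    static String chooseCompression(Map<String, String> conf, String tableCompression) {
        if ("true".equals(conf.get("hbase.importtsv.compression.override"))) {
            // User opted in: honor the Configuration, falling back to the
            // table's setting if no codec was given.
            return conf.getOrDefault("hfile.compression", tableCompression);
        }
        // Default behavior: what configureIncrementalLoad reads from the
        // HTable descriptor in meta.
        return tableCompression;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.importtsv.compression.override", "true");
        conf.put("hfile.compression", "GZ");
        System.out.println(chooseCompression(conf, "NONE"));
        System.out.println(chooseCompression(new HashMap<>(), "LZO"));
    }
}
```

The explicit flag keeps today's behavior as the default, so existing jobs that rely on the table's compression are unaffected.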
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Labels: (was: patch)
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Labels: patch (was: ) Hadoop Flags: Reviewed Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Attachment: HBASE-8773.patch
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Hadoop Flags: (was: Reviewed)
[jira] [Updated] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu updated HBASE-8667: -- Attachment: HBASE-8667_trunk_v4.patch Patch addressing Stack's comments.
[jira] [Assigned] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu reassigned HBASE-8667: - Assignee: rajeshbabu
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689011#comment-13689011 ] Hadoop QA commented on HBASE-8667: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12588784/HBASE-8667_trunk_v4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6083//console This message is automatically generated.
[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
[ https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689013#comment-13689013 ] Wei Li commented on HBASE-7404: --- The default value of bucketCachePercentage is 0 currently, I suggest set it to hfile.block.cache.size if combinedWithLru is true. Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE -- Key: HBASE-7404 URL: https://issues.apache.org/jira/browse/HBASE-7404 Project: HBase Issue Type: New Feature Affects Versions: 0.94.3 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.95.0 Attachments: 7404-trunk-v10.patch, 7404-trunk-v11.patch, 7404-trunk-v12.patch, 7404-trunk-v13.patch, 7404-trunk-v13.txt, 7404-trunk-v14.patch, BucketCache.pdf, hbase-7404-94v2.patch, HBASE-7404-backport-0.94.patch, hbase-7404-trunkv2.patch, hbase-7404-trunkv9.patch, Introduction of Bucket Cache.pdf First, thanks @neil from Fusion-IO share the source code. Usage: 1.Use bucket cache as main memory cache, configured as the following: –hbase.bucketcache.ioengine heap –hbase.bucketcache.size 0.4 (size for bucket cache, 0.4 is a percentage of max heap size) 2.Use bucket cache as a secondary cache, configured as the following: –hbase.bucketcache.ioengine file:/disk1/hbase/cache.data(The file path where to store the block data) –hbase.bucketcache.size 1024 (size for bucket cache, unit is MB, so 1024 means 1GB) –hbase.bucketcache.combinedcache.enabled false (default value being true) See more configurations from org.apache.hadoop.hbase.io.hfile.CacheConfig and org.apache.hadoop.hbase.io.hfile.bucket.BucketCache What's Bucket Cache? 
It can greatly decrease CMS and heap fragmentation caused by GC, and it supports a large cache space for high read performance by using a high-speed disk like Fusion-io. 1. An implementation of block cache like LruBlockCache 2. Self-manages blocks' storage positions through the Bucket Allocator 3. The cached blocks can be stored in memory or on the file system 4. Bucket Cache can be used as the main block cache (see CombinedBlockCache), combined with LruBlockCache, to decrease CMS and fragmentation caused by GC 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io to store blocks) to enlarge the cache space How about SlabCache? We studied and tested SlabCache first, but the results were bad, because: 1. SlabCache uses SingleSizeCache, so its memory utilization is low because of the variety of block sizes, especially when using DataBlockEncoding 2. SlabCache is used in DoubleBlockCache; a block is cached both in SlabCache and LruBlockCache, and on a SlabCache hit the block is put into LruBlockCache again, so CMS and heap fragmentation don't get any better 3. Direct-memory performance is not as good as heap, and it may cause OOM, so we recommend using the heap engine See more in the attachment and in the patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-8759) Family Delete Markers not getting purged after major compaction
[ https://issues.apache.org/jira/browse/HBASE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688988#comment-13688988 ] Lars Hofhansl edited comment on HBASE-8759 at 6/20/13 9:05 AM: --- NP. I should also be a bit more explicit about how family delete markers are actually deleted. The logic is this: Each compaction registers the timestamp of the oldest put in the created hfile. The next (major) compaction then removes all family markers that are older than the oldest put. So as long as the client does not keep backdating puts, eventually all family markers drop out of the compacted files. was (Author: lhofhansl): NP. I should also be a bit more explicit about how family delete markers are actually deleted. The logic is this: Each compaction registers the timestamp of the oldest put in the created hfile. The next (major) compaction then removes all family markers that are older than the oldest put. So as long as the client does not keep puts, eventually all family markers drop out of the compacted files. Family Delete Markers not getting purged after major compaction --- Key: HBASE-8759 URL: https://issues.apache.org/jira/browse/HBASE-8759 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.94.7 Reporter: Mujtaba Chohan Priority: Minor On a table with VERSIONS = '1', KEEP_DELETED_CELLS = 'true', Family Delete Markers do not get purged after a put + delete + major compaction cycle (they keep incrementing after every iteration). Following is the raw scan output after 10 iterations of put + delete + major compaction.
{code}
ROW  COLUMN+CELL
 A   column=CF:, timestamp=1371512706683, type=DeleteFamily
 A   column=CF:, timestamp=1371512706394, type=DeleteFamily
 A   column=CF:, timestamp=1371512706054, type=DeleteFamily
 A   column=CF:, timestamp=1371512705763, type=DeleteFamily
 A   column=CF:, timestamp=1371512705457, type=DeleteFamily
 A   column=CF:, timestamp=1371512705149, type=DeleteFamily
 A   column=CF:, timestamp=1371512704836, type=DeleteFamily
 A   column=CF:, timestamp=1371512704518, type=DeleteFamily
 A   column=CF:, timestamp=1371512704162, type=DeleteFamily
 A   column=CF:, timestamp=1371512703779, type=DeleteFamily
 A   column=CF:COL, timestamp=1371512706682, value=X
{code}
[~lhofhansl] Code to repro this issue: http://phoenix-bin.github.io/client/code/delete.java
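The purge logic Lars describes can be sketched as a tiny standalone model (hypothetical class and method names, not actual HBase code): each compaction records the oldest put timestamp of the hfile it writes, and the next major compaction drops every family delete marker older than that oldest put.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the purge rule described above; not actual HBase code.
// A family delete marker survives a major compaction only if it is at least as
// new as the oldest put recorded for the compacted files.
class FamilyMarkerPurge {
    static List<Long> purge(List<Long> markerTimestamps, long oldestPutTs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : markerTimestamps) {
            if (ts >= oldestPutTs) {
                kept.add(ts); // marker could still mask a put, so keep it
            }
            // markers older than the oldest put can no longer mask anything: drop
        }
        return kept;
    }
}
```

So as long as puts are not backdated, the oldest put timestamp keeps advancing and old markers eventually fall out, which is consistent with the issue being resolved as Not A Problem.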
[jira] [Resolved] (HBASE-8759) Family Delete Markers not getting purged after major compaction
[ https://issues.apache.org/jira/browse/HBASE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-8759. -- Resolution: Not A Problem Family Delete Markers not getting purged after major compaction --- Key: HBASE-8759 URL: https://issues.apache.org/jira/browse/HBASE-8759 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.94.7 Reporter: Mujtaba Chohan Priority: Minor On a table with VERSIONS = '1', KEEP_DELETED_CELLS = 'true', Family Delete Markers do not get purged after a put + delete + major compaction cycle (they keep incrementing after every iteration). Following is the raw scan output after 10 iterations of put + delete + major compaction.
{code}
ROW  COLUMN+CELL
 A   column=CF:, timestamp=1371512706683, type=DeleteFamily
 A   column=CF:, timestamp=1371512706394, type=DeleteFamily
 A   column=CF:, timestamp=1371512706054, type=DeleteFamily
 A   column=CF:, timestamp=1371512705763, type=DeleteFamily
 A   column=CF:, timestamp=1371512705457, type=DeleteFamily
 A   column=CF:, timestamp=1371512705149, type=DeleteFamily
 A   column=CF:, timestamp=1371512704836, type=DeleteFamily
 A   column=CF:, timestamp=1371512704518, type=DeleteFamily
 A   column=CF:, timestamp=1371512704162, type=DeleteFamily
 A   column=CF:, timestamp=1371512703779, type=DeleteFamily
 A   column=CF:COL, timestamp=1371512706682, value=X
{code}
[~lhofhansl] Code to repro this issue: http://phoenix-bin.github.io/client/code/delete.java
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689043#comment-13689043 ] Lars Hofhansl commented on HBASE-8721: -- If your clients use the timestamp not as a timestamp, can they add whatever their values are to the rowkey? bq. do you mean some users rely on the KEEP_DELETED_CELLS and want the feature Delete can mask puts that happen after the delete? Exactly. When KEEP_DELETED_CELLS is enabled you can do true time range queries in HBase. For example, you get the exact state of your data as of last week, or an hour ago, etc., even when data was deleted via delete markers. I think adding a global or maybe column family config option to change this behavior is fine, as long as the code does not get too convoluted. In that case we need to make sure that all other HBase features such as replication, WAL replay, as-of-time queries, bulk loading HFiles, etc. still work as expected. We also need to check that the HFile metadata is still correct, as the timerange of the included KVs is used to exclude HFiles from scans in some situations (if you put a Delete marker at MAX_LONG, this HFile would not be excluded for queries on new data, unless we add some other special logic). Even in that case I'd still be -0 on this (but I would no longer veto it with a -1) - this looks like a very app specific use case to me. You would need to find one or two committers who are ready to +1 this feature and patch to get it committed. Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims at the bug mentioned in http://hbase.apache.org/book.html section 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered.
Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even though it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do a delete and a put immediately after each other, and there is some chance they happen within the same millisecond.
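The masking behavior in the description can be condensed into a small model (a hedged sketch with invented names, not HBase's Store/scanner code): a tombstone at timestamp T masks any put with timestamp <= T, and a put at that timestamp only becomes effective again after a major compaction has removed both the masked put and the tombstone.

```java
// Hypothetical single-cell model of delete-tombstone masking; not HBase code.
class TombstoneModel {
    private Long deleteTs = null; // tombstone timestamp, null when none present
    private Long putTs = null;    // latest put timestamp, null when none survives

    void put(long ts) { putTs = ts; }
    void delete(long ts) { deleteTs = ts; }

    // A get sees the put only when no tombstone covers it.
    boolean getSeesValue() {
        return putTs != null && (deleteTs == null || putTs > deleteTs);
    }

    // Major compaction drops the masked puts and then the tombstone itself.
    void majorCompact() {
        if (deleteTs != null && putTs != null && putTs <= deleteTs) {
            putTs = null;
        }
        deleteTs = null;
    }
}
```

A delete at T=10 followed by a put at T=10 leaves the get empty; only after majorCompact() does a fresh put at T=10 become visible again, matching the "it will start working again after the major compaction has run" remark.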
[jira] [Commented] (HBASE-8060) Num compacting KVs diverges from num compacted KVs over time
[ https://issues.apache.org/jira/browse/HBASE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689047#comment-13689047 ] Lars Hofhansl commented on HBASE-8060: -- What is the meaning of totalCompactingKVs? Overall total? Or just the total for the last compaction? Does the patch change the meaning by resetting to current compacted count? Num compacting KVs diverges from num compacted KVs over time Key: HBASE-8060 URL: https://issues.apache.org/jira/browse/HBASE-8060 Project: HBase Issue Type: Bug Components: Compaction, UI Affects Versions: 0.94.6, 0.95.0, 0.95.2 Reporter: Andrew Purtell Assignee: Sergey Shelukhin Attachments: HBASE-8060-v0.patch, screenshot.png I have been running what amounts to an ingestion test for a day or so. This is an all-in-one cluster launched with './bin/hbase master start' from sources. In the RS stats on the master UI, the num compacting KVs has diverged from num compacted KVs even though compaction has been completed from perspective of selection, no compaction tasks are running on the RS. I think this could be confusing -- is compaction happening or not? Or maybe I'm misunderstanding what this is supposed to show? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689055#comment-13689055 ] Nicolas Liochon commented on HBASE-6295: [~jmspaggi] I'm waiting for your feedback then. BTW, if you have time ( :-) ), publishing a comparison between the 0.95 without this patch and the 0.94 might be useful. I'm saying this because if we have a performance degradation vs. the 0.94, this patch will hide it... Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today the batch algo is:
{noformat}
for Operation o: List<Op> {
  add o to todolist
  if todolist >= maxsize or o last in list {
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
  }
}
{noformat}
We could: - create immediately the final object instead of an intermediate array - split per location immediately - instead of sending when the list as a whole is full, send when there is enough data for a single location It would be:
{noformat}
for Operation o: List<Op> {
  get location
  add o to location.todolist
  if (location.todolist >= maxLocationSize) {
    send location.todolist to region server
    clear location.todolist
    // don't wait, continue the loop
  }
}
send remaining
wait
{noformat}
It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
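The proposed per-location batching above can be sketched like this (a hypothetical class, a stand-in for the real client code; the `sent` map simulates the RPCs): operations are bucketed by region location as they arrive, and a location's list is sent as soon as that one location has enough data, rather than waiting for the global list to fill.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of per-location batching; not the actual HBase client code.
class PerLocationBatcher {
    final int maxLocationSize;
    final Map<String, List<String>> pending = new HashMap<>();
    final Map<String, List<List<String>>> sent = new HashMap<>(); // stands in for RPC calls

    PerLocationBatcher(int maxLocationSize) { this.maxLocationSize = maxLocationSize; }

    void add(String location, String op) {
        List<String> buf = pending.computeIfAbsent(location, k -> new ArrayList<>());
        buf.add(op);
        if (buf.size() >= maxLocationSize) {
            flush(location); // send without waiting; the loop continues
        }
    }

    void flush(String location) {
        List<String> buf = pending.remove(location);
        if (buf != null && !buf.isEmpty()) {
            sent.computeIfAbsent(location, k -> new ArrayList<>()).add(buf);
        }
    }

    void flushAll() { // the "send remaining" step at the end of the loop
        for (String location : new ArrayList<>(pending.keySet())) {
            flush(location);
        }
    }
}
```

The design point is that a full buffer for one region server no longer blocks on operations destined for other servers, which is what makes the "send in background" part possible.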
[jira] [Updated] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu updated HBASE-8667: -- Attachment: HBASE-8667_trunk_v5.patch [~stack] InetSocketAddress.createUnresolved is creating unresolved socket address which can be used only in some circumstances like connecting through proxy. Any way avoided extra resolving by passing InetAddress instead of hostname. {code} +rpcClient = new RpcClient(conf, clusterId, new InetSocketAddress( +this.isa.getAddress(), 0)); {code} Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch While testing HBASE-8640 fix found that master and regionserver running on different interfaces are not communicating properly. I have two interfaces 1) lo 2) eth0 in my machine and default hostname interface is lo. I have configured master ipc address to ip of eth0 interface. Started master and regionserver on the same machine. 1) master rpc server bound to eth0 and RS rpc server bound to lo 2) Since rpc client is not binding to any ip address, when RS is reporting RS startup its getting registered with eth0 ip address(but actually it should register localhost) Here are RS logs: {code} 2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying. 
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008 2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851 2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100 {code} Here are master logs: {code} 2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544 {code} Since master has wrong rpc server address of RS, META is not getting assigned. 
{code} 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false - org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
[jira] [Commented] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689068#comment-13689068 ] Anoop Sam John commented on HBASE-8773: --- So you don't want to use the compression based on the config param. There is some compression scheme already for the HTable and you just want to continue with that? Can be setup the COMPRESSION base on HTable in meta or user set in Configuration Key: HBASE-8773 URL: https://issues.apache.org/jira/browse/HBASE-8773 Project: HBase Issue Type: New Feature Components: HFile Affects Versions: 0.94.8 Reporter: jay wong Fix For: 0.94.9 Attachments: HBASE-8773.patch When I wanted to create HFiles with ImportTsv, I found that whether or not I set the compression in the Configuration, it is always ignored. That is because the method 'configureIncrementalLoad' in HFileOutputFormat will set the compression from the HTable in meta. So add a configuration to switch between using the compression from the HTable or not.
[jira] [Created] (HBASE-8774) Add BatchSize and Filter to Thrift2
Hamed Madani created HBASE-8774: --- Summary: Add BatchSize and Filter to Thrift2 Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Improvement Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Priority: Minor Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hamed Madani updated HBASE-8774: Attachment: HBASE_8774.patch Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Improvement Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Priority: Minor Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689094#comment-13689094 ] Feng Honghua commented on HBASE-8755: - If possible, would anybody else help do the same comparison test as Chunhui/me? Thanks in advance. [~lhofhansl] [~yuzhih...@gmail.com] [~sershe] [~stack] A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch In the current write model, each write handler thread (executing put()) will individually go through a full 'append (hlog local buffer) -> HLog writer append (write to hdfs) -> HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy contention on updateLock and flushLock. The only existing optimization (checking whether the current syncTillHere >= txid, in the expectation that some other thread has already helped write/sync this txid to hdfs, and omitting the write/sync in that case) actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X throughput improvement (from 17000 to 7+). I applied this new write thread model in HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row-size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput then was 31002). I can provide the detailed performance test results if anyone is interested.
The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer (each notifies the AsyncWriter thread that there are new edits in the local buffer);
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid;
3. A single AsyncWriter thread is responsible for retrieving all the buffered edits in HLog's local pending buffer and writing them to hdfs (hlog.writer.append); it notifies the AsyncFlusher thread that there are new writes to hdfs that need a sync;
4. A single AsyncFlusher thread is responsible for issuing a sync to hdfs to persist the writes by the AsyncWriter; it notifies the AsyncNotifier thread that the sync watermark has increased;
5. A single AsyncNotifier thread is responsible for notifying all pending put handler threads that are waiting in the HLog.syncer() function;
6. No LogSyncer thread any more (the AsyncWriter/AsyncFlusher threads always do the same job it did).
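The steps above can be collapsed into a single-threaded toy model (invented names; the real patch runs AsyncWriter/AsyncFlusher/AsyncNotifier as separate threads with notifications between them): handlers append edits to a pending buffer, the writer drains the buffer, the flusher advances the synced watermark, and a handler's sync is complete once the watermark reaches its txid.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the proposed HLog pipeline; not the patch itself.
class HlogPipelineModel {
    private long nextTxid = 0;    // assigned on append
    private long writtenTxid = 0; // highest txid handed to the filesystem
    private long syncedTxid = 0;  // highest txid persisted (the sync watermark)
    private final List<Long> buffer = new ArrayList<>(); // HLog's local pending buffer

    long append() {               // step 1: a handler appends an edit
        long txid = ++nextTxid;
        buffer.add(txid);
        return txid;
    }

    void asyncWriterDrain() {     // step 3: write all buffered edits at once
        if (!buffer.isEmpty()) {
            writtenTxid = buffer.get(buffer.size() - 1);
            buffer.clear();
        }
    }

    void asyncFlusherSync() {     // step 4: sync, advancing the watermark
        syncedTxid = writtenTxid;
    }

    boolean syncComplete(long txid) { // steps 2 and 5: a handler's wait condition
        return syncedTxid >= txid;
    }
}
```

Because one drain plus one sync covers every txid buffered so far, many handlers' edits share a single append/sync round trip, which is where the throughput gain over the per-write cycle comes from.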
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689098#comment-13689098 ] Hangjun Ye commented on HBASE-8721: --- Nice to know adding a config is acceptable at least! You pointed out many features that we need to be careful not to break; we should do that as you suggested. Back to the KEEP_DELETED_CELLS feature, my perception is that even if we disable "Delete can mask puts that happen after the delete" (whether by a config or by other means), KEEP_DELETED_CELLS still works as you expect. Sounds like they are basically independent features? Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims at the bug mentioned in http://hbase.apache.org/book.html section 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even though it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do a delete and a put immediately after each other, and there is some chance they happen within the same millisecond.
[jira] [Updated] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hamed Madani updated HBASE-8774: Status: Patch Available (was: Open) Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Improvement Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Priority: Minor Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hamed Madani updated HBASE-8774: Issue Type: New Feature (was: Improvement) Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: New Feature Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Priority: Minor Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hamed Madani updated HBASE-8774: Priority: Major (was: Minor) Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: New Feature Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689107#comment-13689107 ] Jieshan Bean commented on HBASE-8774: - See HBASE-6073. It's also about adding filter support to Thrift2. One minor problem in the patch:
{code}
+boolean this_present_filterString = true && this.isSetFilterString();
+boolean that_present_filterString = true && that.isSetFilterString();
{code}
The 'true &&' is redundant. In addition, I suggest adding a unit test. Anyway, it's a nice patch. Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: New Feature Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2
[jira] [Commented] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689108#comment-13689108 ] Anoop Sam John commented on HBASE-8627: --- [~jmhsieh], [~jxiang], [~sershe] comments? HBCK can not fix meta not assigned issue Key: HBASE-8627 URL: https://issues.apache.org/jira/browse/HBASE-8627 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-8627_Trunk.patch, HBASE-8627_Trunk-V2.patch When the meta table region is not assigned to any RS, an HBCK run will get an exception. I can see code added in checkMetaRegion() to solve this issue but it won't work. It still refers to the ROOT region!
[jira] [Commented] (HBASE-8705) RS holding META when restarted in a single node setup may hang infinitely without META assignment
[ https://issues.apache.org/jira/browse/HBASE-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689139#comment-13689139 ] ramkrishna.s.vasudevan commented on HBASE-8705: --- I am not able to reproduce this scenario every time. But after seeing the logs, a restart of the RS will not solve the problem, because the META location is already unset in ZK. I will commit this patch unless there are objections. RS holding META when restarted in a single node setup may hang infinitely without META assignment - Key: HBASE-8705 URL: https://issues.apache.org/jira/browse/HBASE-8705 Project: HBase Issue Type: Bug Affects Versions: 0.95.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Priority: Minor Fix For: 0.98.0 Attachments: HBASE-8705_1.patch, HBASE-8705_2.patch, HBASE-8705.patch This bug may be minor, as it is likely to happen only in a single-node setup. I restarted the RS holding META. The master tried assigning META using MetaSSH, but tried this before the new RS came up. So, as no region plan is found,
{code}
if (plan == null) {
  LOG.warn("Unable to determine a plan to assign " + region);
  if (tomActivated) {
    this.timeoutMonitor.setAllRegionServersOffline(true);
  } else {
    regionStates.updateRegionState(region, RegionState.State.FAILED_OPEN);
  }
  return;
}
{code}
we just return without assignment. And this being META, the small cluster just hangs.
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689160#comment-13689160 ] Feng Honghua commented on HBASE-8721: - Let me list some merits of the behavior 'deletes can't mask puts that happen after the delete':
1) It can avoid the inconsistency I mentioned above; with our patch, the user can always read the put written at step 4. It's more natural and intuitive:
1. put a kv (timestamp = T0), and flush;
2. delete that kv using a DeleteColumn type kv with timestamp T0 (or any timestamp >= T0), and flush;
3. a major compact occurs [or not];
4. put that kv again (timestamp = T0);
5. read that kv;
==> a) if a major compact occurs at step 3, then step 5 will get the put written at step 4; b) if no major compact occurs at step 3, then step 5 gets nothing.
2) It can provide a strong guarantee for this operation: I don't know which (or how many) versions are in a cell; I just want to remove all the existing ones, put a new version into the cell, and ensure only this new put is in the cell, regardless of how its timestamp compares with the old existing ones (I think this operation/guarantee is useful in many scenarios). The current delete behavior can't provide such a guarantee.
3) 'delete latest version' (deleteColumn() without a ts) could be tuned to drop the read it currently performs (to find the latest version's ts) during 'deleteColumn'. The current delete behavior can't be tuned to drop that read.
4) 'a new put can't be masked (made to disappear) by an old/existing delete' is itself a merit for many use cases/applications, since it's more natural and intuitive. I have explained the old version/delete semantics to different customers many times, and without exception their first response is: that's weird... why so?
Per my understanding, contrary to [~lhofhansl] and [~sershe], 'timestamp' is just a long used to determine version ordering by the rule 'the bigger/later wins'. It just happens that a time-semantic timestamp is a long, that a new put with the 'current' timestamp has a bigger timestamp, and that in most cases new put versions therefore knock out older ones. For many use cases the time semantic of 'timestamp' is enough for real-world requirements, but by design that's not always the case; otherwise the timestamp wouldn't be exposed for the user to set explicitly. In a word, as long as the user knows 'timestamp' is only a long-typed dimension that determines version ordering by the rule 'the bigger wins', he can reason out the result of any sequence of operations. In essence, 'timestamp as a dimension for version ordering' isn't related to delete semantics. -- I know my understanding is arguable for many guys, since the old delete semantics and behavior have existed for so long and everybody has already taken them for granted (I mean no offence here). At last, I also list the downsides of the optional solutions proposed to me: A) 'KEEP_DELETED_CELLS' is definitely a nice feature, but many users don't need it (to time-travel or trace back action history), and it prevents major compaction from shrinking the data set by collecting deleted cells. B) Disallowing users to explicitly set timestamps limits HBase's schema flexibility, prohibits many innovative designs such as Facebook's message search index, and still can't guarantee unique timestamps, hence can still lead to tricky/confusing behavior.
Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch This fix aims for the bug mentioned in http://hbase.apache.org/book.html 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even if it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do delete and put immediately after each other, and there is some chance they happen within the same millisecond.
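The masking sequence described above can be simulated with a small, self-contained model (plain Java, no HBase dependency; the class and method names are illustrative, not the HBase API): a DeleteColumn tombstone at timestamp T masks every put with timestamp <= T until a major compaction purges both the masked cells and the tombstone itself.

```java
import java.util.*;

/** Toy model of the delete-tombstone semantics discussed above (names are
 *  hypothetical, not HBase API): a tombstone at timestamp T masks every
 *  put with timestamp <= T until a major compaction runs. */
public class TombstoneModel {
    private static final class Cell {
        final long ts; final String value;
        Cell(long ts, String value) { this.ts = ts; this.value = value; }
    }

    private final List<Cell> cells = new ArrayList<>();
    private final List<Long> tombstones = new ArrayList<>();

    public void put(long ts, String value) { cells.add(new Cell(ts, value)); }

    public void deleteColumn(long ts) { tombstones.add(ts); }

    /** Major compaction drops the masked cells AND the tombstones themselves,
     *  which is exactly why the read result flips after it runs. */
    public void majorCompact() {
        cells.removeIf(c -> isMasked(c.ts));
        tombstones.clear();
    }

    private boolean isMasked(long ts) {
        for (long t : tombstones) if (ts <= t) return true;
        return false;
    }

    /** Latest visible version, or null if everything is masked. */
    public String get() {
        Cell best = null;
        for (Cell c : cells)
            if (!isMasked(c.ts) && (best == null || c.ts > best.ts)) best = c;
        return best == null ? null : best.value;
    }
}
```

With this model, put(T0) / deleteColumn(T0) / put(T0) / get() returns null, while running majorCompact() between the delete and the second put makes get() return the new value — the read inconsistency the comment describes.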
[jira] [Commented] (HBASE-8617) Introducing a new config to disable writes during recovering
[ https://issues.apache.org/jira/browse/HBASE-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689174#comment-13689174 ] Hudson commented on HBASE-8617: --- Integrated in hbase-0.95-on-hadoop2 #139 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/139/]) HBASE-8617: Introducing a new config to disable writes during recovering (Revision 1494814) Result = FAILURE jeffreyz : Files : * /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java Introducing a new config to disable writes during recovering - Key: HBASE-8617 URL: https://issues.apache.org/jira/browse/HBASE-8617 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.98.0, 0.95.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: HBASE-8617.patch, HBASE-8617-v2.patch, hbase-8617-v3.patch In distributedLogReplay (HBASE-7006), we allow writes even when a region is in recovering. This may cause undesired behavior when applications (or deployments) are already near their write capacity, because distributedLogReplay generates more write traffic to the remaining region servers. The new config hbase.regionserver.disallow.writes.when.recovering tries to address the above situation so that recovery won't be affected by the application's normal write traffic. The default value of this config is false (meaning writes are allowed during recovery).
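For reference, enabling the new behavior would look roughly like this in hbase-site.xml (a sketch; the property name and its false default are taken from the issue description above):

```xml
<!-- Reject client writes while a region is recovering under
     distributedLogReplay. Default: false (writes allowed in recovery). -->
<property>
  <name>hbase.regionserver.disallow.writes.when.recovering</name>
  <value>true</value>
</property>
```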
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689196#comment-13689196 ] Jean-Marc Spaggiari commented on HBASE-6295: Tests crashed yesterday because of some obscure ZK reasons... So I had to restart them. It should be done now. I will add 0.95 to the list and run it, which means I should have all the results this evening (EST). I will take the required time to provide the feedback today. Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch Today's batch algo is:
{noformat}
for Operation o : ListOp {
  add o to todolist
  if todolist > maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
We could:
- create immediately the final object instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location
It would be:
{noformat}
for Operation o : ListOp {
  get location
  add o to location.todolist
  if (location.todolist > maxLocationSize)
    send location.todolist to region server
    clear location.todolist
  // don't wait, continue the loop
}
send remaining
wait
{noformat}
It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes.
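The proposed loop can be sketched as follows (a simplified, HBase-free illustration; `LocationBatcher` and `locate()` are hypothetical stand-ins for the client's region lookup and RPC send):

```java
import java.util.*;

/** Sketch of per-location batching: a buffer is sent as soon as ONE
 *  location fills up, instead of waiting for the global list to fill. */
public class LocationBatcher {
    private final int maxLocationSize;
    private final Map<String, List<String>> perLocation = new HashMap<>();
    private final List<List<String>> sent = new ArrayList<>(); // stands in for RPCs

    public LocationBatcher(int maxLocationSize) { this.maxLocationSize = maxLocationSize; }

    /** Stand-in for the region lookup; here: first character of the row key. */
    private String locate(String op) { return op.substring(0, 1); }

    public void add(String op) {
        List<String> buf = perLocation.computeIfAbsent(locate(op), k -> new ArrayList<>());
        buf.add(op);
        if (buf.size() >= maxLocationSize) {  // one location is full: send, don't wait
            sent.add(new ArrayList<>(buf));
            buf.clear();
        }
    }

    /** "send remaining" step at the end of the loop. */
    public void flushRemaining() {
        for (List<String> buf : perLocation.values())
            if (!buf.isEmpty()) { sent.add(new ArrayList<>(buf)); buf.clear(); }
    }

    public List<List<String>> sentBatches() { return sent; }
}
```

The real client would additionally share a retry list with the still-buffered operations, as the comment notes; that error handling is omitted here.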
[jira] [Commented] (HBASE-8617) Introducing a new config to disable writes during recovering
[ https://issues.apache.org/jira/browse/HBASE-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689227#comment-13689227 ] Hudson commented on HBASE-8617: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #574 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/574/]) HBASE-8617: Introducing a new config to disable writes during recovering (Revision 1494804) Result = FAILURE jeffreyz : Files : * /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java Introducing a new config to disable writes during recovering - Key: HBASE-8617 URL: https://issues.apache.org/jira/browse/HBASE-8617 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.98.0, 0.95.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: HBASE-8617.patch, HBASE-8617-v2.patch, hbase-8617-v3.patch In distributedLogReplay (HBASE-7006), we allow writes even when a region is in recovering. This may cause undesired behavior when applications (or deployments) are already near their write capacity, because distributedLogReplay generates more write traffic to the remaining region servers. The new config hbase.regionserver.disallow.writes.when.recovering tries to address the above situation so that recovery won't be affected by the application's normal write traffic. The default value of this config is false (meaning writes are allowed during recovery).
[jira] [Commented] (HBASE-8753) Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp
[ https://issues.apache.org/jira/browse/HBASE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689229#comment-13689229 ] Feng Honghua commented on HBASE-8753: - [~lhofhansl] For backwards compatibility, when an old RS processes a DeleteFamilyVersion type kv (either written from a new client, or in the two scenarios you mentioned regarding rolling restart), the DeleteFamilyVersion can enter ScanDeleteTracker, and the only effect it has is that, when there is no DeleteColumn for the null column with the same timestamp as this DeleteFamilyVersion, it can delete the KV (column=null) with the same timestamp (a bit like a Delete(DeleteVersion) with the same timestamp); it has no other side effect. In summary: DeleteFamilyVersion masks all the versions with a given timestamp under a CF, and when an old RS receives it (written from a new client, or in the two scenarios mentioned regarding rolling restart), the old RS treats it like a Delete(DeleteVersion) for the null column. Nothing else. I think this side effect is acceptable. Your opinion? Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp --- Key: HBASE-8753 URL: https://issues.apache.org/jira/browse/HBASE-8753 Project: HBase Issue Type: New Feature Components: Deletes Reporter: Feng Honghua Attachments: HBASE-8753-0.94-V0.patch, HBASE-8753-trunk-V0.patch In one of our production scenarios (Xiaomi message search), multiple cells are put in batch using the same timestamp, with different column names, under a specific column-family. After some time these cells also need to be deleted in batch, given a specific timestamp. But the column names are parsed tokens which can be arbitrary words, so such a batch delete is impossible without first retrieving all KVs from that CF, building the list of columns that have a KV with that given timestamp, and then issuing an individual deleteColumn for each column in that list.
Though it's possible to do such a batch delete, its performance is poor, and customers also find their code quite clumsy: first retrieving and populating the column list, then issuing a deleteColumn for each column in that list. This feature resolves the problem by introducing a new delete flag: DeleteFamilyVersion. 1) When you need to delete all KVs under a column-family with a given timestamp, just call Delete.deleteFamilyVersion(cfName, timestamp); only a DeleteFamilyVersion type KV is put to HBase (like DeleteFamily / DeleteColumn / Delete), without any read operation. 2) Like other delete types, DeleteFamilyVersion takes effect in get/scan/flush/compact operations; the ScanDeleteTracker now parses out and uses DeleteFamilyVersion to prevent all KVs under the specific CF that have the same timestamp as the DeleteFamilyVersion KV from popping up as part of a get/scan result (and likewise in flush/compact). Our customers find this feature efficient, clean and easy to use, since it does its work without knowing the exact list of column names that need to be deleted. This feature has been running smoothly for a couple of months in our production clusters.
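The effect of the proposed marker can be illustrated with a tiny model (plain Java, illustrative names only, not the actual patch): a single DeleteFamilyVersion marker masks every column's cell in the family that carries exactly the marker's timestamp, with no prior read and no per-column markers.

```java
import java.util.*;

/** Toy model of the proposed DeleteFamilyVersion marker (hypothetical
 *  names, not the HBase API): one marker per timestamp masks every
 *  column's cell in the family at exactly that timestamp. */
public class DeleteFamilyVersionModel {
    private static final class KV {
        final String column; final long ts; final String value;
        KV(String column, long ts, String value) {
            this.column = column; this.ts = ts; this.value = value;
        }
    }

    private final List<KV> kvs = new ArrayList<>();
    private final Set<Long> familyVersionDeletes = new HashSet<>();

    public void put(String column, long ts, String value) {
        kvs.add(new KV(column, ts, value));
    }

    /** No read, no column list: a single marker covers all columns. */
    public void deleteFamilyVersion(long ts) { familyVersionDeletes.add(ts); }

    /** What a scan would surface: cells whose exact timestamp is not marked. */
    public List<String> scanValues() {
        List<String> out = new ArrayList<>();
        for (KV kv : kvs)
            if (!familyVersionDeletes.contains(kv.ts)) out.add(kv.value);
        return out;
    }
}
```

This captures why the feature avoids the retrieve-then-deleteColumn round trip: the arbitrary token column names never need to be enumerated.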
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689305#comment-13689305 ] stack commented on HBASE-8755: -- [~jmspaggi] You want to set up a rig to test this one? A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch In the current write model, each write handler thread (executing put()) will individually go through a full 'append (hlog local buffer) => HLog writer append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy race conditions on updateLock and flushLock. The only optimization — checking whether the current syncTillHere >= txid, in the expectation that another thread has already written/synced this txid to hdfs, and then omitting the write/sync — actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X improvement in throughput (from 17000 to 7+). I applied this new write thread model in HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row-size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput then was 31002). I can provide the detailed performance test results if anyone is interested.
The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer (and notify the AsyncWriter thread that there are new edits in the local buffer);
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid;
3. A single AsyncWriter thread is responsible for retrieving all the buffered edits in HLog's local pending buffer and writing them to hdfs (hlog.writer.append), then notifying the AsyncFlusher thread that there are new writes to hdfs that need a sync;
4. A single AsyncFlusher thread is responsible for issuing a sync to hdfs to persist the writes made by AsyncWriter (and notifying the AsyncNotifier thread that the sync watermark has increased);
5. A single AsyncNotifier thread is responsible for notifying all pending put handler threads that are waiting in the HLog.syncer() function;
6. No LogSyncer thread any more (the AsyncWriter/AsyncFlusher threads always do the job it did).
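The steps above can be condensed into a runnable sketch (plain Java, no HDFS; all names are illustrative, and the AsyncWriter, AsyncFlusher, and AsyncNotifier stages are folded into a single background thread here — the real patch keeps them separate):

```java
import java.util.*;
import java.util.concurrent.atomic.AtomicLong;

/** Minimal sketch of the group-commit pipeline: handlers append to a local
 *  buffer and block; one background thread drains the buffer ("append"),
 *  "syncs", advances the watermark, and wakes the waiting handlers. */
public class AsyncWalSketch {
    private final List<Long> buffer = new ArrayList<>();   // pending txids
    private final Object bufferLock = new Object();
    private final Object syncLock = new Object();
    private final AtomicLong nextTxid = new AtomicLong();
    private volatile long syncedTillHere = 0;
    private volatile boolean running = true;

    private final Thread background = new Thread(() -> {
        while (running) {
            long highest;
            synchronized (bufferLock) {
                while (buffer.isEmpty() && running) {
                    try { bufferLock.wait(); } catch (InterruptedException e) { return; }
                }
                if (!running) return;
                highest = Collections.max(buffer); // writer.append() would go here
                buffer.clear();
            }
            // hdfs sync would go here; then advance the watermark and notify
            synchronized (syncLock) {
                syncedTillHere = Math.max(syncedTillHere, highest);
                syncLock.notifyAll();              // the AsyncNotifier's job
            }
        }
    });

    public AsyncWalSketch() { background.start(); }

    /** Called by put-handler threads: append an edit, then block until the
     *  sync covering its txid has completed (steps 1 and 2 above). */
    public long append() {
        long txid = nextTxid.incrementAndGet();
        synchronized (bufferLock) { buffer.add(txid); bufferLock.notify(); }
        synchronized (syncLock) {
            while (syncedTillHere < txid) {
                try { syncLock.wait(); }
                catch (InterruptedException e) { throw new RuntimeException(e); }
            }
        }
        return txid;
    }

    public void shutdown() {
        running = false;
        synchronized (bufferLock) { bufferLock.notify(); }
        background.interrupt();
    }
}
```

The point of the design is that many handler txids ride on one append+sync, so the per-write lock contention of the old model disappears.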
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689306#comment-13689306 ] stack commented on HBASE-8667: -- [~rajesh23] v5 still has this:
{code}
-rpcClient = new RpcClient(conf, clusterId);
+rpcClient = new RpcClient(conf, clusterId, new InetSocketAddress(
+    this.isa.getAddress(), 0));
{code}
Can you not do rpcClient = new RpcClient(conf, clusterId, this.isa)? Thanks for doing this fixup. Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. I have two interfaces, 1) lo and 2) eth0, on my machine, and the default hostname interface is lo. I configured the master ipc address to the ip of the eth0 interface, then started the master and a regionserver on the same machine. 1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo. 2) Since the rpc client is not binding to any ip address, when the RS reports its startup it gets registered with the eth0 ip address (but actually it should register localhost). Here are the RS logs: {code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851
2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100
{code}
Here are the master logs:
{code}
2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544
{code}
Since the master has the wrong rpc server address of the RS, META is not getting assigned.
{code}
2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false
- org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
  at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813)
  at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422)
  at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315)
  at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
  at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587)
  at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
  at
[jira] [Commented] (HBASE-8701) distributedLogReplay need to apply wal edits in the receiving order of those edits
[ https://issues.apache.org/jira/browse/HBASE-8701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689316#comment-13689316 ] stack commented on HBASE-8701: -- bq. The sequence ids of hfile are intact as before. But some can be -ve? So they will be out of order? (I don't see special handling in v7 -- I may have missed it). Thanks. distributedLogReplay need to apply wal edits in the receiving order of those edits -- Key: HBASE-8701 URL: https://issues.apache.org/jira/browse/HBASE-8701 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8701-v3.txt, hbase-8701-v4.patch, hbase-8701-v5.patch, hbase-8701-v6.patch, hbase-8701-v7.patch This issue happens in distributedLogReplay mode when recovering multiple puts of the same key + version (timestamp). After replay, the value of the key is nondeterministic.
h5. The original concern situation raised by [~eclark]:
For all edits the rowkey is the same. There's a log with: [ A (ts = 0), B (ts = 0) ]
Replay the first half of the log.
A user puts in C (ts = 0).
Memstore has to flush.
A new Hfile will be created with [ C, A ] and MaxSequenceId = C's seqid.
Replay the rest of the log.
Flush.
The issue will happen in similar situations, like Put(key, t=T) in WAL1 and Put(key, t=T) in WAL2.
h5. Below is the option (proposed by Ted) I'd like to use:
a) During replay, we pass the original wal sequence number of each edit to the receiving RS
b) In the receiving RS, we store the negative original sequence number of wal edits in the mvcc field of the KVs of wal edits
c) Add handling of negative MVCC in KVScannerComparator and KVComparator
d) In the receiving RS, write the original sequence number into an optional field of the wal file for the chained RS failure situation
e) When opening a region, we add a safety bumper (a large number) so that the new sequence number of a newly opened region does not collide with old sequence numbers.
In the future, when we store sequence numbers along with KVs, we can adjust the above solution a little by avoiding overloading the MVCC field.
h5. The other alternative options are listed below for reference:
Option one:
a) disallow writes during recovery
b) during replay, we pass the original wal sequence ids
c) hold flush till all wals of a recovering region are replayed. The memstore should hold, because we only recover unflushed wal edits. For edits with the same key + version, whichever has the larger sequence id wins.
Option two:
a) during replay, we pass the original wal sequence ids
b) for each wal edit, we store the edit's original sequence id along with its key
c) during scanning, we use the original sequence id if it's present, otherwise its store file sequence id
d) compaction can just leave the put with the max sequence id
Please let me know if you have better ideas.
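One plausible reading of steps (b) and (c) — storing negative original sequence numbers in the mvcc field and teaching the comparators about them — is sketched below (hypothetical, not the actual patch): fresh edits (positive mvcc, post safety bump) are considered newer than replayed ones, and replayed edits keep their original WAL order among themselves.

```java
/** Hypothetical sketch of an mvcc comparison rule when negative values
 *  encode original WAL sequence numbers of replayed edits. */
public class MvccCompare {
    /** Returns >0 if a should be considered newer than b, <0 if older. */
    public static int compareMvcc(long a, long b) {
        boolean aReplayed = a < 0, bReplayed = b < 0;
        if (aReplayed != bReplayed) return aReplayed ? -1 : 1; // fresh beats replayed
        if (aReplayed) return Long.compare(-a, -b);            // original WAL order
        return Long.compare(a, b);                             // normal mvcc order
    }
}
```

Under this rule, stack's concern translates to: negative values do sort "out of order" relative to plain longs, which is exactly why the comparators need the special case.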
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689329#comment-13689329 ] rajeshbabu commented on HBASE-8667: --- If we use this.isa directly, we will get a BindException because the rpc server is already bound to that port (60010). Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. I have two interfaces, 1) lo and 2) eth0, on my machine, and the default hostname interface is lo. I configured the master ipc address to the ip of the eth0 interface, then started the master and a regionserver on the same machine. 1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo. 2) Since the rpc client is not binding to any ip address, when the RS reports its startup it gets registered with the eth0 ip address (but actually it should register localhost). Here are the RS logs: {code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851
2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100
{code}
Here are the master logs:
{code}
2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544
{code}
Since the master has the wrong rpc server address of the RS, META is not getting assigned.
{code}
2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false
- org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
  at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813)
  at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422)
  at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315)
  at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
  at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587)
  at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
  at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627)
  at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826)
  at
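The core idea of the fix — bind the client socket to the server's own interface but an ephemeral port, so the peer sees the right address — can be illustrated with plain java.net (a hypothetical helper, not HBase code; the actual patch passes the address into RpcClient):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.*;

/** Illustrative helper: bind an outgoing socket to a chosen local interface
 *  before connecting, so the remote peer sees that interface's address.
 *  Port 0 picks an ephemeral port, which avoids the BindException you would
 *  get by reusing the server's already-bound port. */
public class BoundClientSocket {
    public static Socket connect(InetAddress localIface,
                                 InetSocketAddress remote, int timeoutMs) {
        try {
            Socket s = new Socket();
            s.bind(new InetSocketAddress(localIface, 0)); // interface fixed, port ephemeral
            s.connect(remote, timeoutMs);
            return s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This mirrors rajeshbabu's point above: passing this.isa as-is would try to rebind the server's port, hence the `new InetSocketAddress(this.isa.getAddress(), 0)` form in the patch.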
[jira] [Commented] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689404#comment-13689404 ] Jimmy Xiang commented on HBASE-8627: For #deleteMetaRegion, do you plan to use the last two parameters? HBCK can not fix meta not assigned issue Key: HBASE-8627 URL: https://issues.apache.org/jira/browse/HBASE-8627 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-8627_Trunk.patch, HBASE-8627_Trunk-V2.patch When the meta table region is not assigned to any RS, an HBCK run will get an exception. I can see code added in checkMetaRegion() to solve this issue, but it won't work. It still refers to the ROOT region!
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689417#comment-13689417 ] Francis Liu commented on HBASE-8015: [~saint@gmail.com], I'm leaning towards having the migration operation done manually by calling a script as well. Which options do we provide the user? Also it might be better if the script is portable enough that it can run on an existing 0.94 cluster, so users don't have to find out during the actual upgrade process. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689418#comment-13689418 ] stack commented on HBASE-8667: -- Ok. Makes sense. I am up for trying it. Thanks [~rajesh23]. Anyone else want to take a look? Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. My machine has two interfaces, 1) lo and 2) eth0, and the default hostname resolves to lo. I configured the master ipc address to the ip of the eth0 interface and started a master and a regionserver on the same machine. 1) the master rpc server is bound to eth0 and the RS rpc server is bound to lo 2) since the rpc client does not bind to any ip address, when the RS reports in at startup it gets registered with the eth0 ip address (but it should actually register localhost). Here are the RS logs:
{code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851
2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100
{code}
Here are the master logs:
{code}
2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544
{code}
Since the master has the wrong rpc server address for the RS, META is not getting assigned.
{code}
2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false
org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
  at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813)
  at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422)
  at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315)
  at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
  at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587)
  at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
  at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627)
  at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826)
  at
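The root cause described above, namely an rpc client that does not bind to a specific local interface, so the server registers whatever source address the OS picks, can be illustrated with a plain-socket sketch. This is standalone JDK code, not the HBase RpcClient; all names here are illustrative:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BoundClient {
    // Bind the client socket to a chosen local interface before connecting,
    // so the peer sees the intended source address rather than the OS default.
    public static String connectFrom(InetAddress local, InetSocketAddress server) throws Exception {
        try (Socket s = new Socket()) {
            s.bind(new InetSocketAddress(local, 0)); // pick the interface explicitly
            s.connect(server, 1000);
            return s.getLocalAddress().getHostAddress(); // the address the server sees
        }
    }

    // Self-contained demo against a loopback server socket.
    public static boolean demo() {
        InetAddress lo = InetAddress.getLoopbackAddress();
        try (ServerSocket ss = new ServerSocket(0, 1, lo)) {
            String seen = connectFrom(lo, new InetSocketAddress(lo, ss.getLocalPort()));
            return seen.equals(lo.getHostAddress());
        } catch (Exception e) {
            return false;
        }
    }
}
```

Binding before connecting is one way to address the mismatch: if the RS's rpc client bound to the same interface its own rpc server listens on, the master would register the correct address.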
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689419#comment-13689419 ] Francis Liu commented on HBASE-8015: I thought of a way to implement [~sershe]'s idea. It was simple enough so I thought I'd give it a try. Essentially, keep an in-memory list of tables which make use of the delimiter (ie '.') and consider these tables exceptions to the namespace rule, handling them properly to make sure they stay part of the default namespace. Have an added constraint that prevents creation of namespaces and tables that would conflict with any of the exception tables (ie ns1 and ns1.foo). The surprises here are:
- you can no longer create tables with the delimiter unless you create the appropriate namespace
- you can't create tables/namespaces which conflict with the exception tables/namespaces
- the exception list is derived by scanning the default namespace directories in .tmp, .data and .archive
Here's a sample of how it works. I've updated the TestNamespaceUpgrade test to verify that it works: https://github.com/francisliu/hbase_namespace/tree/core_8408_exception_list Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
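The exception-list constraint described in the comment above can be sketched roughly as follows. The class and method names are hypothetical, not the actual patch:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: pre-namespace tables containing the '.' delimiter stay
// in the default namespace, and new tables/namespaces must not conflict with them.
public class NamespaceRules {
    private final Set<String> exceptionTables = new HashSet<>(); // e.g. legacy "ns1.foo"

    public void addExceptionTable(String name) { exceptionTables.add(name); }

    // A new namespace conflicts if some legacy table starts with "<ns>."
    public boolean namespaceConflicts(String ns) {
        for (String t : exceptionTables) {
            if (t.startsWith(ns + ".")) return true;
        }
        return false;
    }

    // A new fully-qualified table conflicts if it collides with a legacy name.
    public boolean tableConflicts(String fqTableName) {
        return exceptionTables.contains(fqTableName);
    }
}
```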
[jira] [Commented] (HBASE-8060) Num compacting KVs diverges from num compacted KVs over time
[ https://issues.apache.org/jira/browse/HBASE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689424#comment-13689424 ] Sergey Shelukhin commented on HBASE-8060: - The meaning of total compacting KVs is the number of KVs in the input for the last-started compaction (this historically assumes one compaction at a time per store), i.e. either the current compaction or the last finished one. It is an estimate of the number of KVs the compaction will see, used to track progress. The patch does change its meaning, in order to reconcile the (often incorrect) estimate with reality. Num compacting KVs diverges from num compacted KVs over time Key: HBASE-8060 URL: https://issues.apache.org/jira/browse/HBASE-8060 Project: HBase Issue Type: Bug Components: Compaction, UI Affects Versions: 0.94.6, 0.95.0, 0.95.2 Reporter: Andrew Purtell Assignee: Sergey Shelukhin Attachments: HBASE-8060-v0.patch, screenshot.png I have been running what amounts to an ingestion test for a day or so. This is an all-in-one cluster launched with './bin/hbase master start' from sources. In the RS stats on the master UI, the num compacting KVs has diverged from num compacted KVs even though compaction has been completed from the perspective of selection and no compaction tasks are running on the RS. I think this could be confusing -- is compaction happening or not? Or maybe I'm misunderstanding what this is supposed to show? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689435#comment-13689435 ] Lars Hofhansl commented on HBASE-8721: -- KEEP_DELETED_CELLS would still work fine, but its main goal is to allow correct point-in-time queries, which among other things is important for consistent backups. Regarding all the points above: let's please not go overboard. Now we're extending this to Puts as well, and are saying that a Put that hits the RegionServer later should be considered newer even if its TS is old; this opens another can of worms. It is unlikely that this will be changed, as you have to find committers to +1 this. All we got up to this point are a -1 unless it is configurable and a couple of -0s. Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims for the bug mentioned in http://hbase.apache.org/book.html 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even if it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do delete and put immediately after each other, and there is some chance they happen within the same millisecond. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
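The masking behavior quoted from the reference guide can be modeled in a few lines. This is a toy single-cell model, not HBase code; here majorCompact() drops both the masked cells and the tombstone:

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of "deletes mask puts": a tombstone at time T hides any put with
// timestamp <= T until the tombstone is dropped by a major compaction.
public class TombstoneModel {
    private final TreeMap<Long, String> puts = new TreeMap<>(); // ts -> value
    private long deleteWatermark = Long.MIN_VALUE;              // tombstone ts

    public void put(long ts, String value) { puts.put(ts, value); }

    public void deleteUpTo(long ts) { deleteWatermark = Math.max(deleteWatermark, ts); }

    public String get() {
        Map.Entry<Long, String> newest = puts.lastEntry();
        if (newest == null || newest.getKey() <= deleteWatermark) return null; // masked
        return newest.getValue();
    }

    public void majorCompact() {
        puts.headMap(deleteWatermark, true).clear(); // drop masked cells...
        deleteWatermark = Long.MIN_VALUE;            // ...and the tombstone itself
    }
}
```

Note how a put that arrives after the delete but carries the same timestamp reads back as null until the compaction runs, which is exactly the surprise the JIRA describes.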
[jira] [Commented] (HBASE-8771) ensure replication_scope's value is either local(0) or global(1)
[ https://issues.apache.org/jira/browse/HBASE-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689439#comment-13689439 ] Demai Ni commented on HBASE-8771: - [~ctrezzo], sorry that I didn't explain clearly the first time. Although setScope() is currently only called in the column descriptor constructor, it is a public method, so a user can do this:
{code}
...
HTableDescriptor ht = new HTableDescriptor("t3_dn");
HColumnDescriptor cfd = new HColumnDescriptor("cf1");
cfd.setScope(-1000);
ht.addFamily(cfd);
...
{code}
So if the checking is put inside the constructor (similar to the logic for minVersions and maxVersions), the above code will not be caught. ensure replication_scope's value is either local(0) or global(1) Key: HBASE-8771 URL: https://issues.apache.org/jira/browse/HBASE-8771 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.8 Reporter: Demai Ni Priority: Minor Fix For: 0.94.9 Attachments: HBASE-8771-0.94.8-v0.patch For replication_scope, only two values are meaningful:
{code}
public static final int REPLICATION_SCOPE_LOCAL = 0;
public static final int REPLICATION_SCOPE_GLOBAL = 1;
{code}
However, there is no checking for that, so currently a user can set it to any integer value, and all non-zero values will be treated as 1(GLOBAL). This jira is to add a check in HColumnDescriptor#setScope() so that only 0 and 1 will be accepted during create_table or alter_table. In the future, we can leverage replication_scope to store more info. For example: -1: a columnfam is replicated from another cluster in a MASTER_SLAVE setup (i.e. readonly); 2: a columnfam is set MASTER_MASTER. Probably a major improvement JIRA is needed for the future usage. It will be good to ensure the scope value at this moment. 
{code:title=Testing|borderStyle=solid}
hbase(main):002:0> create 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>2}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
hbase(main):004:0> alter 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>-1}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
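A standalone sketch of the kind of check this JIRA adds to HColumnDescriptor#setScope(). This is illustrative only; the real method sets a field on the descriptor rather than returning the value:

```java
// Sketch of the replication-scope validation described in the JIRA.
public class ScopeCheck {
    public static final int REPLICATION_SCOPE_LOCAL = 0;
    public static final int REPLICATION_SCOPE_GLOBAL = 1;

    public static int checkScope(int scope) {
        if (scope != REPLICATION_SCOPE_LOCAL && scope != REPLICATION_SCOPE_GLOBAL) {
            throw new IllegalArgumentException(
                "Replication Scope must be either 0(local) or 1(global)");
        }
        return scope;
    }

    // Helper for demonstration: does checkScope reject this value?
    public static boolean rejects(int scope) {
        try { checkScope(scope); return false; }
        catch (IllegalArgumentException e) { return true; }
    }
}
```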
[jira] [Commented] (HBASE-8753) Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp
[ https://issues.apache.org/jira/browse/HBASE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689441#comment-13689441 ] Lars Hofhansl commented on HBASE-8753: -- May you not run into this case in isDeleted:
{code}
int ret = Bytes.compareTo(deleteBuffer, deleteOffset, deleteLength,
    buffer, qualifierOffset, qualifierLength);
if (ret == 0) {
  ...
} else if (ret > 0) {
  ...
} else {
  throw new IllegalStateException("isDelete failed: deleteBuffer=" ...
{code}
In any case, we should just test it: write an HFile with a new RS, start an old RS, scan that file, and check that it works fine. Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp --- Key: HBASE-8753 URL: https://issues.apache.org/jira/browse/HBASE-8753 Project: HBase Issue Type: New Feature Components: Deletes Reporter: Feng Honghua Attachments: HBASE-8753-0.94-V0.patch, HBASE-8753-trunk-V0.patch In one of our production scenarios (Xiaomi message search), multiple cells are put in batch using the same timestamp with different column names under a specific column-family. After some time these cells also need to be deleted in batch, given a specific timestamp. But the column names are parsed tokens which can be arbitrary words, so such a batch delete is impossible without first retrieving all KVs from that CF to get the list of columns which have a KV with the given timestamp, and then issuing an individual deleteColumn for each column in that list. Though it's possible to do such a batch delete, its performance is poor, and customers also find their code quite clumsy: first retrieving and populating the column list, then issuing a deleteColumn for each column in it. This feature resolves this problem by introducing a new delete flag: DeleteFamilyVersion.
1) When you need to delete all KVs under a column-family with a given timestamp, just call Delete.deleteFamilyVersion(cfName, timestamp); only a DeleteFamilyVersion type KV is put to HBase (like DeleteFamily / DeleteColumn / Delete) without a read operation;
2) Like other delete types, DeleteFamilyVersion takes effect in get/scan/flush/compact operations; the ScanDeleteTracker now parses out and uses DeleteFamilyVersion to prevent all KVs under the specific CF which have the same timestamp as the DeleteFamilyVersion KV from popping up as part of a get/scan result (and likewise in flush/compact).
Our customers find this feature efficient, clean and easy-to-use, since it does its work without knowing the exact list of column names that need to be deleted. This feature has been running smoothly for a couple of months in our production clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
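The DeleteFamilyVersion semantics described above can be sketched with a toy tracker. The names here are illustrative; the real logic lives in ScanDeleteTracker:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy sketch of DeleteFamilyVersion: one marker per (family, timestamp)
// suppresses every column in that family carrying exactly that timestamp.
public class FamilyVersionTracker {
    private final Map<String, Set<Long>> deleted = new HashMap<>(); // family -> deleted timestamps

    public void deleteFamilyVersion(String family, long ts) {
        deleted.computeIfAbsent(family, f -> new HashSet<>()).add(ts);
    }

    // Would this cell be hidden from a get/scan (and dropped by flush/compact)?
    public boolean isDeleted(String family, String qualifier, long ts) {
        Set<Long> tss = deleted.get(family);
        return tss != null && tss.contains(ts); // any qualifier, exact timestamp
    }
}
```

The point of the feature shows up in the signature: isDeleted never consults the qualifier, so arbitrary token columns are covered by a single marker.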
[jira] [Commented] (HBASE-8060) Num compacting KVs diverges from num compacted KVs over time
[ https://issues.apache.org/jira/browse/HBASE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689442#comment-13689442 ] Lars Hofhansl commented on HBASE-8060: -- I see. So it does not really change the meaning - it's still the number of KVs compacted in the last compaction - it just corrects the value. Num compacting KVs diverges from num compacted KVs over time Key: HBASE-8060 URL: https://issues.apache.org/jira/browse/HBASE-8060 Project: HBase Issue Type: Bug Components: Compaction, UI Affects Versions: 0.94.6, 0.95.0, 0.95.2 Reporter: Andrew Purtell Assignee: Sergey Shelukhin Attachments: HBASE-8060-v0.patch, screenshot.png I have been running what amounts to an ingestion test for a day or so. This is an all-in-one cluster launched with './bin/hbase master start' from sources. In the RS stats on the master UI, the num compacting KVs has diverged from num compacted KVs even though compaction has been completed from the perspective of selection and no compaction tasks are running on the RS. I think this could be confusing -- is compaction happening or not? Or maybe I'm misunderstanding what this is supposed to show? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
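The reconciliation discussed in this thread can be sketched as follows. This is a simplified stand-in, not the real org.apache.hadoop.hbase.regionserver.compactions.CompactionProgress:

```java
// Sketch: "total compacting KVs" is an estimate taken when a compaction
// starts; when it finishes, snap the estimate to the KVs actually seen so the
// two UI counters cannot drift apart.
public class CompactionProgress {
    private long totalCompactingKVs;  // estimate for the last-started compaction
    private long currentCompactedKVs; // KVs processed so far

    public void start(long estimatedKVs) {
        totalCompactingKVs = estimatedKVs;
        currentCompactedKVs = 0;
    }

    public void progress(long kvs) { currentCompactedKVs += kvs; }

    public void complete() {
        // The estimate is often wrong (e.g. deletes, expired cells): reconcile.
        totalCompactingKVs = currentCompactedKVs;
    }

    public long getTotalCompactingKVs() { return totalCompactingKVs; }
    public long getCurrentCompactedKVs() { return currentCompactedKVs; }
}
```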
[jira] [Commented] (HBASE-8771) ensure replication_scope's value is either local(0) or global(1)
[ https://issues.apache.org/jira/browse/HBASE-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689449#comment-13689449 ] Demai Ni commented on HBASE-8771: - [~anoop.hbase], thanks for your comments. Since we plan to use values other than 0 or 1 in the future, it may be better to block other values now to avoid conflicts later. For example, today userA can use the values -1 and 2 for the scope, and the hbase code will treat them as '1' (global replication). Then, if a future JIRA gives the value '2' another meaning, userA will face unexpected replication behavior. With that, it is better to block values such as -1 and 2 earlier to reduce such potential issues. ensure replication_scope's value is either local(0) or global(1) Key: HBASE-8771 URL: https://issues.apache.org/jira/browse/HBASE-8771 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.8 Reporter: Demai Ni Priority: Minor Fix For: 0.94.9 Attachments: HBASE-8771-0.94.8-v0.patch For replication_scope, only two values are meaningful:
{code}
public static final int REPLICATION_SCOPE_LOCAL = 0;
public static final int REPLICATION_SCOPE_GLOBAL = 1;
{code}
However, there is no checking for that, so currently a user can set it to any integer value, and all non-zero values will be treated as 1(GLOBAL). This jira is to add a check in HColumnDescriptor#setScope() so that only 0 and 1 will be accepted during create_table or alter_table. In the future, we can leverage replication_scope to store more info. For example: -1: a columnfam is replicated from another cluster in a MASTER_SLAVE setup (i.e. readonly); 2: a columnfam is set MASTER_MASTER. Probably a major improvement JIRA is needed for the future usage. It will be good to ensure the scope value at this moment.
{code:title=Testing|borderStyle=solid}
hbase(main):002:0> create 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>2}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
hbase(main):004:0> alter 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>-1}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689454#comment-13689454 ] Hamed Madani commented on HBASE-8774: - 'true' is there because the rest of the thrift and thrift2 generated code uses that format. For example
{code}
boolean this_present_columns = true && this.isSetColumns();
boolean that_present_columns = true && that.isSetColumns();
{code}
The HBASE-6073 patch is missing the modification to thrift2.generated.TScan.java and thrift2.generated.TGet.java. Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: New Feature Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689468#comment-13689468 ] Francis Liu commented on HBASE-8015: Oops sorry I guess I'm talking about two scripts. One to check if some surprising migration needs to be done and provide links/options. And another that does the actual migration. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689480#comment-13689480 ] Jean-Marc Spaggiari commented on HBASE-6295:
||Test||Trunk||Nic||0.95||
|org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest|761449.8|738362.4|754100|
|org.apache.hadoop.hbase.PerformanceEvaluation$RandomScanWithRange100Test|21858.7|22356.7|22400.7|
|org.apache.hadoop.hbase.PerformanceEvaluation$RandomSeekScanTest|13.6|138179.3|134186.7|
|org.apache.hadoop.hbase.PerformanceEvaluation$RandomWriteTest|114272.9|76990.3|114798.1|
|org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest|77144.275|24582.425|79107.25|
So Trunk and 0.95 are consistent, while Nic's version shows a nice improvement on the write operations (both Random and Sequential), a very small degradation on SeekScan, and also a small improvement on RandomRead. Do you need the IntegrationTestBigLinkedList results for the 3 releases too? Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today the batch algo is:
{noformat}
for Operation o: List<Op> {
  add o to todolist
  if todolist >= maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
We could:
- create immediately the final object instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location
It would be:
{noformat}
for Operation o: List<Op> {
  get location
  add o to location.todolist
  if (location.todolist > maxLocationSize)
    send location.todolist to region server
    clear location.todolist
  // don't wait, continue the loop
}
send remaining
wait
{noformat}
It's not trivial to write once you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
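The per-location variant from the {noformat} pseudocode could look like this in plain Java. It is illustrative only, not the actual HBase client code, and it omits the error management the comment warns about:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Sketch of per-location batching: operations are grouped by region location
// as they arrive, and a location's buffer is sent as soon as it is full,
// instead of splitting one big list at the end.
public class PerLocationBatcher<Op> {
    private final int maxPerLocation;
    private final BiConsumer<String, List<Op>> sender; // location -> batch
    private final Map<String, List<Op>> buffers = new HashMap<>();

    public PerLocationBatcher(int maxPerLocation, BiConsumer<String, List<Op>> sender) {
        this.maxPerLocation = maxPerLocation;
        this.sender = sender;
    }

    public void add(String location, Op op) {
        List<Op> buf = buffers.computeIfAbsent(location, l -> new ArrayList<>());
        buf.add(op);
        if (buf.size() >= maxPerLocation) {   // send early; don't wait for the end
            sender.accept(location, new ArrayList<>(buf));
            buf.clear();
        }
    }

    public void flushRemaining() {            // the final "send remaining"
        buffers.forEach((loc, buf) -> {
            if (!buf.isEmpty()) sender.accept(loc, new ArrayList<>(buf));
        });
        buffers.clear();
    }
}
```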
[jira] [Created] (HBASE-8775) Throttle online schema changes.
Shane Hogan created HBASE-8775: -- Summary: Throttle online schema changes. Key: HBASE-8775 URL: https://issues.apache.org/jira/browse/HBASE-8775 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.89-fb Reporter: Shane Hogan Priority: Minor Fix For: 0.89-fb Throttle the open and close of the regions after an online schema change -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689492#comment-13689492 ] Jean-Marc Spaggiari commented on HBASE-8755: Sure! Let me prepare that. I will read this JIRA from the beginning and try to start the tests today. A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch In the current write model, each write handler thread (executing put()) individually goes through a full 'append (hlog local buffer) => HLog writer append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy race conditions on updateLock and flushLock. The only optimization -- checking the current syncTillHere txid in the expectation that another thread will help write/sync its own txid to hdfs, and omitting the write/sync -- actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X improvement in throughput (from 17000 to 7+). I applied this new write thread model to HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row-size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput then was 31002). I can provide the detailed performance test results if anyone is interested.
The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer; (each notifies the AsyncWriter thread that there are new edits in the local buffer)
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid;
3. A single AsyncWriter thread is responsible for retrieving all the buffered edits in HLog's local pending buffer and writing them to hdfs (hlog.writer.append); (it notifies the AsyncFlusher thread that there are new writes to hdfs that need a sync)
4. A single AsyncFlusher thread is responsible for issuing a sync to hdfs to persist the writes made by the AsyncWriter; (it notifies the AsyncNotifier thread that the sync watermark has increased)
5. A single AsyncNotifier thread is responsible for notifying all pending put handler threads which are waiting in the HLog.syncer() function;
6. No LogSyncer thread any more (since the AsyncWriter/AsyncFlusher threads always do the same job it did)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
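A heavily simplified sketch of the model described in points 1-5, collapsing AsyncWriter/AsyncFlusher/AsyncNotifier into one background thread. Everything here is illustrative, not the patch:

```java
import java.util.ArrayList;
import java.util.List;

// Handlers append edits to a local pending buffer and wait on a sync
// watermark; a single background thread drains the buffer, "writes and
// syncs" it, and notifies waiters.
public class MiniAsyncLog {
    private final List<String> pending = new ArrayList<>(); // local edit buffer
    private final List<String> durable = new ArrayList<>(); // "on hdfs and synced"
    private long lastTxid = 0;   // txid handed to the most recent append
    private long syncedTxid = 0; // watermark: all txids <= this are durable
    private volatile boolean running = true;

    public synchronized long append(String edit) {
        pending.add(edit);
        notifyAll();             // wake the writer thread
        return ++lastTxid;
    }

    public synchronized void syncer(long txid) throws InterruptedException {
        while (syncedTxid < txid) wait(); // block until our edit is durable
    }

    public void runWriter() throws InterruptedException { // background thread body
        while (running) {
            List<String> batch;
            synchronized (this) {
                while (running && pending.isEmpty()) wait();
                if (!running) return;
                batch = new ArrayList<>(pending);
                pending.clear();
            }
            // (real code: writer.append(batch) then writer.sync())
            synchronized (this) {
                durable.addAll(batch);
                syncedTxid += batch.size(); // one txid per edit in this sketch
                notifyAll();                // wake handlers waiting in syncer()
            }
        }
    }

    public synchronized void shutdown() { running = false; notifyAll(); }
    public synchronized int durableCount() { return durable.size(); }

    // Self-contained demo: two appends, wait for the second txid to be synced.
    public static int demo() {
        MiniAsyncLog log = new MiniAsyncLog();
        Thread writer = new Thread(() -> {
            try { log.runWriter(); } catch (InterruptedException e) { }
        });
        writer.start();
        try {
            log.append("edit-1");
            long t2 = log.append("edit-2");
            log.syncer(t2);   // returns only once both edits are durable
            log.shutdown();
            writer.join();
        } catch (InterruptedException e) {
            return -1;
        }
        return log.durableCount();
    }
}
```

The payoff claimed above comes from batching: many handler threads block cheaply on the watermark while one thread does the append/sync work, instead of every handler contending for the write path.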
[jira] [Commented] (HBASE-8771) ensure replication_scope's value is either local(0) or global(1)
[ https://issues.apache.org/jira/browse/HBASE-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689489#comment-13689489 ] Demai Ni commented on HBASE-8771: - [~ctrezzo], another interesting finding while I was playing with different approaches (shell vs constructor). Using maxVersions as an example, which has code like the below to check the value:
{code}
if (maxVersions <= 0) {
  // TODO: Allow maxVersion of 0 to be the way you say "Keep all versions".
  // Until there is support, consider 0 or < 0 -- a configuration error.
  throw new IllegalArgumentException("Maximum versions must be positive");
}
{code}
The above code can catch the illegal arg only when the user calls the HColumnDescriptor constructor, but it won't work in the hbase shell or when calling setMaxVersions() directly.
{code:title=set Max Version = -1 in Shell. No Error thrown because the shell calls setMaxVersions directly}
hbase(main):016:0> create 't5_dn',{NAME=>'cf1',VERSIONS=>-1}
0 row(s) in 1.0420 seconds
hbase(main):017:0> put 't5_dn','row1','cf1:q1','row1cf1_v1'
0 row(s) in 0.0700 seconds
hbase(main):018:0> scan 't5_dn'
ROW COLUMN+CELL
0 row(s) in 0.0090 seconds
hbase(main):019:0> describe 't5_dn'
DESCRIPTION ENABLED
't5_dn', {NAME => 'cf1', VERSIONS => '-1',...}
{code}
{code:title=set Max Version = -999 through constructor. Error caught inside}
HTableDescriptor ht = new HTableDescriptor("t3_dn");
HColumnDescriptor cfd = new HColumnDescriptor(Bytes.toBytes("cf1"), -999, "NONE", false, false, 100, "NONE");
...
Exception in thread "main" java.lang.IllegalArgumentException: Maximum versions must be positive
  at org.apache.hadoop.hbase.HColumnDescriptor.<init>(HColumnDescriptor.java:386)
  at org.apache.hadoop.hbase.HColumnDescriptor.<init>(HColumnDescriptor.java:334)
  at org.apache.hadoop.hbase.HColumnDescriptor.<init>(HColumnDescriptor.java:302)
  at CreateTable_version.main(CreateTable_version.java:23)
{code}
ensure replication_scope's value is either local(0) or global(1) Key: HBASE-8771 URL: https://issues.apache.org/jira/browse/HBASE-8771 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.8 Reporter: Demai Ni Priority: Minor Fix For: 0.94.9 Attachments: HBASE-8771-0.94.8-v0.patch For replication_scope, only two values are meaningful:
{code}
public static final int REPLICATION_SCOPE_LOCAL = 0;
public static final int REPLICATION_SCOPE_GLOBAL = 1;
{code}
However, there is no checking for that, so currently a user can set it to any integer value, and all non-zero values will be treated as 1(GLOBAL). This jira is to add a check in HColumnDescriptor#setScope() so that only 0 and 1 will be accepted during create_table or alter_table. In the future, we can leverage replication_scope to store more info. For example: -1: a columnfam is replicated from another cluster in a MASTER_SLAVE setup (i.e. readonly); 2: a columnfam is set MASTER_MASTER. Probably a major improvement JIRA is needed for the future usage. It will be good to ensure the scope value at this moment.
{code:title=Testing|borderStyle=solid}
hbase(main):002:0> create 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>2}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
hbase(main):004:0> alter 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>-1}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
{code}
-- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
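The gap Demai and Chris discuss above -- a range check that fires in the constructor but not in the shell or via the setter -- can be sketched in a few lines. This is an illustrative model, not HBase's actual HColumnDescriptor: routing the constructor through the setters makes both the maxVersions check and the proposed replication-scope check apply on every code path. Class and method names here are hypothetical.

```java
// Minimal sketch (not HBase code): constructor delegates to setters so the
// validation runs whether the descriptor is built directly, from the shell,
// or via a later setter call.
public class ColumnFamilySketch {
    public static final int REPLICATION_SCOPE_LOCAL = 0;
    public static final int REPLICATION_SCOPE_GLOBAL = 1;

    private int maxVersions;
    private int scope;

    public ColumnFamilySketch(int maxVersions, int scope) {
        // Delegating means there is a single place where each rule lives.
        setMaxVersions(maxVersions);
        setScope(scope);
    }

    public void setMaxVersions(int maxVersions) {
        if (maxVersions <= 0) {
            throw new IllegalArgumentException("Maximum versions must be positive");
        }
        this.maxVersions = maxVersions;
    }

    public void setScope(int scope) {
        if (scope != REPLICATION_SCOPE_LOCAL && scope != REPLICATION_SCOPE_GLOBAL) {
            throw new IllegalArgumentException(
                "Replication Scope must be either 0(local) or 1(global)");
        }
        this.scope = scope;
    }

    public int getMaxVersions() { return maxVersions; }
    public int getScope() { return scope; }
}
```

With this shape, the shell path (which calls setters directly) and the constructor path reject VERSIONS => -1 and REPLICATION_SCOPE => 2 identically.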
[jira] [Commented] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689495#comment-13689495 ] Sergey Shelukhin commented on HBASE-8627: - lgtm HBCK can not fix meta not assigned issue Key: HBASE-8627 URL: https://issues.apache.org/jira/browse/HBASE-8627 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-8627_Trunk.patch, HBASE-8627_Trunk-V2.patch When the meta table region is not assigned to any RS, an HBCK run will get an exception. I can see code added in checkMetaRegion() to solve this issue but it won't work. It still refers to the ROOT region! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689501#comment-13689501 ] Sergey Shelukhin commented on HBASE-8015: - {code}exceptionNS.add(tableName.getNamespaceAsString()); {code} What is the current thinking on dots in namespaces and names? Presumably one table could prevent the creation of multiple namespaces if dots are allowed in namespace name, which I thought they are Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-8662: --- Status: Open (was: Patch Available) [rest] support impersonation Key: HBASE-8662 URL: https://issues.apache.org/jira/browse/HBASE-8662 Project: HBase Issue Type: Sub-task Components: REST, security Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0 Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-8721. --- Resolution: Won't Fix bq. It is unlikely that this will be changed as you have to find committers to +1 this. All we got up to this point are a -1 unless it is configurable and a couple of -0s. Agreed, resolved as WONTFIX. Interested parties are encouraged to go to the followups HBASE-8763 and HBASE-8770 Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims for the bug mentioned in http://hbase.apache.org/book.html 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even if it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do delete and put immediately after each other, and there is some chance they happen within the same millisecond. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
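The masking behavior HBASE-8721 describes can be reduced to a toy model: a delete tombstone at time T hides every put with a timestamp <= T, regardless of wall-clock ordering, until a major compaction collects the tombstone. This is a simplified sketch of the semantics as stated in the book excerpt, not HBase's actual read path.

```java
// Toy model of HBase delete-tombstone visibility (illustrative only):
// a DeleteFamily marker at deleteTs masks any put with putTs <= deleteTs,
// even if the put was issued *after* the delete.
public class TombstoneModel {
    // Visibility of a put while the tombstone is still present.
    public static boolean isVisible(long putTs, long deleteTs) {
        return putTs > deleteTs;
    }

    // After a major compaction the tombstone is purged, so the put shows
    // again (this is the "starts working again" effect from the book).
    public static boolean isVisibleAfterMajorCompaction(long putTs) {
        return true;
    }
}
```

The same-millisecond race in the description is exactly the putTs == deleteTs case, which the model (like HBase) treats as masked.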
[jira] [Updated] (HBASE-3149) Make flush decisions per column family
[ https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-3149: --- Assignee: (was: Himanshu Vashishtha) I figured that it increases MTTR time. I will probably look into it after we have fixed the MTTR issues of late. Un-assigning it in the meanwhile. Make flush decisions per column family -- Key: HBASE-3149 URL: https://issues.apache.org/jira/browse/HBASE-3149 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Karthik Ranganathan Priority: Critical Fix For: 0.92.3 Today, the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689512#comment-13689512 ] stack commented on HBASE-8015: -- [~toffer] There will be a migration evaluation script that will look for the presence of stuff like hfile v1s -- they must be compacted away before you can upgrade -- and this same step could check table names and, if any w/ a dot are found, list options. This script would be run against a 0.94 install before shutting down for upgrade (Yes, two scripts: a checker, and then a doer). Francis, we should still do the Elliott suggestion even if dot, right? The dot would be for 'external' tools or a useful facility in the shell, but we want namespaces to be first class in the API too. Did you get my review comments up on rb, Francis? On dots in namespace, no, if it simplifies. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8759) Family Delete Markers not getting purged after major compaction
[ https://issues.apache.org/jira/browse/HBASE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689513#comment-13689513 ] James Taylor commented on HBASE-8759: - Thanks, @larsh. Now aren't you supposed to be on vacation! Go drink a liter or two of beer! :-) Family Delete Markers not getting purged after major compaction --- Key: HBASE-8759 URL: https://issues.apache.org/jira/browse/HBASE-8759 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.94.7 Reporter: Mujtaba Chohan Priority: Minor On a table with VERSIONS => '1', KEEP_DELETED_CELLS => 'true', Family Delete Markers do not get purged after a put/delete/major compaction cycle (they keep on incrementing after every put/delete/major compaction). Following is the raw scan output after 10 iterations of put/delete/major compaction. ROW COLUMN+CELL A column=CF:, timestamp=1371512706683, type=DeleteFamily A column=CF:, timestamp=1371512706394, type=DeleteFamily A column=CF:, timestamp=1371512706054, type=DeleteFamily A column=CF:, timestamp=1371512705763, type=DeleteFamily A column=CF:, timestamp=1371512705457, type=DeleteFamily A column=CF:, timestamp=1371512705149, type=DeleteFamily A column=CF:, timestamp=1371512704836, type=DeleteFamily A column=CF:, timestamp=1371512704518, type=DeleteFamily A column=CF:, timestamp=1371512704162, type=DeleteFamily A column=CF:, timestamp=1371512703779, type=DeleteFamily A column=CF:COL, timestamp=1371512706682, value=X [~lhofhansl] Code to repro this issue: http://phoenix-bin.github.io/client/code/delete.java -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3149) Make flush decisions per column family
[ https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3149: - Fix Version/s: (was: 0.92.3) Make flush decisions per column family -- Key: HBASE-3149 URL: https://issues.apache.org/jira/browse/HBASE-3149 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Karthik Ranganathan Priority: Critical Today, the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689519#comment-13689519 ] Nicolas Liochon commented on HBASE-6295: Can I do 2097152 / 79 = 26500 to compare with the performance tests previously described in http://www.spaggiari.org/media/blogs/hbase/pictures/performances_20130321.pdf? Because the performances were better previously (~35k rows / second). Same for 2097152 / 114 = 18396 vs. ~30k Or is it calculated differently? Anyway, thanks a lot for all these great tests. I will commit tomorrow morning my time if there is no objection. Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today the batch algo is: {noformat} for Operation o: List<Op> { add o to todolist if todolist >= maxsize or o last in list split todolist per location send split lists to region servers clear todolist wait } {noformat} We could: - create immediately the final object instead of an intermediate array - split per location immediately - instead of sending when the list as a whole is full, send it when there is enough data for a single location It would be: {noformat} for Operation o: List<Op> { get location add o to location.todolist if (location.todolist >= maxLocationSize) send location.todolist to region server clear location.todolist // don't wait, continue the loop } send remaining wait {noformat} It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. 
It's interesting mainly for 'big' writes -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
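The proposed per-location batching from the pseudocode above can be sketched concretely. This is an illustrative model, not the actual HBase client (class and method names here are invented): operations are grouped by region location as they arrive, and a location's list is flushed as soon as it fills, instead of splitting one big list at the end.

```java
import java.util.*;
import java.util.function.Function;

// Sketch of the per-location batching idea: keep one todolist per region
// location, flush a location the moment it is full, never block the loop.
public class PerLocationBatcher {
    private final Map<String, List<String>> perLocation = new HashMap<>();
    private final List<List<String>> sentBatches = new ArrayList<>();
    private final int maxLocationSize;
    private final Function<String, String> locator; // operation -> location

    public PerLocationBatcher(int maxLocationSize, Function<String, String> locator) {
        this.maxLocationSize = maxLocationSize;
        this.locator = locator;
    }

    public void add(String op) {
        String loc = locator.apply(op);                       // get location
        List<String> todo =
            perLocation.computeIfAbsent(loc, k -> new ArrayList<>());
        todo.add(op);                                         // add to location.todolist
        if (todo.size() >= maxLocationSize) {
            sentBatches.add(new ArrayList<>(todo));           // "send to region server"
            todo.clear();                                     // don't wait, keep looping
        }
    }

    public void flushRemaining() {                            // "send remaining"
        for (List<String> todo : perLocation.values()) {
            if (!todo.isEmpty()) sentBatches.add(new ArrayList<>(todo));
        }
        perLocation.clear();
    }

    public int batchesSent() { return sentBatches.size(); }
}
```

Error management (sharing a retry list with newly added operations) is deliberately omitted; as the comment notes, that is the hard part.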
[jira] [Commented] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689521#comment-13689521 ] Jimmy Xiang commented on HBASE-8662: The AuthFilter in your patch looks very familiar. Got it from Oozie? [rest] support impersonation Key: HBASE-8662 URL: https://issues.apache.org/jira/browse/HBASE-8662 Project: HBase Issue Type: Sub-task Components: REST, security Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0 Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689521#comment-13689521 ] Jimmy Xiang edited comment on HBASE-8662 at 6/20/13 6:59 PM: - The AuthFilter in your patch looks very familiar. Got it from Oozie? Do we need the optionsServlet? was (Author: jxiang): The AuthFilter in your patch looks very familiar. Got it from Oozie? [rest] support impersonation Key: HBASE-8662 URL: https://issues.apache.org/jira/browse/HBASE-8662 Project: HBase Issue Type: Sub-task Components: REST, security Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0 Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689578#comment-13689578 ] Jean-Marc Spaggiari commented on HBASE-6295: It's the time for x lines; depending on the test, it's not the same number of lines. For RandomReadTest you need to divide by 1048576. For RandomScanWithRange100Test you need to divide by 4096. For RandomSeekScanTest you need to divide by 40960. For RandomWriteTest you need to divide by 1048576. For SequentialWriteTest you need to divide by 1048576. This is the number of lines per ms. So multiply by 1000 to have the same result. Some are rows/minute, so just adjust that. So if you want to compare, here are the numbers in the same format as the PDF that I usually produce: ||Test||Trunk||Nic||0.95|| |org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest|1377.08|1420.14|1390.50| |org.apache.hadoop.hbase.PerformanceEvaluation$RandomScanWithRange100Test|11243.12|10992.68|10971.09| |org.apache.hadoop.hbase.PerformanceEvaluation$RandomSeekScanTest|304.66|296.43|305.25| |org.apache.hadoop.hbase.PerformanceEvaluation$RandomWriteTest|9176.07|13619.59|9134.09| |org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest|13592.40|42655.52|13255.12| I already noticed the RandomWriteTest impact compared to the 0.94 branch and 0.95... I will re-run the 0.94 tests to make sure, but overall, I really think 0.95 is not doing as well as 0.94 for the RandomWriteTest. 
Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today the batch algo is: {noformat} for Operation o: List<Op> { add o to todolist if todolist >= maxsize or o last in list split todolist per location send split lists to region servers clear todolist wait } {noformat} We could: - create immediately the final object instead of an intermediate array - split per location immediately - instead of sending when the list as a whole is full, send it when there is enough data for a single location It would be: {noformat} for Operation o: List<Op> { get location add o to location.todolist if (location.todolist >= maxLocationSize) send location.todolist to region server clear location.todolist // don't wait, continue the loop } send remaining wait {noformat} It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
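The conversion Jean-Marc and Nicolas are doing above -- a fixed row count divided by the reported elapsed seconds -- can be written out explicitly. The row counts are taken from the comments; reading the reported figure as elapsed seconds for a fixed number of rows is my interpretation of the exchange.

```java
// Convert a PerformanceEvaluation result into rows/second, per the recipe
// discussed above: throughput = rowCount / elapsedSeconds.
public class ThroughputConverter {
    public static long rowsPerSecond(long rowCount, double elapsedSeconds) {
        return Math.round(rowCount / elapsedSeconds);
    }
}
```

For example, 2097152 rows in 79 s is roughly the ~26.5k rows/second Nicolas quotes, and 2097152 rows in 114 s is roughly 18.4k.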
[jira] [Commented] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689623#comment-13689623 ] Francis Liu commented on HBASE-8662: Yep. OptionServlet, what for? [rest] support impersonation Key: HBASE-8662 URL: https://issues.apache.org/jira/browse/HBASE-8662 Project: HBase Issue Type: Sub-task Components: REST, security Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0 Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689631#comment-13689631 ] Francis Liu commented on HBASE-8015: {quote} Francis, we should still do the Elliott suggestion even if dot, right? The dot would be for 'external' tools or a useful facility in shell but we want namespaces to be first class in API too. {quote} The approach I proposed earlier would avoid having to do all the api stuff as part of the first namespace checkin, as well as make use of '.' as a delimiter. The surprises are as I mentioned. We can incrementally add the apis. Sounds like we are going with overloading all the existing apis to take a namespace parameter. If so, what would be the behavior when using the old api? Will it always reference the default namespace, or will we support fully qualified table names? For some reason I'm not getting any jira or RB emails. Will take a look. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689717#comment-13689717 ] Sergey Shelukhin commented on HBASE-8721: - btw, HBase does support point version deletes as far as I see. So a specific version can be deleted if desired. Should we add APIs to delete the latest version? We can even add an API to delete all existing versions; it won't be very efficient with many versions (scan or get + a bunch of deletes on the server side), but it will work without changing internals Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims for the bug mentioned in http://hbase.apache.org/book.html 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even if it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do delete and put immediately after each other, and there is some chance they happen within the same millisecond. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-1177) Delay when client is located on the same node as the regionserver
[ https://issues.apache.org/jira/browse/HBASE-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-1177. -- Resolution: Invalid Resolving as no longer valid. Looks like Nagle's, anyways. Delay when client is located on the same node as the regionserver - Key: HBASE-1177 URL: https://issues.apache.org/jira/browse/HBASE-1177 Project: HBase Issue Type: Bug Components: Performance Affects Versions: 0.19.0 Environment: Linux 2.6.25 x86_64 Reporter: Jonathan Gray Labels: noob Attachments: Contribution of getClosest to getRow time.jpg, Contribution of next to getRow time.jpg, Contribution of seekTo to getClosest time.jpg, Elapsed time of RowResults.readFields.jpg, getRow + round-trip vs # columns.jpg, getRow times.jpg, ReadDelayTest.java, RowResults.readFields zoomed.jpg, screenshot-1.jpg, screenshot-2.jpg, screenshot-3.jpg, screenshot-4.jpg, zoom of columns vs round-trip blowup.jpg During testing of HBASE-80, we uncovered a strange 40ms delay for random reads. We ran a series of tests and found that it only happens when the client is on the same node as the RS and for a certain range of payloads (not specifically related to number of columns or size of them, only total payload). It appears to be precisely 40ms every time. Unsure if this is particular to our architecture, but it does happen on all nodes we've tried. Issue completely goes away with very large payloads or moving the client. Will post a test program tomorrow if anyone can test on a different architecture. Making a blocker for 0.20. Since this happens when you have an MR task running local to the RS, and this is what we try to do, might also consider making this a blocker for 0.19.1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4755) HBase based block placement in DFS
[ https://issues.apache.org/jira/browse/HBASE-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689754#comment-13689754 ] Devaraj Das commented on HBASE-4755: [~jiangbinglover], yes HBase would need to periodically refresh the mappings, and also when compactions happen, the data would be rewritten in the three current nodes. I need to implement the balancer in FavoredNodeLoadBalancer (balanceCluster method). I should have something shortly. HBase based block placement in DFS -- Key: HBASE-4755 URL: https://issues.apache.org/jira/browse/HBASE-4755 Project: HBase Issue Type: New Feature Affects Versions: 0.94.0 Reporter: Karthik Ranganathan Assignee: Christopher Gist Priority: Critical Attachments: 4755-wip-1.patch, hbase-4755-notes.txt The feature as-is is only useful for HBase clusters that care about data locality on regionservers, but this feature can also enable a lot of nice features down the road. The basic idea is as follows: instead of letting HDFS determine where to replicate data (r=3) by placing blocks on various regions, it is better to let HBase do so by providing hints to HDFS through the DFS client. That way instead of replicating data at a block level, we can replicate data at a per-region level (each region owned by a primary, a secondary and a tertiary regionserver). 
This is better for 2 things: - Can make region failover faster on clusters which benefit from data affinity - On large clusters with a random block placement policy, this helps reduce the probability of data loss The algo is as follows: - Each region in META will have 3 columns which are the preferred regionservers for that region (primary, secondary and tertiary) - Preferred assignment can be controlled by a config knob - Upon cluster start, HMaster will enter a mapping from each region to 3 regionservers (random hash, could use current locality, etc) - The load balancer would assign out regions preferring region assignments to primary over secondary over tertiary over any other node - Periodically (say weekly, configurable) the HMaster would run a locality check and make sure the map it has for region to regionservers is optimal. Down the road, this can be enhanced to control region placement in the following cases: - Mixed hardware SKU where some regionservers can hold fewer regions - Load balancing across tables where we don't want multiple regions of a table to get assigned to the same regionservers - Multi-tenancy, where we can restrict the assignment of the regions of some table to a subset of regionservers, so an abusive app cannot take down the whole HBase cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
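The "random hash" mapping step from the algorithm above can be sketched as follows. This is purely illustrative and not the FavoredNodeLoadBalancer implementation: it derives a stable primary/secondary/tertiary triple for a region by hashing the region name into the server list, so the same region always maps to the same three distinct servers.

```java
import java.util.*;

// Hypothetical sketch of the HMaster's region -> 3 regionservers mapping
// (primary, secondary, tertiary) via a hash of the region name.
public class FavoredNodesSketch {
    public static List<String> favoredNodes(String regionName, List<String> servers) {
        if (servers.size() < 3) {
            throw new IllegalArgumentException("need at least 3 regionservers");
        }
        // Stable starting slot from the region name.
        int h = Math.floorMod(regionName.hashCode(), servers.size());
        // Three distinct consecutive slots, wrapping around the list.
        return Arrays.asList(
            servers.get(h),
            servers.get((h + 1) % servers.size()),
            servers.get((h + 2) % servers.size()));
    }
}
```

A real balancer would also fold in current locality and rack awareness, as the comment thread suggests; the point here is only the deterministic, per-region triple.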
[jira] [Created] (HBASE-8776) port HBASE-8723 to 0.94
Sergey Shelukhin created HBASE-8776: --- Summary: port HBASE-8723 to 0.94 Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-8776: Fix Version/s: 0.94.9 port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.94.9 Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-8776: Attachment: HBASE-8776-v0.patch I am increasing retry count less aggressively than original; this should be more than enough to ride over server failure given the default negotiated ZK timeout of 40s. port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-8776: Status: Patch Available (was: Open) port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-8776: Affects Version/s: 0.94.8 port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689781#comment-13689781 ] Sergey Shelukhin commented on HBASE-8776: - [~lhofhansl] are you ok with this change to client retries? port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.94.9 Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689783#comment-13689783 ] Hadoop QA commented on HBASE-8776: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12588947/HBASE-8776-v0.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6086//console This message is automatically generated. port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.94.9 Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8777) HBase client should determine the destination server after retry time
Sergey Shelukhin created HBASE-8777: --- Summary: HBase client should determine the destination server after retry time Key: HBASE-8777 URL: https://issues.apache.org/jira/browse/HBASE-8777 Project: HBase Issue Type: Improvement Components: Client Reporter: Sergey Shelukhin HBase currently determines which server to go to, then creates delayed callable with pre-determined server and goes there. For later 16-32-... second retries this approach is suboptimal, the cluster could have seen massive changes in the meantime, so retry might be completely useless. We should re-locate regions after the delay, at least for longer retries. Given how grouping is currently done it would be a bit of a refactoring. The effect of this is alleviated (to a degree) on trunk by server-based retries (if we fail going to the pre-delay server after delay and then determine the server has changed, we will go to the new server immediately, so we only lose the failed round-trip time); on 94, if the region is opened on some other server during the delay, we'd go to the old one, fail, then find out it's on different server, wait a bunch more time because it's a late-stage retry and THEN go to the new one, as far as I see.
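The proposed change can be sketched as follows. This is a minimal illustration, not the real 0.94 client code: `LocationResolver`, `locateRegion`, and `RelocatingRetry` are hypothetical names standing in for the client internals.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for the client's region lookup.
interface LocationResolver {
    String locateRegion(byte[] row);
}

// Retry task that waits out the backoff *first*, then resolves the
// region location, so a region that moved during the delay is found
// immediately instead of after one more failed round trip.
class RelocatingRetry implements Callable<String> {
    private final LocationResolver resolver;
    private final byte[] row;
    private final long delayMs;

    RelocatingRetry(LocationResolver resolver, byte[] row, long delayMs) {
        this.resolver = resolver;
        this.row = row;
        this.delayMs = delayMs;
    }

    @Override
    public String call() throws Exception {
        Thread.sleep(delayMs);              // back off first...
        return resolver.locateRegion(row);  // ...then pick the (possibly new) server
    }
}
```

The point of the ordering is exactly the one made in the description: any work done before the sleep is working with stale cluster state.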
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689801#comment-13689801 ] Lars Hofhansl commented on HBASE-8776: -- Don't the current defaults already add up to 47? 1+1+1+2+2+4+4+8+8+16 = 47. 10 seems good enough, unless I am missing something. Will check the original jira tomorrow.
[jira] [Resolved] (HBASE-6620) Test org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor.testPoolBehavior flaps in autobuilds.
[ https://issues.apache.org/jira/browse/HBASE-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-6620. -- Resolution: Cannot Reproduce We have not seen this in a long time. Closing. Can open a new one if we see it again. Test org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor.testPoolBehavior flaps in autobuilds. --- Key: HBASE-6620 URL: https://issues.apache.org/jira/browse/HBASE-6620 Project: HBase Issue Type: Bug Components: Client Reporter: Sameer Vaishampayan Test flaps in autobuilds with assertion failure. org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor.testPoolBehavior Failing for the past 1 build (Since #2602 ) Took 3 ms. Error Message expected:<3> but was:<4> Stacktrace java.lang.AssertionError: expected:<3> but was:<4> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hbase.client.TestFromClientSide.testPoolBehavior(TestFromClientSide.java:4334) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:24) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662)
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689812#comment-13689812 ] Enis Soztutar commented on HBASE-8015: -- One problem with option 4 is that we want to pay the price of migration only once, between 0.94 and 0.96. If we do that, then it means we have to carry the exception-tables code in all the releases going forward. Option 1 is better than this, I think? Note that surprise #1 also applies here as well. bq. Sounds like we are going with overloading all the existing apis to take a namespace parameter. If so what would be the behavior when using the old api? Will it always reference default namespace or will we support fully qualified table names? It should use the default ns. I think the idea is that there will not be a public-facing thing called a fully qualified table name in Elliot's approach. Although internally we will need one, hence my tendency to go with option 2 over 3 (see my comment above): namespace,table seems good enough for me. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf
[jira] [Commented] (HBASE-8777) HBase client should determine the destination server after retry time
[ https://issues.apache.org/jira/browse/HBASE-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689819#comment-13689819 ] Nicolas Liochon commented on HBASE-8777: It's actually implemented this way in 6295.
[jira] [Updated] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
[ https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Latham updated HBASE-8778: --- Attachment: HBASE-8778-0.94.5.patch One solution is to instead keep the table descriptor files in a subdirectory of the table directory so that only that subdirectory needs a scan. The attached patch is one from 0.94.5 that implements this scheme. In order to be applicable in a rolling restart scenario, the new descriptor is written to both the table directory and the subdirectory. Readers first read the subdirectory, then fall back to the table directory. In order to be robust against failures or races, a lock file is used in the subdirectory during writes. The patch also refactors the FSTableDescriptors class to require a Configuration (to determine lock wait duration) as well as updates so that it more uniformly enforces the fsreadonly flag (RegionServers never do writes) and sticks with using instance methods rather than static methods. We are proceeding with this and hope to roll it out to our cluster. Once the writers (HBase Master, tools like hbck, merge, compact) are upgraded to this patch, old writers should not be used. I would love to hear the opinion of the HBase community regarding this issue. Some questions:
- Is it worth fixing? (I strongly believe so as it has a big impact on MTTR for large clusters)
- What's the best approach to fixing? Some other possibilities:
  - Using a lock file and well known table descriptor file rather than sequence ids
  - Relying on more descriptor caching rather than hitting hdfs on every region assignment (as bulk assignment already does)
  - Move table descriptors to a different location in hdfs (single location for all tables?)
  - Move table descriptors out of hdfs to ZK
- How and when can we migrate to that approach?
- For the patch above, once the cluster has been upgraded and the location of the descriptor files updated to have a copy in the subdirectory, it would be easy to have the next version use only those files.
- Alternatively, for the singularity there could be a one-time piece of migration code that just moves the files there.
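The sequence-id selection that readers of the descriptor files rely on can be sketched as below. This is an illustration only: the zero-padded `.tableinfo.` filename format is an assumption based on the scheme described in this issue, and `currentDescriptor` is a made-up name, not the real FSTableDescriptors API.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Optional;

public class TableInfoFiles {
    static final String TABLEINFO_PREFIX = ".tableinfo.";

    // Parse the sequence number out of a name like ".tableinfo.0000000003".
    static long sequenceOf(String fileName) {
        return Long.parseLong(fileName.substring(TABLEINFO_PREFIX.length()));
    }

    // Pick the descriptor file with the largest sequence number from a
    // directory listing; region directories and other entries are skipped.
    // With the patch, the listing scanned here is the small subdirectory,
    // not the full table directory.
    static Optional<String> currentDescriptor(String[] dirListing) {
        return Arrays.stream(dirListing)
                .filter(name -> name.startsWith(TABLEINFO_PREFIX))
                .max(Comparator.comparingLong(TableInfoFiles::sequenceOf));
    }
}
```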
[jira] [Created] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
Dave Latham created HBASE-8778: -- Summary: Region assigments scan table directory making them slow for huge tables Key: HBASE-8778 URL: https://issues.apache.org/jira/browse/HBASE-8778 Project: HBase Issue Type: Improvement Reporter: Dave Latham Attachments: HBASE-8778-0.94.5.patch On a table with 130k regions it takes about 3 seconds for a region server to open a region once it has been assigned. Watching the threads for a region server running 0.94.5 that is opening many such regions shows the thread opening the region in code like this: {noformat} PRI IPC Server handler 4 on 60020 daemon prio=10 tid=0x2aaac07e9000 nid=0x6566 runnable [0x4c46d000] java.lang.Thread.State: RUNNABLE at java.lang.String.indexOf(String.java:1521) at java.net.URI$Parser.scan(URI.java:2912) at java.net.URI$Parser.parse(URI.java:3004) at java.net.URI.<init>(URI.java:736) at org.apache.hadoop.fs.Path.initialize(Path.java:145) at org.apache.hadoop.fs.Path.<init>(Path.java:126) at org.apache.hadoop.fs.Path.<init>(Path.java:50) at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:215) at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:252) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:311) at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:159) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:842) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:867) at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1168) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:269) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:255) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoModtime(FSTableDescriptors.java:368) at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:155) at 
org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126) at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2834) at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2807) at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426) {noformat} To open the region, the region server first loads the latest HTableDescriptor. Since HBASE-4553, HTableDescriptors are stored in the file system at /hbase/tableDir/.tableinfo.sequenceNum. The file with the largest sequenceNum is the current descriptor. This is done so that the current descriptor is updated atomically. However, since the filename is not known in advance, FSTableDescriptors has to do a FileSystem.listStatus operation, which has to list all files in the directory to find it. The directory also contains all the region directories, so in our case it has to load 130k FileStatus objects. Even using a globStatus matching function still transfers all the objects to the client before performing the pattern matching. Furthermore, HDFS uses a default of transferring 1000 directory entries in each RPC call, so it requires 130 roundtrips to the namenode to fetch all the directory entries. Consequently, reassigning all the regions of a table (or a constant fraction thereof) requires time proportional to the square of the number of regions. In our case, if a region server fails with 200 such regions, it takes 10+ minutes for them all to be reassigned, after the zk expiration and log splitting.
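The arithmetic in that description can be checked with a back-of-envelope sketch. The 1000-entries-per-RPC figure is the HDFS default mentioned above; everything else follows from ceiling division.

```java
public class AssignmentCost {
    // Ceiling division: namenode round trips needed to page through
    // a directory listing at `entriesPerRpc` entries per call.
    static long rpcsPerListing(long dirEntries, long entriesPerRpc) {
        return (dirEntries + entriesPerRpc - 1) / entriesPerRpc;
    }

    // Every region open lists the whole table directory (roughly one
    // entry per region), so reassigning all regions costs a number of
    // RPCs that grows quadratically with the region count.
    static long rpcsToAssign(long regions, long entriesPerRpc) {
        return regions * rpcsPerListing(regions, entriesPerRpc);
    }
}
```

For the 130k-region table above, each open costs 130 listing RPCs, and a full reassignment would issue on the order of 130k × 130 ≈ 17 million namenode round trips.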
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689856#comment-13689856 ] stack commented on HBASE-8015: -- bq. If so what would be the behavior when using the old api? Will it always reference default namespace or will we support fully qualified table names? I think the old API will be against default NS. The FQTN (Fully Qualified Table Name) would be an internal or something that could be passed to external tools (command-line, shell).
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689861#comment-13689861 ] Sergey Shelukhin commented on HBASE-8776: - there's only one 8, and 32. The problem is that we determine the server before delay, so recovery has to happen before the delay for last retry (I filed a JIRA for that). 1+1+1+2+2+4+4+8+16 = 39. Recovery after zk timeout is also not instant.
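The two sums in this thread are easy to check against the client's backoff table. The multipliers below are taken as described in this thread for the 0.94-era client (a single 8 before 16) — treat the exact table as an assumption, not a quote of HConstants.

```java
public class RetryBackoff {
    // Backoff multipliers, in units of hbase.client.pause, as
    // described in this thread (assumed; note only one 8 before 16).
    static final int[] RETRY_BACKOFF = {1, 1, 1, 2, 2, 4, 4, 8, 16, 32};

    // Total sleep across `sleeps` pauses; 10 attempts means 9 sleeps
    // between them. The last multiplier repeats if we run off the end.
    static long totalPause(int sleeps) {
        long total = 0;
        for (int i = 0; i < sleeps; i++) {
            total += RETRY_BACKOFF[Math.min(i, RETRY_BACKOFF.length - 1)];
        }
        return total;
    }
}
```

With a 1-second base pause, 9 sleeps give 1+1+1+2+2+4+4+8+16 = 39 seconds, matching Sergey's correction of the 47 figure.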
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689867#comment-13689867 ] Sergey Shelukhin commented on HBASE-6295: - Hmm, I just noticed this test removed usage of errorsByServer.calculateBackoffTime. Can it please be put back? I have to withdraw my +1... :( Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today batch algo is:
{noformat}
for Operation o : ListOp {
  add o to todolist
  if todolist > maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
We could:
- create immediately the final object instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location
It would be:
{noformat}
for Operation o : ListOp {
  get location
  add o to location.todolist
  if location.todolist > maxLocationSize
    send location.todolist to region server
    clear location.todolist
  // don't wait, continue the loop
}
send remaining
wait
{noformat}
It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes.
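The second pseudocode block above can be turned into a runnable sketch. Names here (`locate`, `sentBatches`, the string-encoded flush log) are illustrative bookkeeping, not the real client API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Per-location batching: flush a location's buffer as soon as it is
// full instead of waiting for the global todolist to fill up.
public class PerLocationBatcher<Op> {
    private final int maxPerLocation;
    private final Function<Op, String> locate;           // op -> server name
    private final List<String> sent = new ArrayList<>(); // records each flush
    private final Map<String, List<Op>> buffers = new HashMap<>();

    PerLocationBatcher(int maxPerLocation, Function<Op, String> locate) {
        this.maxPerLocation = maxPerLocation;
        this.locate = locate;
    }

    void add(Op op) {
        String loc = locate.apply(op);                   // get location
        List<Op> buf = buffers.computeIfAbsent(loc, k -> new ArrayList<>());
        buf.add(op);                                     // add o to location.todolist
        if (buf.size() >= maxPerLocation) {
            flush(loc);                                  // send without waiting
        }
    }

    void finish() {                                      // send remaining, then wait
        for (String loc : new ArrayList<>(buffers.keySet())) {
            flush(loc);
        }
    }

    private void flush(String loc) {
        List<Op> buf = buffers.remove(loc);
        if (buf != null && !buf.isEmpty()) {
            sent.add(loc + ":" + buf.size());            // stand-in for the RPC
        }
    }

    List<String> sentBatches() { return sent; }
}
```

The error-management caveat from the description still applies: retried operations would have to be merged back into the live per-location buffers.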
[jira] [Commented] (HBASE-8777) HBase client should determine the destination server after retry time
[ https://issues.apache.org/jira/browse/HBASE-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689869#comment-13689869 ] Sergey Shelukhin commented on HBASE-8777: - not in 94 though?
[jira] [Commented] (HBASE-8777) HBase client should determine the destination server after retry time
[ https://issues.apache.org/jira/browse/HBASE-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689875#comment-13689875 ] Nicolas Liochon commented on HBASE-8777: I won't dare backporting 6295 to 0.95 :-) But iirc in 0.94 we were doing the split after the sleep (it may have changed, I haven't looked for a while)