[jira] [Commented] (HBASE-8771) ensure replication_scope's value is either local(0) or global(1)
[ https://issues.apache.org/jira/browse/HBASE-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688904#comment-13688904 ] Chris Trezzo commented on HBASE-8771:

[~nidmhbase] forgive me if I am being slow, but I am still not quite sure I understand how it will still work. The setScope method is called in the HColumnDescriptor constructor, so any time you try to get a column descriptor for a column whose replication scope is 2, this seems like it will fail. Check out the REST interface for example: if I try to get an HTableDescriptor for an existing table that has a column with replication scope 2, the getTableDescriptor method will blow up with an IllegalArgumentException. Does that make sense, or am I missing something?

ensure replication_scope's value is either local(0) or global(1)

Key: HBASE-8771 URL: https://issues.apache.org/jira/browse/HBASE-8771 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.8 Reporter: Demai Ni Priority: Minor Fix For: 0.94.9 Attachments: HBASE-8771-0.94.8-v0.patch

For replication_scope, only two values are meaningful:
{code}
public static final int REPLICATION_SCOPE_LOCAL = 0;
public static final int REPLICATION_SCOPE_GLOBAL = 1;
{code}
However, there is no checking for that, so currently a user can set it to any integer value, and all non-zero values will be treated as 1 (GLOBAL). This jira is to add a check in HColumnDescriptor#setScope() so that only 0 and 1 will be accepted during create_table or alter_table. In the future, we can leverage replication_scope to store more info. For example: -1: a column family is replicated from another cluster in a MASTER_SLAVE setup (i.e. read-only); 2: a column family is set MASTER_MASTER. A major improvement JIRA is probably needed for that future usage. It will be good to ensure the scope value at this moment.
{code:title=Testing|borderStyle=solid}
hbase(main):002:0> create 't1_dn', {NAME => 'cf1', REPLICATION_SCOPE => 2}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
hbase(main):004:0> alter 't1_dn', {NAME => 'cf1', REPLICATION_SCOPE => -1}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
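The validation the patch proposes can be sketched as below. The class and method names here are illustrative, not the committed HBASE-8771 patch; only the two constants and the error message come from the issue text.

```java
// Hypothetical standalone sketch of the check added to HColumnDescriptor#setScope.
public class ScopeCheck {
    public static final int REPLICATION_SCOPE_LOCAL = 0;
    public static final int REPLICATION_SCOPE_GLOBAL = 1;

    // Reject any value other than 0 (local) or 1 (global).
    public static int checkScope(int scope) {
        if (scope != REPLICATION_SCOPE_LOCAL && scope != REPLICATION_SCOPE_GLOBAL) {
            throw new IllegalArgumentException(
                "Replication Scope must be either 0(local) or 1(global)");
        }
        return scope;
    }
}
```

This rejects exactly the shell inputs shown in the test transcript above (scope 2 and -1), which also illustrates Chris's concern: any code path that constructs a descriptor for a pre-existing column with scope 2 would now throw.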
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688933#comment-13688933 ] rajeshbabu commented on HBASE-8667:

[~stack] https://issues.apache.org/jira/secure/attachment/12587780/HBASE-8667_trunk.patch is the latest patch I have tested. I think you are reviewing https://issues.apache.org/jira/secure/attachment/12587092/HBASE-8667_Trunk-V2.patch. Sorry for the patch name, Stack; it should be something like HBASE-8667_trunk_v3.patch.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch

While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. I have two interfaces, 1) lo and 2) eth0, on my machine, and the default hostname interface is lo. I configured the master ipc address to the ip of the eth0 interface and started the master and regionserver on the same machine. 1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo. 2) Since the rpc client is not binding to any ip address, when the RS reports for duty at startup it gets registered with the eth0 ip address (but it should actually register localhost). Here are the RS logs: {code} 2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008 2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851 2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100 {code} Here are master logs: {code} 2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544 {code} Since master has wrong rpc server address of RS, META is not getting assigned. 
{code} 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false - org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627) at
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688936#comment-13688936 ] Feng Honghua commented on HBASE-8755:

[~zjushch] We ran the same tests as yours, and below are the results: 1). One YCSB client with 5/50/200 write threads respectively 2). One RS with 300 RPC handlers, 20 regions (5 data-nodes back-end HDFS running CDH 4.1.1) 3). row-size = 150 bytes

||threads ||row-count ||new-throughput ||new-latency ||old-throughput ||old-latency||
|5 |20 |3191 |1.551(ms) |3172 |1.561(ms)|
|50 |200 |23215 |2.131(ms) |7437 |6.693(ms)|
|200 |200 |35793 |5.450(ms) |10816 |18.312(ms)|

A). the difference is negligible with 5 YCSB client threads B). the new model still shows a 3X+ improvement over the old model with 50/200 threads. Can anybody else help run similar tests using the same test configuration as Chunhui?

A new write thread model for HLog to improve the overall HBase write throughput

Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch

In the current write model, each write handler thread (executing put()) will individually go through a full 'append (hlog local buffer) => HLog writer append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy contention on updateLock and flushLock. The only existing optimization, checking the current syncTillHere txid in the expectation that another thread has already written/synced its txid to hdfs so the write/sync can be omitted, actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X improvement in throughput (from 17000 to 7+).
I applied this new write thread model in HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row-size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput then was 31002). I can provide the detailed performance test results if anyone is interested. The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer (and notify the AsyncWriter thread that there are new edits in the local buffer)
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid
3. A single AsyncWriter thread is responsible for retrieving all the buffered edits from HLog's local pending buffer and writing them to hdfs (hlog.writer.append); it then notifies the AsyncFlusher thread that there are new writes to hdfs that need a sync
4. A single AsyncFlusher thread is responsible for issuing a sync to hdfs to persist the writes made by AsyncWriter (and notifying the AsyncNotifier thread that the sync watermark has increased)
5. A single AsyncNotifier thread is responsible for notifying all pending put handler threads waiting in the HLog.syncer() function
6. No LogSyncer thread any more (the AsyncWriter/AsyncFlusher threads already do the job it did)
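The handler/background-thread handoff described in steps 1-6 can be sketched as a small producer/consumer, collapsing the AsyncWriter/AsyncFlusher/AsyncNotifier trio into one background thread for brevity. All class and method names here are illustrative, not HBase's HLog code; the "write" and "sync" are stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: handlers append to a local buffer and block in an
// analogue of HLog.syncer(); one background thread drains the buffer
// ("append"), advances the synced watermark ("sync"), and wakes every
// waiter whose txid is covered ("notify").
public class MiniAsyncLog {
    private final List<String> pendingBuffer = new ArrayList<>();
    private long lastAssignedTxid = 0;
    private long syncedTxid = 0;

    public MiniAsyncLog() {
        Thread asyncWriter = new Thread(() -> {
            try {
                while (true) {
                    long txidToSync;
                    synchronized (this) {
                        while (pendingBuffer.isEmpty()) wait();
                        pendingBuffer.clear();          // stand-in for writer.append()
                        txidToSync = lastAssignedTxid;  // stand-in for writer.sync()
                        syncedTxid = txidToSync;        // advance the sync watermark
                        notifyAll();                    // the AsyncNotifier's job
                    }
                }
            } catch (InterruptedException e) { /* shut down */ }
        });
        asyncWriter.setDaemon(true);
        asyncWriter.start();
    }

    // Called by put handler threads: append an edit, then block until
    // the sync covering this edit's txid has completed.
    public void append(String edit) throws InterruptedException {
        long myTxid;
        synchronized (this) {
            pendingBuffer.add(edit);
            myTxid = ++lastAssignedTxid;
            notifyAll();                                // wake the writer thread
        }
        synchronized (this) {
            while (syncedTxid < myTxid) wait();         // HLog.syncer() analogue
        }
    }

    public synchronized long getSyncedTxid() { return syncedTxid; }
}
```

The key property the proposal relies on is visible even in this toy: many handlers can pile edits into the buffer while one sync is in flight, so the per-write lock contention of the old model becomes one batched write/sync per drain.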
[jira] [Commented] (HBASE-7667) Support stripe compaction
[ https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688945#comment-13688945 ] stack commented on HBASE-7667:

Rereading the design doc and how-to-use. They are very nice. Can go into the book. High-level, and I think you have suggested this yourself elsewhere, it'd be coolio if the user didn't have to choose between size and count -- if it'd just figure itself based off incoming load. I've seen a case where a compaction produces a zero-length file (all deletes), so would that mess w/ this invariant: "Compaction must produce at least one file (see HBASE-6059)." or "...No stripe can ever be left with 0 files..."? I almost asked a few questions you'd already answered above in my previous read of the doc (smile). How would region merge work? We'd just drop all files into L0? Sounds like we'd have to drop references if we are not to break snapshotting. You think this true? "stripe scheme uses a larger number of files than default to ensure all compactions are small, which can affect very wide scans." Any measure of how much? Should stripe be on by default? Or have it as experimental for now until we get more data? How-to-use doc is excellent (though too many configs). Will review patch again next.
Support stripe compaction - Key: HBASE-7667 URL: https://issues.apache.org/jira/browse/HBASE-7667 Project: HBase Issue Type: New Feature Components: Compaction Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: stripe-cdf.pdf, Stripe compaction perf evaluation.pdf, Stripe compaction perf evaluation.pdf, Stripe compaction perf evaluation.pdf, Stripe compactions.pdf, Stripe compactions.pdf, Stripe compactions.pdf, Stripe compactions.pdf, Using stripe compactions.pdf, Using stripe compactions.pdf, Using stripe compactions.pdf So I was thinking about having many regions as the way to make compactions more manageable, and writing the level db doc about how level db range overlap and data mixing breaks seqNum sorting, and discussing it with Jimmy, Matteo and Ted, and thinking about how to avoid Level DB I/O multiplication factor. And I suggest the following idea, let's call it stripe compactions. It's a mix between level db ideas and having many small regions. It allows us to have a subset of benefits of many regions (wrt reads and compactions) without many of the drawbacks (managing and current memstore/etc. limitation). It also doesn't break seqNum-based file sorting for any one key. It works like this. The region key space is separated into configurable number of fixed-boundary stripes (determined the first time we stripe the data, see below). All the data from memstores is written to normal files with all keys present (not striped), similar to L0 in LevelDb, or current files. Compaction policy does 3 types of compactions. First is L0 compaction, which takes all L0 files and breaks them down by stripe. It may be optimized by adding more small files from different stripes, but the main logical outcome is that there are no more L0 files and all data is striped. Second is exactly similar to current compaction, but compacting one single stripe. In future, nothing prevents us from applying compaction rules and compacting part of the stripe (e.g. 
similar to current policy with ratios and stuff, tiers, whatever), but for the first cut I'd argue let it major compact the entire stripe. Or just have the ratio and no more complexity. Finally, the third addresses the concern of the fixed boundaries causing stripes to be very unbalanced. It's exactly like the 2nd, except it takes 2+ adjacent stripes and writes the results out with different boundaries. There's a tradeoff here - if we always take 2 adjacent stripes, compactions will be smaller but rebalancing will take a ridiculous amount of I/O. If we take many stripes we are essentially getting into the epic-major-compaction problem again. Some heuristics will have to be in place. In general, if we initially let L0 grow before determining the stripes, we will get better boundaries. Also, unless unbalancing is really large we don't need to rebalance at all. Obviously this scheme (as well as level) is not applicable for all scenarios, e.g. if timestamp is your key it completely falls apart. The end result: - many small compactions that can be spread out in time. - reads still read from a small number of files (one stripe + L0). - region splits become marvelously simple (if we could move files between regions, no references would be needed). Main advantage over Level (for HBase)
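The core mapping behind the scheme, fixed key boundaries partitioning the row space so each striped file belongs to exactly one stripe and a read touches one stripe plus L0, can be sketched as follows. This is an illustration of the idea only, not HBase's stripe compaction code; String keys stand in for byte[] row keys.

```java
// Illustrative stripe lookup: N sorted boundaries define N+1 stripes.
// Each boundary is an exclusive upper bound; keys >= the last boundary
// fall into the final, unbounded stripe.
public class StripeLookup {
    private final String[] boundaries; // sorted, exclusive upper bounds

    public StripeLookup(String[] boundaries) {
        this.boundaries = boundaries;
    }

    // Returns the index of the stripe containing the given row key.
    public int stripeFor(String row) {
        for (int i = 0; i < boundaries.length; i++) {
            if (row.compareTo(boundaries[i]) < 0) return i;
        }
        return boundaries.length; // last stripe
    }
}
```

Because the boundaries are fixed once chosen, the three compaction types in the description map cleanly onto this: L0 compaction routes every key through stripeFor into its stripe, per-stripe compaction works within one index, and rebalancing rewrites the boundaries array for 2+ adjacent indices.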
[jira] [Commented] (HBASE-8701) distributedLogReplay need to apply wal edits in the receiving order of those edits
[ https://issues.apache.org/jira/browse/HBASE-8701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688954#comment-13688954 ] Jeffrey Zhong commented on HBASE-8701:

Thanks [~saint@gmail.com] for the comments. {quote} How will compactions deal with the -ve sequenceid {quote} The sequence ids of the hfile are intact, as before. {quote} Sometimes its a boolean and other times its a ts? {quote} decodeMemstoreTS is a boolean. It is used to tell the hfile reader whether to decode the memstoreTS (mvcc) number. There is an existing optimization to skip mvcc number decoding by using the following logic; since we use negative mvcc, the optimization may skip decoding mvcc numbers from an hfile.
{code}
Bytes.toLong(fileInfo.get(HFileWriterV2.MAX_MEMSTORE_TS_KEY)) > 0;
{code}
{quote} Regards 200M. {quote} This part will be updated later by 8741. I left the code there to let one of my new test cases pass, where we test a same-version update coming during recovery. {quote} Is that safe presumption to make in replay? Is this the least sequenceid of the batch? Again, what is the difference between these two sequenceids? Do we have to add it to WALEdit at all? {quote} I think we may not need the origSequenceNumber because mvcc is part of the KV and should already be written into the WAL. Let me try to see if I can cut the origSequenceNumber. {quote} Is this 'if it is present'? {quote} Yes. {quote} We only do this stuff for Puts and Deletes? Don't we have other types out in the WAL? {quote} Only puts and deletes are used for recovery purposes in the WAL.
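The skip-decoding optimization being discussed can be sketched as below. This is an illustration of the logic, not the HFile reader source; the method name is hypothetical, and the point of the comment is that negative mvcc values interact with this check.

```java
// Sketch: mvcc decoding is skipped when the file's recorded max
// memstore TS says no entry carries a positive one. If replayed edits
// store a negated sequence number in the mvcc slot, a file containing
// only such edits has max <= 0, so the reader would skip decoding them.
public class DecodeFlag {
    static boolean shouldDecodeMemstoreTS(long maxMemstoreTS) {
        return maxMemstoreTS > 0;
    }
}
```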
distributedLogReplay need to apply wal edits in the receiving order of those edits

Key: HBASE-8701 URL: https://issues.apache.org/jira/browse/HBASE-8701 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8701-v3.txt, hbase-8701-v4.patch, hbase-8701-v5.patch, hbase-8701-v6.patch, hbase-8701-v7.patch

This issue happens in distributedLogReplay mode when recovering multiple puts of the same key + version (timestamp). After replay, the value of the key is nondeterministic.

h5. The original concern situation, raised by [~eclark]: For all edits the rowkey is the same. There's a log with: [ A (ts = 0), B (ts = 0) ]. Replay the first half of the log. A user puts in C (ts = 0). Memstore has to flush. A new Hfile will be created with [ C, A ] and MaxSequenceId = C's seqid. Replay the rest of the log. Flush. The issue will happen in similar situations, like Put(key, t=T) in WAL1 and Put(key, t=T) in WAL2.

h5. Below is the option (proposed by Ted) I'd like to use: a) During replay, we pass the original wal sequence number of each edit to the receiving RS b) In the receiving RS, we store the negated original sequence number of wal edits in the mvcc field of the KVs of the wal edits c) Add handling of negative MVCC in KVScannerComparator and KVComparator d) In the receiving RS, write the original sequence number into an optional field of the wal file for the chained RS failure situation e) When opening a region, we add a safety bumper (a large number) so that the new sequence numbers of a newly opened region do not collide with old sequence numbers. In the future, when we store sequence numbers along with KVs, we can adjust the above solution a little by avoiding overloading the MVCC field.

h5. The other alternative options are listed below for reference: Option one a) disallow writes during recovery b) during replay, we pass original wal sequence ids c) hold flush till all wals of a recovering region are replayed.
Memstore should hold because we only recover unflushed wal edits. For edits with the same key + version, whichever has the larger sequence id wins. Option two a) During replay, we pass original wal sequence ids b) for each wal edit, we store the edit's original sequence id along with its key c) during scanning, we use the original sequence id if it's present, otherwise the store file's sequence id d) compaction can just keep the put with the max sequence id. Please let me know if you have better ideas.
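Steps (b) and (c) of the preferred option can be sketched as follows. This is a hedged illustration of the encoding idea only, not HBase's KVComparator: a negative mvcc value encodes a replayed edit's original wal sequence number, and comparisons must map it back before ordering. It assumes, per step (e), that fresh post-recovery sequence numbers are bumped above all old ones.

```java
// Illustrative handling of negative mvcc: -seqid encodes an original
// wal sequence number for a replayed edit; positive values are normal
// mvcc numbers. Ordering compares the effective (decoded) sequence.
public class MvccOrder {
    // Decode: negative values carry the negated original wal seqid.
    static long effectiveSeq(long mvcc) {
        return mvcc < 0 ? -mvcc : mvcc;
    }

    // Newest-first comparison between two entries for the same key:
    // negative result means 'a' is newer and sorts first.
    static int compareSeq(long a, long b) {
        return Long.compare(effectiveSeq(b), effectiveSeq(a));
    }
}
```

Under this encoding, two replayed edits for the same key + version keep their original receive order (larger original seqid wins), and a live write with a bumped positive sequence number sorts ahead of any replayed edit.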
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688953#comment-13688953 ] stack commented on HBASE-8667:

Whoops. My fault. Why not just pass this.isa rather than wrap it in a new InetSocketAddress (which will do a new resolve -- could do http://download.java.net/jdk7/archive/b123/docs/api/java/net/InetSocketAddress.html#createUnresolved(java.lang.String, int) I suppose)?
{code}
+rpcClient = new RpcClient(conf, clusterId, new InetSocketAddress(this.isa.getHostName(), 0));
{code}
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688957#comment-13688957 ] Feng Honghua commented on HBASE-8755:

[~zjushch]: We ran the same tests as yours, and below are the results: 1). One YCSB client with 5/50/200 write threads respectively 2). One RS with 300 RPC handlers, 20 regions (5 data-nodes back-end HDFS running CDH 4.1.1) 3). row-size = 150 bytes

||client-threads ||row-count ||new-model throughput ||new-model latency ||old-model throughput ||old-model latency||
|5 |20 |3191 |1.551(ms) |3172 |1.561(ms)|
|50 |200 |23215 |2.131(ms) |7437 |6.693(ms)|
|200 |200 |35793 |5.450(ms) |10816 |18.312(ms)|

A). the difference is negligible with 5 YCSB client threads B). the new model still shows a 3X+ improvement over the old model with 50/200 threads. Can anybody else help run the tests using the same configuration as Chunhui? Another guess is that the HDFS used by Chunhui has much better performance on HLog's write/sync, which makes the new model in HBase have less impact. Just a guess.
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688964#comment-13688964 ] rajeshbabu commented on HBASE-8667:

[~ram_krish] bq. So after this patch the RPC server and the rpc client on the RS connects using the same host? Yes Ram. If we don't pass a bind address in the connect call, presently it will pass null internally.
{code}
// connection time out is 20s
NetUtils.connect(this.socket, remoteId.getAddress(), getSocketTimeout(conf));
{code}
{code}
public static void connect(Socket socket, SocketAddress address, int timeout) throws IOException {
  connect(socket, address, null, timeout);
}
{code}
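The core idea of the fix, binding the client socket to a chosen local interface (port 0 = any free port) before connecting, so the remote side sees the connection arrive from that address, can be sketched with plain java.net. This is an illustration under that assumption, not the patch itself; the actual patch routes through Hadoop's NetUtils.connect(socket, remote, local, timeout) overload quoted above.

```java
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch: connect to 'remote' with the local end explicitly bound to
// 'localHost', instead of letting the OS pick the outgoing interface.
public class BoundClientSocket {
    public static Socket connectFrom(String localHost,
                                     InetSocketAddress remote,
                                     int timeoutMs) throws Exception {
        Socket socket = new Socket();
        socket.bind(new InetSocketAddress(localHost, 0)); // fix local interface
        socket.connect(remote, timeoutMs);
        return socket;
    }
}
```

In the bug scenario, the RS rpc client would bind to the same address its rpc server listens on (lo), so the master registers the RS under a reachable address rather than whichever interface the default route picked (eth0).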
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688962#comment-13688962 ] chunhui shen commented on HBASE-8755: - Regarding the above tests, let's try to find out why the old throughput is so low. Does your client run on the regionserver, or on a separate server? A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch In the current write model, each write handler thread (executing put()) individually goes through a full 'append (hlog local buffer) -> HLog writer append (write to hdfs) -> HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy contention on updateLock and flushLock. The only existing optimization, checking whether the current syncTillHere already covers a thread's txid (in the expectation that another thread has helped write/sync that txid to hdfs, so the write/sync can be omitted), actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X throughput improvement (from 17000 to 7+). I applied this new write thread model in HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput at the time was 31002). I can provide the detailed performance test results if anyone is interested.
The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer (and notify the AsyncWriter thread that there are new edits in the buffer);
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid;
3. A single AsyncWriter thread retrieves all the buffered edits from HLog's local pending buffer and writes them to hdfs (hlog.writer.append), then notifies the AsyncFlusher thread that there are new writes to hdfs that need a sync;
4. A single AsyncFlusher thread issues a sync to hdfs to persist the writes made by the AsyncWriter, then notifies the AsyncNotifier thread that the sync watermark has increased;
5. A single AsyncNotifier thread notifies all pending put handler threads waiting in the HLog.syncer() function;
6. There is no LogSyncer thread any more (the AsyncWriter/AsyncFlusher threads now do the same job).
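The six steps above amount to a staged producer/consumer pipeline. Below is a minimal self-contained sketch of the idea, with hypothetical class and field names and the AsyncWriter/AsyncFlusher stages collapsed into one background thread for brevity (the real implementation is the HBASE-8755 patch): handlers append under a lock, a single drainer thread batches and "syncs" the buffer, and waiters block until the sync watermark passes their txid.

```java
import java.util.ArrayList;
import java.util.List;

public class AsyncHLogSketch {
    private final List<String> pendingEdits = new ArrayList<>();
    private long nextTxid = 0;          // txid handed to each append
    private long syncedTillHere = -1;   // highest txid persisted so far

    // Step 1: put handler threads append to the local buffer and
    // wake the writer thread.
    public synchronized long append(String edit) {
        pendingEdits.add(edit);
        long txid = nextTxid++;
        notifyAll();
        return txid;
    }

    // Step 2: put handler threads block until the sync watermark
    // covers their txid.
    public synchronized void syncer(long txid) throws InterruptedException {
        while (syncedTillHere < txid) {
            wait();
        }
    }

    // Steps 3-5, collapsed: drain the buffer, "write+sync" it,
    // advance the watermark, notify all pending handlers.
    public void startWriter() {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    long highest;
                    List<String> batch;
                    synchronized (this) {
                        while (pendingEdits.isEmpty()) {
                            wait();
                        }
                        batch = new ArrayList<>(pendingEdits);
                        pendingEdits.clear();
                        highest = nextTxid - 1;
                    }
                    // hlog.writer.append(batch) + hdfs sync would happen here,
                    // outside the lock, so handlers can keep appending.
                    synchronized (this) {
                        syncedTillHere = highest;
                        notifyAll();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.setDaemon(true);
        writer.start();
    }

    public static void main(String[] args) throws Exception {
        AsyncHLogSketch log = new AsyncHLogSketch();
        log.startWriter();
        long txid = log.append("row1/cf1:edit");
        log.syncer(txid);  // returns once the batch containing txid is synced
        System.out.println("synced txid " + txid);
    }
}
```

Note how the append/sync work happens outside the buffer lock, which is the point of the model: handler threads never contend on the hdfs write path itself.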
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688967#comment-13688967 ] rajeshbabu commented on HBASE-8667: --- [~stack] bq. could do http://download.java.net/jdk7/archive/b123/docs/api/java/net/InetSocketAddress.html#createUnresolved(java.lang.String, int) I suppose)? This is good. I will change and update the patch. Thanks. Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch While testing the HBASE-8640 fix, I found that a master and a regionserver running on different interfaces do not communicate properly. I have two interfaces, 1) lo and 2) eth0, on my machine, and the default hostname resolves to the lo interface. I configured the master ipc address to the ip of the eth0 interface and started the master and regionserver on the same machine. 1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo. 2) Since the rpc client does not bind to any ip address, when the RS reports in at startup it gets registered with the eth0 ip address (but it should actually register as localhost). Here are the RS logs: {code} 2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008 2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851 2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100 {code} Here are master logs: {code} 2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544 {code} Since master has wrong rpc server address of RS, META is not getting assigned. 
{code} 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false - org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826) at
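For reference, the InetSocketAddress.createUnresolved approach discussed in the comment above differs from the resolving constructor in that it keeps the literal host string and performs no DNS lookup, which is why it preserves the address a peer reported instead of re-resolving it locally. A small stand-alone demo:

```java
import java.net.InetSocketAddress;

public class UnresolvedAddressDemo {
    public static void main(String[] args) {
        // Resolving constructor: performs the name lookup immediately.
        InetSocketAddress resolved = new InetSocketAddress("localhost", 60020);

        // createUnresolved: keeps the literal host string, no DNS lookup.
        InetSocketAddress unresolved =
            InetSocketAddress.createUnresolved("192.168.0.100", 60020);

        System.out.println("resolved? " + !resolved.isUnresolved());
        System.out.println("unresolved? " + unresolved.isUnresolved());
        System.out.println("host kept: " + unresolved.getHostString());
    }
}
```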
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688973#comment-13688973 ] Feng Honghua commented on HBASE-8755: - Our comparison tests differ only in the RS bits; everything else (client/HDFS/cluster/row-size...) remains the same. The client runs on a different machine from the RS; we don't run clients on the RS because almost all of our applications using HBase run on machines separate from the HBase cluster. Actually we have never seen a throughput as high as 18018/24691 for a single RS in our cluster. It's really weird :). A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch
[jira] [Commented] (HBASE-8759) Family Delete Markers not getting purged after major compaction
[ https://issues.apache.org/jira/browse/HBASE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688988#comment-13688988 ] Lars Hofhansl commented on HBASE-8759: -- NP. I should also be a bit more explicit about how family delete markers are actually deleted. The logic is this: each compaction registers the timestamp of the oldest put in the created hfile. The next (major) compaction then removes all family markers that are older than that oldest put. So as long as the client does not keep issuing puts, eventually all family markers drop out of the compacted files. Family Delete Markers not getting purged after major compaction --- Key: HBASE-8759 URL: https://issues.apache.org/jira/browse/HBASE-8759 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.94.7 Reporter: Mujtaba Chohan Priority: Minor On a table with VERSIONS = '1' and KEEP_DELETED_CELLS = 'true', Family Delete Markers do not get purged after a put/delete/major-compaction cycle (they keep accumulating after every put/delete/major-compaction cycle). Following is the raw scan output after 10 iterations of put/delete/major compaction:
ROW  COLUMN+CELL
A    column=CF:, timestamp=1371512706683, type=DeleteFamily
A    column=CF:, timestamp=1371512706394, type=DeleteFamily
A    column=CF:, timestamp=1371512706054, type=DeleteFamily
A    column=CF:, timestamp=1371512705763, type=DeleteFamily
A    column=CF:, timestamp=1371512705457, type=DeleteFamily
A    column=CF:, timestamp=1371512705149, type=DeleteFamily
A    column=CF:, timestamp=1371512704836, type=DeleteFamily
A    column=CF:, timestamp=1371512704518, type=DeleteFamily
A    column=CF:, timestamp=1371512704162, type=DeleteFamily
A    column=CF:, timestamp=1371512703779, type=DeleteFamily
A    column=CF:COL, timestamp=1371512706682, value=X
[~lhofhansl] Code to repro this issue: http://phoenix-bin.github.io/client/code/delete.java -- This message is automatically generated by JIRA.
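The purge rule Lars describes can be illustrated with a small sketch (hypothetical method names, not the actual compaction code): a family delete marker is dropped by the next major compaction if it is older than the oldest put timestamp the previous compaction recorded; otherwise it survives.

```java
import java.util.ArrayList;
import java.util.List;

public class FamilyMarkerPurgeSketch {
    // Returns the family delete marker timestamps that survive a major
    // compaction, given the oldest put timestamp recorded when the store
    // file was written (markers older than that put are purged).
    static List<Long> survivingMarkers(List<Long> markerTimestamps, long oldestPutTs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : markerTimestamps) {
            if (ts >= oldestPutTs) {  // older than the oldest put -> purged
                kept.add(ts);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> markers = List.of(100L, 200L, 300L);
        // The oldest put in the previously written hfile has timestamp 250,
        // so the two older markers drop out on the next major compaction.
        System.out.println(survivingMarkers(markers, 250L));
    }
}
```

This also shows why the reported bug keeps markers around: if every iteration writes a new put, the recorded "oldest put" never moves past the accumulated markers.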
[jira] [Created] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
jay wong created HBASE-8773: --- Summary: Can be setup the COMPRESSION base on HTable in meta or user set in Configuration Key: HBASE-8773 URL: https://issues.apache.org/jira/browse/HBASE-8773 Project: HBase Issue Type: New Feature Components: HFile Affects Versions: 0.94.8 Reporter: jay wong Fix For: 0.94.9 When I create HFiles with ImportTsv, I found that whether or not I set the compression in the Configuration, it is always ignored. That is because the method 'configureIncrementalLoad' in HFileOutputFormat sets the compression from the HTable in meta. So add a configuration switch to choose between the compression from the HTable and the one the user set in the Configuration.
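The switch the reporter proposes might look like the following sketch (the property names below are invented for illustration and a plain Map stands in for the Hadoop Configuration; the actual patch may differ): prefer the user's configured codec over the table's compression only when an explicit override flag is set.

```java
import java.util.HashMap;
import java.util.Map;

public class CompressionChoiceSketch {
    // Decide which compression codec an incremental-load job should use.
    // conf is a stand-in for the job Configuration; the two keys are
    // hypothetical names for this sketch.
    static String chooseCompression(Map<String, String> conf, String tableCompression) {
        if ("true".equals(conf.get("hbase.importtsv.compression.override"))) {
            // User opted in: honor the Configuration, falling back to the
            // table's setting if no codec was given.
            return conf.getOrDefault("hfile.compression", tableCompression);
        }
        // Default behavior: what configureIncrementalLoad reads from the
        // HTable descriptor in meta.
        return tableCompression;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.importtsv.compression.override", "true");
        conf.put("hfile.compression", "GZ");
        System.out.println(chooseCompression(conf, "NONE"));
        System.out.println(chooseCompression(new HashMap<>(), "LZO"));
    }
}
```

The explicit flag keeps today's behavior as the default, so existing jobs that rely on the table's compression are unaffected.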
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Labels: (was: patch)
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Labels: patch (was: ) Hadoop Flags: Reviewed Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Attachment: HBASE-8773.patch
[jira] [Updated] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated HBASE-8773: Hadoop Flags: (was: Reviewed)
[jira] [Updated] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu updated HBASE-8667: -- Attachment: HBASE-8667_trunk_v4.patch Patch addressing Stack's comments.
[jira] [Assigned] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu reassigned HBASE-8667: - Assignee: rajeshbabu
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689011#comment-13689011 ] Hadoop QA commented on HBASE-8667: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12588784/HBASE-8667_trunk_v4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6083//console This message is automatically generated.
[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
[ https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689013#comment-13689013 ] Wei Li commented on HBASE-7404: --- The default value of bucketCachePercentage is 0 currently, I suggest set it to hfile.block.cache.size if combinedWithLru is true. Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE -- Key: HBASE-7404 URL: https://issues.apache.org/jira/browse/HBASE-7404 Project: HBase Issue Type: New Feature Affects Versions: 0.94.3 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.95.0 Attachments: 7404-trunk-v10.patch, 7404-trunk-v11.patch, 7404-trunk-v12.patch, 7404-trunk-v13.patch, 7404-trunk-v13.txt, 7404-trunk-v14.patch, BucketCache.pdf, hbase-7404-94v2.patch, HBASE-7404-backport-0.94.patch, hbase-7404-trunkv2.patch, hbase-7404-trunkv9.patch, Introduction of Bucket Cache.pdf First, thanks @neil from Fusion-IO share the source code. Usage: 1.Use bucket cache as main memory cache, configured as the following: –hbase.bucketcache.ioengine heap –hbase.bucketcache.size 0.4 (size for bucket cache, 0.4 is a percentage of max heap size) 2.Use bucket cache as a secondary cache, configured as the following: –hbase.bucketcache.ioengine file:/disk1/hbase/cache.data(The file path where to store the block data) –hbase.bucketcache.size 1024 (size for bucket cache, unit is MB, so 1024 means 1GB) –hbase.bucketcache.combinedcache.enabled false (default value being true) See more configurations from org.apache.hadoop.hbase.io.hfile.CacheConfig and org.apache.hadoop.hbase.io.hfile.bucket.BucketCache What's Bucket Cache? 
It can greatly decrease CMS and heap fragmentation caused by GC, and it supports a large cache space for high read performance by using a high-speed disk like Fusion-io. 1. An implementation of block cache like LruBlockCache 2. Self-manages blocks' storage positions through the Bucket Allocator 3. The cached blocks can be stored in memory or on the file system 4. Bucket Cache can be used as the main block cache (see CombinedBlockCache), combined with LruBlockCache, to decrease CMS and fragmentation caused by GC 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io to store blocks) to enlarge the cache space How about SlabCache? We studied and tested SlabCache first, but the results were bad, because: 1. SlabCache uses SingleSizeCache, so its memory utilization is low because of the variety of block sizes, especially when using DataBlockEncoding 2. SlabCache is used in DoubleBlockCache; a block is cached both in SlabCache and LruBlockCache, and on a SlabCache hit the block is put into LruBlockCache again, so CMS and heap fragmentation don't get any better 3. Direct-memory performance is not as good as heap, and it may cause OOM, so we recommend using the heap engine See more in the attachment and in the patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-8759) Family Delete Markers not getting purged after major compaction
[ https://issues.apache.org/jira/browse/HBASE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688988#comment-13688988 ] Lars Hofhansl edited comment on HBASE-8759 at 6/20/13 9:05 AM: --- NP. I should also be a bit more explicit about how family delete markers are actually deleted. The logic is this: Each compaction registers the timestamp of the oldest put in the created hfile. The next (major) compaction then removes all family markers that are older than the oldest put. So as long as the client does not keep backdating puts, eventually all family markers drop out of the compacted files. was (Author: lhofhansl): NP. I should also be a bit more explicit about how family delete markers are actually deleted. The logic is this: Each compaction registers the timestamp of the oldest put in the created hfile. The next (major) compaction then removes all family markers that are older than the oldest put. So as long as the client does not keep puts, eventually all family markers drop out of the compacted files. Family Delete Markers not getting purged after major compaction --- Key: HBASE-8759 URL: https://issues.apache.org/jira/browse/HBASE-8759 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.94.7 Reporter: Mujtaba Chohan Priority: Minor On a table with VERSIONS = '1', KEEP_DELETED_CELLS = 'true', Family Delete Markers do not get purged after a put + delete + major compaction cycle (they keep incrementing after every iteration). Following is the raw scan output after 10 iterations of put + delete + major compaction.
{code}
ROW  COLUMN+CELL
 A   column=CF:, timestamp=1371512706683, type=DeleteFamily
 A   column=CF:, timestamp=1371512706394, type=DeleteFamily
 A   column=CF:, timestamp=1371512706054, type=DeleteFamily
 A   column=CF:, timestamp=1371512705763, type=DeleteFamily
 A   column=CF:, timestamp=1371512705457, type=DeleteFamily
 A   column=CF:, timestamp=1371512705149, type=DeleteFamily
 A   column=CF:, timestamp=1371512704836, type=DeleteFamily
 A   column=CF:, timestamp=1371512704518, type=DeleteFamily
 A   column=CF:, timestamp=1371512704162, type=DeleteFamily
 A   column=CF:, timestamp=1371512703779, type=DeleteFamily
 A   column=CF:COL, timestamp=1371512706682, value=X
{code}
[~lhofhansl] Code to repro this issue: http://phoenix-bin.github.io/client/code/delete.java
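The purge logic Lars describes can be sketched as a tiny standalone model (hypothetical class and method names, not actual HBase code): each compaction records the oldest put timestamp of the hfile it writes, and the next major compaction drops every family delete marker older than that oldest put.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the purge rule described above; not actual HBase code.
// A family delete marker survives a major compaction only if it is at least as
// new as the oldest put recorded for the compacted files.
class FamilyMarkerPurge {
    static List<Long> purge(List<Long> markerTimestamps, long oldestPutTs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : markerTimestamps) {
            if (ts >= oldestPutTs) {
                kept.add(ts); // marker could still mask a put, so keep it
            }
            // markers older than the oldest put can no longer mask anything: drop
        }
        return kept;
    }
}
```

So as long as puts are not backdated, the oldest put timestamp keeps advancing and old markers eventually fall out, which is consistent with the issue being resolved as Not A Problem.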
[jira] [Resolved] (HBASE-8759) Family Delete Markers not getting purged after major compaction
[ https://issues.apache.org/jira/browse/HBASE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-8759. -- Resolution: Not A Problem Family Delete Markers not getting purged after major compaction --- Key: HBASE-8759 URL: https://issues.apache.org/jira/browse/HBASE-8759 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.94.7 Reporter: Mujtaba Chohan Priority: Minor On a table with VERSIONS = '1', KEEP_DELETED_CELLS = 'true', Family Delete Markers do not get purged after a put + delete + major compaction cycle (they keep incrementing after every iteration). Following is the raw scan output after 10 iterations of put + delete + major compaction.
{code}
ROW  COLUMN+CELL
 A   column=CF:, timestamp=1371512706683, type=DeleteFamily
 A   column=CF:, timestamp=1371512706394, type=DeleteFamily
 A   column=CF:, timestamp=1371512706054, type=DeleteFamily
 A   column=CF:, timestamp=1371512705763, type=DeleteFamily
 A   column=CF:, timestamp=1371512705457, type=DeleteFamily
 A   column=CF:, timestamp=1371512705149, type=DeleteFamily
 A   column=CF:, timestamp=1371512704836, type=DeleteFamily
 A   column=CF:, timestamp=1371512704518, type=DeleteFamily
 A   column=CF:, timestamp=1371512704162, type=DeleteFamily
 A   column=CF:, timestamp=1371512703779, type=DeleteFamily
 A   column=CF:COL, timestamp=1371512706682, value=X
{code}
[~lhofhansl] Code to repro this issue: http://phoenix-bin.github.io/client/code/delete.java
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689043#comment-13689043 ] Lars Hofhansl commented on HBASE-8721: -- If your clients use the timestamp not as a timestamp, can they add whatever their values are to the rowkey? bq. do you mean some users rely on the KEEP_DELETED_CELLS and want the feature Delete can mask puts that happen after the delete? Exactly. When KEEP_DELETED_CELLS is enabled you can do true time range queries in HBase. For example, you get the exact state of your data as of last week, or an hour ago, etc., even when data was deleted via delete markers. I think adding a global or maybe column family config option to change this behavior is fine, as long as the code does not get too convoluted. In that case we need to make sure that all other HBase features such as replication, WAL replay, as-of-time queries, bulk loading HFiles, etc. still work as expected. We also need to check that the HFile metadata is still correct, as the timerange of the included KVs is used to exclude HFiles from scans in some situations (if you put a Delete marker at MAX_LONG, this HFile would not be excluded for queries on new data, unless we add some other special logic). Even in that case I'd still be -0 on this (but I would no longer veto it with a -1) - this looks like a very app specific use case to me. You would need to find one or two committers who are ready to +1 this feature and patch to get it committed. Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims at the bug mentioned in http://hbase.apache.org/book.html section 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered.
Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even though it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do a delete and a put immediately after each other, and there is some chance they happen within the same millisecond.
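The masking behavior in the description can be condensed into a small model (a hedged sketch with invented names, not HBase's Store/scanner code): a tombstone at timestamp T masks any put with timestamp <= T, and a put at that timestamp only becomes effective again after a major compaction has removed both the masked put and the tombstone.

```java
// Hypothetical single-cell model of delete-tombstone masking; not HBase code.
class TombstoneModel {
    private Long deleteTs = null; // tombstone timestamp, null when none present
    private Long putTs = null;    // latest put timestamp, null when none survives

    void put(long ts) { putTs = ts; }
    void delete(long ts) { deleteTs = ts; }

    // A get sees the put only when no tombstone covers it.
    boolean getSeesValue() {
        return putTs != null && (deleteTs == null || putTs > deleteTs);
    }

    // Major compaction drops the masked puts and then the tombstone itself.
    void majorCompact() {
        if (deleteTs != null && putTs != null && putTs <= deleteTs) {
            putTs = null;
        }
        deleteTs = null;
    }
}
```

A delete at T=10 followed by a put at T=10 leaves the get empty; only after majorCompact() does a fresh put at T=10 become visible again, matching the "it will start working again after the major compaction has run" remark.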
[jira] [Commented] (HBASE-8060) Num compacting KVs diverges from num compacted KVs over time
[ https://issues.apache.org/jira/browse/HBASE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689047#comment-13689047 ] Lars Hofhansl commented on HBASE-8060: -- What is the meaning of totalCompactingKVs? Overall total? Or just the total for the last compaction? Does the patch change the meaning by resetting to current compacted count? Num compacting KVs diverges from num compacted KVs over time Key: HBASE-8060 URL: https://issues.apache.org/jira/browse/HBASE-8060 Project: HBase Issue Type: Bug Components: Compaction, UI Affects Versions: 0.94.6, 0.95.0, 0.95.2 Reporter: Andrew Purtell Assignee: Sergey Shelukhin Attachments: HBASE-8060-v0.patch, screenshot.png I have been running what amounts to an ingestion test for a day or so. This is an all-in-one cluster launched with './bin/hbase master start' from sources. In the RS stats on the master UI, the num compacting KVs has diverged from num compacted KVs even though compaction has been completed from perspective of selection, no compaction tasks are running on the RS. I think this could be confusing -- is compaction happening or not? Or maybe I'm misunderstanding what this is supposed to show? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689055#comment-13689055 ] Nicolas Liochon commented on HBASE-6295: [~jmspaggi] I'm waiting for your feedback then. BTW, if you have time ( :-) ), publishing a comparison between the 0.95 without this patch and the 0.94 might be useful. I'm saying this because if we have a performance degradation vs. the 0.94, this patch will hide it... Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today the batch algo is:
{noformat}
for Operation o: List<Op> {
  add o to todolist
  if todolist >= maxsize or o last in list {
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
  }
}
{noformat}
We could: - create immediately the final object instead of an intermediate array - split per location immediately - instead of sending when the list as a whole is full, send when there is enough data for a single location It would be:
{noformat}
for Operation o: List<Op> {
  get location
  add o to location.todolist
  if (location.todolist >= maxLocationSize) {
    send location.todolist to region server
    clear location.todolist
    // don't wait, continue the loop
  }
}
send remaining
wait
{noformat}
It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
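The proposed per-location batching above can be sketched like this (a hypothetical class, a stand-in for the real client code; the `sent` map simulates the RPCs): operations are bucketed by region location as they arrive, and a location's list is sent as soon as that one location has enough data, rather than waiting for the global list to fill.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of per-location batching; not the actual HBase client code.
class PerLocationBatcher {
    final int maxLocationSize;
    final Map<String, List<String>> pending = new HashMap<>();
    final Map<String, List<List<String>>> sent = new HashMap<>(); // stands in for RPC calls

    PerLocationBatcher(int maxLocationSize) { this.maxLocationSize = maxLocationSize; }

    void add(String location, String op) {
        List<String> buf = pending.computeIfAbsent(location, k -> new ArrayList<>());
        buf.add(op);
        if (buf.size() >= maxLocationSize) {
            flush(location); // send without waiting; the loop continues
        }
    }

    void flush(String location) {
        List<String> buf = pending.remove(location);
        if (buf != null && !buf.isEmpty()) {
            sent.computeIfAbsent(location, k -> new ArrayList<>()).add(buf);
        }
    }

    void flushAll() { // the "send remaining" step at the end of the loop
        for (String location : new ArrayList<>(pending.keySet())) {
            flush(location);
        }
    }
}
```

The design point is that a full buffer for one region server no longer blocks on operations destined for other servers, which is what makes the "send in background" part possible.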
[jira] [Updated] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu updated HBASE-8667: -- Attachment: HBASE-8667_trunk_v5.patch [~stack] InetSocketAddress.createUnresolved is creating unresolved socket address which can be used only in some circumstances like connecting through proxy. Any way avoided extra resolving by passing InetAddress instead of hostname. {code} +rpcClient = new RpcClient(conf, clusterId, new InetSocketAddress( +this.isa.getAddress(), 0)); {code} Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch While testing HBASE-8640 fix found that master and regionserver running on different interfaces are not communicating properly. I have two interfaces 1) lo 2) eth0 in my machine and default hostname interface is lo. I have configured master ipc address to ip of eth0 interface. Started master and regionserver on the same machine. 1) master rpc server bound to eth0 and RS rpc server bound to lo 2) Since rpc client is not binding to any ip address, when RS is reporting RS startup its getting registered with eth0 ip address(but actually it should register localhost) Here are RS logs: {code} 2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying. 
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008 2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851 2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100 {code} Here are master logs: {code} 2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544 {code} Since master has wrong rpc server address of RS, META is not getting assigned. 
{code} 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false - org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
[jira] [Commented] (HBASE-8773) Can be setup the COMPRESSION base on HTable in meta or user set in Configuration
[ https://issues.apache.org/jira/browse/HBASE-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689068#comment-13689068 ] Anoop Sam John commented on HBASE-8773: --- So you don't want to use the compression based on the config param. There is some compression scheme already for the HTable and you just want to continue with that? Can be setup the COMPRESSION base on HTable in meta or user set in Configuration Key: HBASE-8773 URL: https://issues.apache.org/jira/browse/HBASE-8773 Project: HBase Issue Type: New Feature Components: HFile Affects Versions: 0.94.8 Reporter: jay wong Fix For: 0.94.9 Attachments: HBASE-8773.patch When I wanted to create HFiles with ImportTsv, I found that whether or not I set the compression in the Configuration, it is always ignored. That is because the method 'configureIncrementalLoad' in HFileOutputFormat will set the compression from the HTable in meta. So add a configuration to switch between using the compression from the HTable or not.
[jira] [Created] (HBASE-8774) Add BatchSize and Filter to Thrift2
Hamed Madani created HBASE-8774: --- Summary: Add BatchSize and Filter to Thrift2 Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Improvement Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Priority: Minor Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hamed Madani updated HBASE-8774: Attachment: HBASE_8774.patch Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Improvement Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Priority: Minor Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689094#comment-13689094 ] Feng Honghua commented on HBASE-8755: - If possible, would anybody else help do the same comparison test as Chunhui/me? Thanks in advance. [~lhofhansl] [~yuzhih...@gmail.com] [~sershe] [~stack] A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch In the current write model, each write handler thread (executing put()) will individually go through a full 'append (hlog local buffer) -> HLog writer append (write to hdfs) -> HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy contention on updateLock and flushLock. The only existing optimization (checking whether the current syncTillHere >= txid, in the expectation that some other thread has already helped write/sync this txid to hdfs, and omitting the write/sync in that case) actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X throughput improvement (from 17000 to 7+). I applied this new write thread model in HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row-size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput then was 31002). I can provide the detailed performance test results if anyone is interested.
The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer (each notifies the AsyncWriter thread that there are new edits in the local buffer);
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid;
3. A single AsyncWriter thread is responsible for retrieving all the buffered edits in HLog's local pending buffer and writing them to hdfs (hlog.writer.append); it notifies the AsyncFlusher thread that there are new writes to hdfs that need a sync;
4. A single AsyncFlusher thread is responsible for issuing a sync to hdfs to persist the writes by the AsyncWriter; it notifies the AsyncNotifier thread that the sync watermark has increased;
5. A single AsyncNotifier thread is responsible for notifying all pending put handler threads that are waiting in the HLog.syncer() function;
6. No LogSyncer thread any more (the AsyncWriter/AsyncFlusher threads always do the same job it did).
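The steps above can be collapsed into a single-threaded toy model (invented names; the real patch runs AsyncWriter/AsyncFlusher/AsyncNotifier as separate threads with notifications between them): handlers append edits to a pending buffer, the writer drains the buffer, the flusher advances the synced watermark, and a handler's sync is complete once the watermark reaches its txid.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the proposed HLog pipeline; not the patch itself.
class HlogPipelineModel {
    private long nextTxid = 0;    // assigned on append
    private long writtenTxid = 0; // highest txid handed to the filesystem
    private long syncedTxid = 0;  // highest txid persisted (the sync watermark)
    private final List<Long> buffer = new ArrayList<>(); // HLog's local pending buffer

    long append() {               // step 1: a handler appends an edit
        long txid = ++nextTxid;
        buffer.add(txid);
        return txid;
    }

    void asyncWriterDrain() {     // step 3: write all buffered edits at once
        if (!buffer.isEmpty()) {
            writtenTxid = buffer.get(buffer.size() - 1);
            buffer.clear();
        }
    }

    void asyncFlusherSync() {     // step 4: sync, advancing the watermark
        syncedTxid = writtenTxid;
    }

    boolean syncComplete(long txid) { // steps 2 and 5: a handler's wait condition
        return syncedTxid >= txid;
    }
}
```

Because one drain plus one sync covers every txid buffered so far, many handlers' edits share a single append/sync round trip, which is where the throughput gain over the per-write cycle comes from.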
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689098#comment-13689098 ] Hangjun Ye commented on HBASE-8721: --- Nice to know adding a config is acceptable at least! You pointed out many features that we need to be careful not to break; we should do that as you suggested. Back to the KEEP_DELETED_CELLS feature, my perception is that even if we disable "Delete can mask puts that happen after the delete" (whether by a config or by other means), KEEP_DELETED_CELLS still works as you expect. Sounds like they are basically independent features? Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims at the bug mentioned in http://hbase.apache.org/book.html section 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even though it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do a delete and a put immediately after each other, and there is some chance they happen within the same millisecond.
[jira] [Updated] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hamed Madani updated HBASE-8774: Status: Patch Available (was: Open) Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Improvement Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Priority: Minor Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hamed Madani updated HBASE-8774: Issue Type: New Feature (was: Improvement) Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: New Feature Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Priority: Minor Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hamed Madani updated HBASE-8774: Priority: Major (was: Minor) Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: New Feature Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689107#comment-13689107 ] Jieshan Bean commented on HBASE-8774: - See HBASE-6073. It's also about adding filter support to Thrift2. One minor problem in the patch:
{code}
+boolean this_present_filterString = true && this.isSetFilterString();
+boolean that_present_filterString = true && that.isSetFilterString();
{code}
The 'true &&' is redundant. In addition, I suggest adding a unit test. Anyway, it's a nice patch. Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: New Feature Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2
[jira] [Commented] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689108#comment-13689108 ] Anoop Sam John commented on HBASE-8627: --- [~jmhsieh], [~jxiang], [~sershe] comments? HBCK can not fix meta not assigned issue Key: HBASE-8627 URL: https://issues.apache.org/jira/browse/HBASE-8627 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-8627_Trunk.patch, HBASE-8627_Trunk-V2.patch When the meta table region is not assigned to any RS, an HBCK run will get an exception. I can see code added in checkMetaRegion() to solve this issue but it won't work. It still refers to the ROOT region!
[jira] [Commented] (HBASE-8705) RS holding META when restarted in a single node setup may hang infinitely without META assignment
[ https://issues.apache.org/jira/browse/HBASE-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689139#comment-13689139 ] ramkrishna.s.vasudevan commented on HBASE-8705: --- I am not able to reproduce this scenario every time. But after seeing the logs, a restart of the RS will not solve the problem, because the META location is already unset in ZK. I will commit this patch unless there are objections. RS holding META when restarted in a single node setup may hang infinitely without META assignment - Key: HBASE-8705 URL: https://issues.apache.org/jira/browse/HBASE-8705 Project: HBase Issue Type: Bug Affects Versions: 0.95.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Priority: Minor Fix For: 0.98.0 Attachments: HBASE-8705_1.patch, HBASE-8705_2.patch, HBASE-8705.patch This bug may be minor, as it is likely to happen only in a single-node setup. I restarted the RS holding META. The master tried assigning META using MetaSSH, but tried this before the new RS came up. So, as no region plan is found,
{code}
if (plan == null) {
  LOG.warn("Unable to determine a plan to assign " + region);
  if (tomActivated) {
    this.timeoutMonitor.setAllRegionServersOffline(true);
  } else {
    regionStates.updateRegionState(region, RegionState.State.FAILED_OPEN);
  }
  return;
}
{code}
we just return without assignment. And this being META, the small cluster just hangs.
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689160#comment-13689160 ] Feng Honghua commented on HBASE-8721: - Let me list some merits of the behavior 'deletes can't mask puts that happen after the delete':
1) It can avoid the inconsistency I mentioned above; with our patch, the user can always read the put written at step 4. It's more natural and intuitive:
1. put a kv (timestamp = T0), and flush;
2. delete that kv using a DeleteColumn type kv with timestamp T0 (or any timestamp >= T0), and flush;
3. a major compact occurs [or not];
4. put that kv again (timestamp = T0);
5. read that kv;
==> a) if a major compact occurs at step 3, then step 5 will get the put written at step 4; b) if no major compact occurs at step 3, then step 5 gets nothing.
2) It can provide a strong guarantee for this operation: I don't know which (or how many) versions are in a cell; I just want to remove all the existing ones, put a new version into the cell, and ensure only this new put is in the cell, regardless of how its timestamp compares with the old existing ones (I think this operation/guarantee is useful in many scenarios). The current delete behavior can't provide such a guarantee.
3) 'delete latest version' (deleteColumn() without a ts) could be tuned to drop the read it currently performs (to find the latest version's ts) during 'deleteColumn'. The current delete behavior can't be tuned to drop that read.
4) 'a new put can't be masked (made to disappear) by an old/existing delete' is itself a merit for many use cases/applications, since it's more natural and intuitive. I have explained the old version/delete semantics to different customers many times, and without exception their first response is: that's weird... why so?
Per my understanding, contrary to [~lhofhansl] and [~sershe], 'timestamp' is just a long used to determine version ordering by the rule 'the bigger/later wins'. It just happens that a time-semantic timestamp is a long, that a new put with the 'current' timestamp has a bigger timestamp, and that in most cases new put versions therefore knock out older ones. For many use cases the time semantic of 'timestamp' is enough for real-world requirements, but by design that's not always the case; otherwise the timestamp wouldn't be exposed for the user to set explicitly. In a word, as long as the user knows 'timestamp' is only a long-typed dimension that determines version ordering by the rule 'the bigger wins', he can reason out the result of any sequence of operations. In essence, 'timestamp as a dimension for version ordering' isn't related to delete semantics. -- I know my understanding is arguable for many guys, since the old delete semantics and behavior have existed for so long and everybody has already taken them for granted (I mean no offence here). At last, I also list the downsides of the optional solutions proposed to me: A) 'KEEP_DELETED_CELLS' is definitely a nice feature, but many users don't need it (to time-travel or trace back action history), and it prevents major compaction from shrinking the data set by collecting deleted cells. B) Disallowing users to explicitly set timestamps limits HBase's schema flexibility, prohibits many innovative designs such as Facebook's message search index, and still can't guarantee unique timestamps, hence can still lead to tricky/confusing behavior.
Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch This fix aims for the bug mentioned in http://hbase.apache.org/book.html 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even if it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do delete and put immediately after each other, and there is some chance they happen within the same millisecond.
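The masking sequence described above can be simulated with a small, self-contained model (plain Java, no HBase dependency; the class and method names are illustrative, not the HBase API): a DeleteColumn tombstone at timestamp T masks every put with timestamp <= T until a major compaction purges both the masked cells and the tombstone itself.

```java
import java.util.*;

/** Toy model of the delete-tombstone semantics discussed above (names are
 *  hypothetical, not HBase API): a tombstone at timestamp T masks every
 *  put with timestamp <= T until a major compaction runs. */
public class TombstoneModel {
    private static final class Cell {
        final long ts; final String value;
        Cell(long ts, String value) { this.ts = ts; this.value = value; }
    }

    private final List<Cell> cells = new ArrayList<>();
    private final List<Long> tombstones = new ArrayList<>();

    public void put(long ts, String value) { cells.add(new Cell(ts, value)); }

    public void deleteColumn(long ts) { tombstones.add(ts); }

    /** Major compaction drops the masked cells AND the tombstones themselves,
     *  which is exactly why the read result flips after it runs. */
    public void majorCompact() {
        cells.removeIf(c -> isMasked(c.ts));
        tombstones.clear();
    }

    private boolean isMasked(long ts) {
        for (long t : tombstones) if (ts <= t) return true;
        return false;
    }

    /** Latest visible version, or null if everything is masked. */
    public String get() {
        Cell best = null;
        for (Cell c : cells)
            if (!isMasked(c.ts) && (best == null || c.ts > best.ts)) best = c;
        return best == null ? null : best.value;
    }
}
```

With this model, put(T0) / deleteColumn(T0) / put(T0) / get() returns null, while running majorCompact() between the delete and the second put makes get() return the new value — the read inconsistency the comment describes.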
[jira] [Commented] (HBASE-8617) Introducing a new config to disable writes during recovering
[ https://issues.apache.org/jira/browse/HBASE-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689174#comment-13689174 ] Hudson commented on HBASE-8617: --- Integrated in hbase-0.95-on-hadoop2 #139 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/139/]) HBASE-8617: Introducing a new config to disable writes during recovering (Revision 1494814) Result = FAILURE jeffreyz : Files : * /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java Introducing a new config to disable writes during recovering - Key: HBASE-8617 URL: https://issues.apache.org/jira/browse/HBASE-8617 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.98.0, 0.95.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: HBASE-8617.patch, HBASE-8617-v2.patch, hbase-8617-v3.patch In distributedLogReplay (HBASE-7006), we allow writes even when a region is in recovering. This may cause undesired behavior when applications (or deployments) are already near their write capacity, because distributedLogReplay generates more write traffic to the remaining region servers. The new config hbase.regionserver.disallow.writes.when.recovering tries to address the above situation so that recovery won't be affected by the application's normal write traffic. The default value of this config is false (meaning writes are allowed during recovery).
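For reference, enabling the new behavior would look roughly like this in hbase-site.xml (a sketch; the property name and its false default are taken from the issue description above):

```xml
<!-- Reject client writes while a region is recovering under
     distributedLogReplay. Default: false (writes allowed in recovery). -->
<property>
  <name>hbase.regionserver.disallow.writes.when.recovering</name>
  <value>true</value>
</property>
```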
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689196#comment-13689196 ] Jean-Marc Spaggiari commented on HBASE-6295: Tests crashed yesterday because of some obscure ZK reasons... So I had to restart them. It should be done now. I will add 0.95 to the list and run it, which means I should have all the results this evening (EST). I will take the required time to provide the feedback today. Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch Today's batch algo is:
{noformat}
for Operation o : ListOp {
  add o to todolist
  if todolist > maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
We could:
- create immediately the final object instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location
It would be:
{noformat}
for Operation o : ListOp {
  get location
  add o to location.todolist
  if (location.todolist > maxLocationSize)
    send location.todolist to region server
    clear location.todolist
  // don't wait, continue the loop
}
send remaining
wait
{noformat}
It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes.
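The proposed loop can be sketched as follows (a simplified, HBase-free illustration; `LocationBatcher` and `locate()` are hypothetical stand-ins for the client's region lookup and RPC send):

```java
import java.util.*;

/** Sketch of per-location batching: a buffer is sent as soon as ONE
 *  location fills up, instead of waiting for the global list to fill. */
public class LocationBatcher {
    private final int maxLocationSize;
    private final Map<String, List<String>> perLocation = new HashMap<>();
    private final List<List<String>> sent = new ArrayList<>(); // stands in for RPCs

    public LocationBatcher(int maxLocationSize) { this.maxLocationSize = maxLocationSize; }

    /** Stand-in for the region lookup; here: first character of the row key. */
    private String locate(String op) { return op.substring(0, 1); }

    public void add(String op) {
        List<String> buf = perLocation.computeIfAbsent(locate(op), k -> new ArrayList<>());
        buf.add(op);
        if (buf.size() >= maxLocationSize) {  // one location is full: send, don't wait
            sent.add(new ArrayList<>(buf));
            buf.clear();
        }
    }

    /** "send remaining" step at the end of the loop. */
    public void flushRemaining() {
        for (List<String> buf : perLocation.values())
            if (!buf.isEmpty()) { sent.add(new ArrayList<>(buf)); buf.clear(); }
    }

    public List<List<String>> sentBatches() { return sent; }
}
```

The real client would additionally share a retry list with the still-buffered operations, as the comment notes; that error handling is omitted here.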
[jira] [Commented] (HBASE-8617) Introducing a new config to disable writes during recovering
[ https://issues.apache.org/jira/browse/HBASE-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689227#comment-13689227 ] Hudson commented on HBASE-8617: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #574 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/574/]) HBASE-8617: Introducing a new config to disable writes during recovering (Revision 1494804) Result = FAILURE jeffreyz : Files : * /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java Introducing a new config to disable writes during recovering - Key: HBASE-8617 URL: https://issues.apache.org/jira/browse/HBASE-8617 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.98.0, 0.95.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: HBASE-8617.patch, HBASE-8617-v2.patch, hbase-8617-v3.patch In distributedLogReplay (HBASE-7006), we allow writes even when a region is in recovering. This may cause undesired behavior when applications (or deployments) are already near their write capacity, because distributedLogReplay generates more write traffic to the remaining region servers. The new config hbase.regionserver.disallow.writes.when.recovering tries to address the above situation so that recovery won't be affected by the application's normal write traffic. The default value of this config is false (meaning writes are allowed during recovery).
[jira] [Commented] (HBASE-8753) Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp
[ https://issues.apache.org/jira/browse/HBASE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689229#comment-13689229 ] Feng Honghua commented on HBASE-8753: - [~lhofhansl] For backwards compatibility, when an old RS processes a DeleteFamilyVersion type kv (either written from a new client, or in the two scenarios you mentioned regarding rolling restart), the DeleteFamilyVersion can enter ScanDeleteTracker, and the only effect it has is that, when there is no DeleteColumn for the null column with the same timestamp as this DeleteFamilyVersion, it can delete the KV (column=null) with the same timestamp (a bit like a Delete(DeleteVersion) with the same timestamp); it has no other side effect. In summary: DeleteFamilyVersion masks all the versions with a given timestamp under a CF, and when an old RS receives it (written from a new client, or in the two scenarios mentioned regarding rolling restart), the old RS treats it like a Delete(DeleteVersion) for the null column. Nothing else. I think this side effect is acceptable. Your opinion? Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp --- Key: HBASE-8753 URL: https://issues.apache.org/jira/browse/HBASE-8753 Project: HBase Issue Type: New Feature Components: Deletes Reporter: Feng Honghua Attachments: HBASE-8753-0.94-V0.patch, HBASE-8753-trunk-V0.patch In one of our production scenarios (Xiaomi message search), multiple cells are put in batch using the same timestamp, with different column names, under a specific column-family. After some time these cells also need to be deleted in batch, given a specific timestamp. But the column names are parsed tokens which can be arbitrary words, so such a batch delete is impossible without first retrieving all KVs from that CF, building the list of columns that have a KV with that given timestamp, and then issuing an individual deleteColumn for each column in that list.
Though it's possible to do such a batch delete, its performance is poor, and customers also find their code quite clumsy: first retrieving and populating the column list, then issuing a deleteColumn for each column in that list. This feature resolves the problem by introducing a new delete flag: DeleteFamilyVersion. 1) When you need to delete all KVs under a column-family with a given timestamp, just call Delete.deleteFamilyVersion(cfName, timestamp); only a DeleteFamilyVersion type KV is put to HBase (like DeleteFamily / DeleteColumn / Delete), without any read operation. 2) Like other delete types, DeleteFamilyVersion takes effect in get/scan/flush/compact operations; the ScanDeleteTracker now parses out and uses DeleteFamilyVersion to prevent all KVs under the specific CF that have the same timestamp as the DeleteFamilyVersion KV from popping up as part of a get/scan result (and likewise in flush/compact). Our customers find this feature efficient, clean and easy to use, since it does its work without knowing the exact list of column names that need to be deleted. This feature has been running smoothly for a couple of months in our production clusters.
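The effect of the proposed marker can be illustrated with a tiny model (plain Java, illustrative names only, not the actual patch): a single DeleteFamilyVersion marker masks every column's cell in the family that carries exactly the marker's timestamp, with no prior read and no per-column markers.

```java
import java.util.*;

/** Toy model of the proposed DeleteFamilyVersion marker (hypothetical
 *  names, not the HBase API): one marker per timestamp masks every
 *  column's cell in the family at exactly that timestamp. */
public class DeleteFamilyVersionModel {
    private static final class KV {
        final String column; final long ts; final String value;
        KV(String column, long ts, String value) {
            this.column = column; this.ts = ts; this.value = value;
        }
    }

    private final List<KV> kvs = new ArrayList<>();
    private final Set<Long> familyVersionDeletes = new HashSet<>();

    public void put(String column, long ts, String value) {
        kvs.add(new KV(column, ts, value));
    }

    /** No read, no column list: a single marker covers all columns. */
    public void deleteFamilyVersion(long ts) { familyVersionDeletes.add(ts); }

    /** What a scan would surface: cells whose exact timestamp is not marked. */
    public List<String> scanValues() {
        List<String> out = new ArrayList<>();
        for (KV kv : kvs)
            if (!familyVersionDeletes.contains(kv.ts)) out.add(kv.value);
        return out;
    }
}
```

This captures why the feature avoids the retrieve-then-deleteColumn round trip: the arbitrary token column names never need to be enumerated.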
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689305#comment-13689305 ] stack commented on HBASE-8755: -- [~jmspaggi] You want to set up a rig to test this one? A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch In the current write model, each write handler thread (executing put()) will individually go through a full 'append (hlog local buffer) => HLog writer append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy race conditions on updateLock and flushLock. The only optimization — checking whether the current syncTillHere >= txid, in the expectation that another thread has already written/synced this txid to hdfs, and then omitting the write/sync — actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X improvement in throughput (from 17000 to 7+). I applied this new write thread model in HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row-size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput then was 31002). I can provide the detailed performance test results if anyone is interested.
The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer (and notify the AsyncWriter thread that there are new edits in the local buffer);
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid;
3. A single AsyncWriter thread is responsible for retrieving all the buffered edits in HLog's local pending buffer and writing them to hdfs (hlog.writer.append), then notifying the AsyncFlusher thread that there are new writes to hdfs that need a sync;
4. A single AsyncFlusher thread is responsible for issuing a sync to hdfs to persist the writes made by AsyncWriter (and notifying the AsyncNotifier thread that the sync watermark has increased);
5. A single AsyncNotifier thread is responsible for notifying all pending put handler threads that are waiting in the HLog.syncer() function;
6. No LogSyncer thread any more (the AsyncWriter/AsyncFlusher threads always do the job it did).
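The steps above can be condensed into a runnable sketch (plain Java, no HDFS; all names are illustrative, and the AsyncWriter, AsyncFlusher, and AsyncNotifier stages are folded into a single background thread here — the real patch keeps them separate):

```java
import java.util.*;
import java.util.concurrent.atomic.AtomicLong;

/** Minimal sketch of the group-commit pipeline: handlers append to a local
 *  buffer and block; one background thread drains the buffer ("append"),
 *  "syncs", advances the watermark, and wakes the waiting handlers. */
public class AsyncWalSketch {
    private final List<Long> buffer = new ArrayList<>();   // pending txids
    private final Object bufferLock = new Object();
    private final Object syncLock = new Object();
    private final AtomicLong nextTxid = new AtomicLong();
    private volatile long syncedTillHere = 0;
    private volatile boolean running = true;

    private final Thread background = new Thread(() -> {
        while (running) {
            long highest;
            synchronized (bufferLock) {
                while (buffer.isEmpty() && running) {
                    try { bufferLock.wait(); } catch (InterruptedException e) { return; }
                }
                if (!running) return;
                highest = Collections.max(buffer); // writer.append() would go here
                buffer.clear();
            }
            // hdfs sync would go here; then advance the watermark and notify
            synchronized (syncLock) {
                syncedTillHere = Math.max(syncedTillHere, highest);
                syncLock.notifyAll();              // the AsyncNotifier's job
            }
        }
    });

    public AsyncWalSketch() { background.start(); }

    /** Called by put-handler threads: append an edit, then block until the
     *  sync covering its txid has completed (steps 1 and 2 above). */
    public long append() {
        long txid = nextTxid.incrementAndGet();
        synchronized (bufferLock) { buffer.add(txid); bufferLock.notify(); }
        synchronized (syncLock) {
            while (syncedTillHere < txid) {
                try { syncLock.wait(); }
                catch (InterruptedException e) { throw new RuntimeException(e); }
            }
        }
        return txid;
    }

    public void shutdown() {
        running = false;
        synchronized (bufferLock) { bufferLock.notify(); }
        background.interrupt();
    }
}
```

The point of the design is that many handler txids ride on one append+sync, so the per-write lock contention of the old model disappears.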
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689306#comment-13689306 ] stack commented on HBASE-8667: -- [~rajesh23] v5 still has this:
{code}
-rpcClient = new RpcClient(conf, clusterId);
+rpcClient = new RpcClient(conf, clusterId, new InetSocketAddress(
+    this.isa.getAddress(), 0));
{code}
Can you not do rpcClient = new RpcClient(conf, clusterId, this.isa)? Thanks for doing this fixup. Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. I have two interfaces, 1) lo and 2) eth0, on my machine, and the default hostname interface is lo. I configured the master ipc address to the ip of the eth0 interface, then started the master and a regionserver on the same machine. 1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo. 2) Since the rpc client is not binding to any ip address, when the RS reports its startup it gets registered with the eth0 ip address (but actually it should register localhost). Here are the RS logs: {code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851
2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100
{code}
Here are the master logs:
{code}
2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544
{code}
Since the master has the wrong rpc server address of the RS, META is not getting assigned.
{code}
2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false
- org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
  at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813)
  at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422)
  at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315)
  at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
  at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587)
  at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
  at
[jira] [Commented] (HBASE-8701) distributedLogReplay need to apply wal edits in the receiving order of those edits
[ https://issues.apache.org/jira/browse/HBASE-8701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689316#comment-13689316 ] stack commented on HBASE-8701: -- bq. The sequence ids of hfile are intact as before. But some can be -ve? So they will be out of order? (I don't see special handling in v7 -- I may have missed it). Thanks. distributedLogReplay need to apply wal edits in the receiving order of those edits -- Key: HBASE-8701 URL: https://issues.apache.org/jira/browse/HBASE-8701 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8701-v3.txt, hbase-8701-v4.patch, hbase-8701-v5.patch, hbase-8701-v6.patch, hbase-8701-v7.patch This issue happens in distributedLogReplay mode when recovering multiple puts of the same key + version (timestamp). After replay, the value of the key is nondeterministic.
h5. The original concern situation raised by [~eclark]:
For all edits the rowkey is the same. There's a log with: [ A (ts = 0), B (ts = 0) ]
Replay the first half of the log.
A user puts in C (ts = 0).
Memstore has to flush.
A new Hfile will be created with [ C, A ] and MaxSequenceId = C's seqid.
Replay the rest of the log.
Flush.
The issue will happen in similar situations, like Put(key, t=T) in WAL1 and Put(key, t=T) in WAL2.
h5. Below is the option (proposed by Ted) I'd like to use:
a) During replay, we pass the original wal sequence number of each edit to the receiving RS
b) In the receiving RS, we store the negative original sequence number of wal edits in the mvcc field of the KVs of wal edits
c) Add handling of negative MVCC in KVScannerComparator and KVComparator
d) In the receiving RS, write the original sequence number into an optional field of the wal file for the chained RS failure situation
e) When opening a region, we add a safety bumper (a large number) so that the new sequence number of a newly opened region does not collide with old sequence numbers.
In the future, when we store sequence numbers along with KVs, we can adjust the above solution a little by avoiding overloading the MVCC field.
h5. The other alternative options are listed below for reference:
Option one:
a) disallow writes during recovery
b) during replay, we pass the original wal sequence ids
c) hold flush till all wals of a recovering region are replayed. The memstore should hold, because we only recover unflushed wal edits. For edits with the same key + version, whichever has the larger sequence id wins.
Option two:
a) during replay, we pass the original wal sequence ids
b) for each wal edit, we store the edit's original sequence id along with its key
c) during scanning, we use the original sequence id if it's present, otherwise its store file sequence id
d) compaction can just leave the put with the max sequence id
Please let me know if you have better ideas.
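One plausible reading of steps (b) and (c) — storing negative original sequence numbers in the mvcc field and teaching the comparators about them — is sketched below (hypothetical, not the actual patch): fresh edits (positive mvcc, post safety bump) are considered newer than replayed ones, and replayed edits keep their original WAL order among themselves.

```java
/** Hypothetical sketch of an mvcc comparison rule when negative values
 *  encode original WAL sequence numbers of replayed edits. */
public class MvccCompare {
    /** Returns >0 if a should be considered newer than b, <0 if older. */
    public static int compareMvcc(long a, long b) {
        boolean aReplayed = a < 0, bReplayed = b < 0;
        if (aReplayed != bReplayed) return aReplayed ? -1 : 1; // fresh beats replayed
        if (aReplayed) return Long.compare(-a, -b);            // original WAL order
        return Long.compare(a, b);                             // normal mvcc order
    }
}
```

Under this rule, stack's concern translates to: negative values do sort "out of order" relative to plain longs, which is exactly why the comparators need the special case.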
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689329#comment-13689329 ] rajeshbabu commented on HBASE-8667: --- If we use this.isa directly, we will get a BindException because the rpc server is already bound to that port (60010). Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. I have two interfaces, 1) lo and 2) eth0, on my machine, and the default hostname interface is lo. I configured the master ipc address to the ip of the eth0 interface, then started the master and a regionserver on the same machine. 1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo. 2) Since the rpc client is not binding to any ip address, when the RS reports its startup it gets registered with the eth0 ip address (but actually it should register localhost). Here are the RS logs: {code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851
2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100
{code}
Here are the master logs:
{code}
2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544
{code}
Since the master has the wrong rpc server address of the RS, META is not getting assigned.
{code}
2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false
- org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
  at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813)
  at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422)
  at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315)
  at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
  at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587)
  at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
  at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627)
  at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826)
  at
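The core idea of the fix — bind the client socket to the server's own interface but an ephemeral port, so the peer sees the right address — can be illustrated with plain java.net (a hypothetical helper, not HBase code; the actual patch passes the address into RpcClient):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.*;

/** Illustrative helper: bind an outgoing socket to a chosen local interface
 *  before connecting, so the remote peer sees that interface's address.
 *  Port 0 picks an ephemeral port, which avoids the BindException you would
 *  get by reusing the server's already-bound port. */
public class BoundClientSocket {
    public static Socket connect(InetAddress localIface,
                                 InetSocketAddress remote, int timeoutMs) {
        try {
            Socket s = new Socket();
            s.bind(new InetSocketAddress(localIface, 0)); // interface fixed, port ephemeral
            s.connect(remote, timeoutMs);
            return s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This mirrors rajeshbabu's point above: passing this.isa as-is would try to rebind the server's port, hence the `new InetSocketAddress(this.isa.getAddress(), 0)` form in the patch.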
[jira] [Commented] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689404#comment-13689404 ] Jimmy Xiang commented on HBASE-8627: For #deleteMetaRegion, do you plan to use the last two parameters? HBCK can not fix meta not assigned issue Key: HBASE-8627 URL: https://issues.apache.org/jira/browse/HBASE-8627 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-8627_Trunk.patch, HBASE-8627_Trunk-V2.patch When the meta table region is not assigned to any RS, an HBCK run will get an exception. I can see code added in checkMetaRegion() to solve this issue, but it won't work. It still refers to the ROOT region!
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689417#comment-13689417 ] Francis Liu commented on HBASE-8015: [~saint@gmail.com], I'm leaning towards having the migration operation done manually by calling a script as well. Which options do we provide the user? Also it might be better if the script is portable enough that it can run on an existing 0.94 cluster, so users don't have to find out during the actual upgrade process. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689418#comment-13689418 ] stack commented on HBASE-8667: -- Ok. Makes sense. I am up for trying it. Thanks [~rajesh23]. Anyone else want to take a look? Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. -- Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. My machine has two interfaces, 1) lo and 2) eth0, and the default hostname resolves to lo. I configured the master ipc address to the ip of the eth0 interface and started a master and a regionserver on the same machine. 1) the master rpc server is bound to eth0 and the RS rpc server is bound to lo 2) since the rpc client does not bind to any ip address, when the RS reports in at startup it gets registered with the eth0 ip address (but it should actually register localhost). Here are the RS logs:
{code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851
2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100
{code}
Here are the master logs:
{code}
2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544
{code}
Since the master has the wrong rpc server address for the RS, META is not getting assigned.
{code}
2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false
org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
  at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549)
  at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813)
  at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422)
  at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315)
  at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
  at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587)
  at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
  at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627)
  at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826)
  at
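The root cause described above, namely an rpc client that does not bind to a specific local interface, so the server registers whatever source address the OS picks, can be illustrated with a plain-socket sketch. This is standalone JDK code, not the HBase RpcClient; all names here are illustrative:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BoundClient {
    // Bind the client socket to a chosen local interface before connecting,
    // so the peer sees the intended source address rather than the OS default.
    public static String connectFrom(InetAddress local, InetSocketAddress server) throws Exception {
        try (Socket s = new Socket()) {
            s.bind(new InetSocketAddress(local, 0)); // pick the interface explicitly
            s.connect(server, 1000);
            return s.getLocalAddress().getHostAddress(); // the address the server sees
        }
    }

    // Self-contained demo against a loopback server socket.
    public static boolean demo() {
        InetAddress lo = InetAddress.getLoopbackAddress();
        try (ServerSocket ss = new ServerSocket(0, 1, lo)) {
            String seen = connectFrom(lo, new InetSocketAddress(lo, ss.getLocalPort()));
            return seen.equals(lo.getHostAddress());
        } catch (Exception e) {
            return false;
        }
    }
}
```

Binding before connecting is one way to address the mismatch: if the RS's rpc client bound to the same interface its own rpc server listens on, the master would register the correct address.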
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689419#comment-13689419 ] Francis Liu commented on HBASE-8015: I thought of a way to implement [~sershe]'s idea. It was simple enough so I thought I'd give it a try. Essentially, keep an in-memory list of tables which make use of the delimiter (ie '.') and consider these tables exceptions to the namespace rule, handling them properly to make sure they stay part of the default namespace. Have an added constraint that prevents creation of namespaces and tables that would conflict with any of the exception tables (ie ns1 and ns1.foo). The surprises here are:
- you can no longer create tables with the delimiter unless you create the appropriate namespace
- you can't create tables/namespaces which conflict with the exception tables/namespaces
- the exception list is derived by scanning the default namespace directories in .tmp, .data and .archive
Here's a sample of how it works. I've updated the TestNamespaceUpgrade test to verify that it works: https://github.com/francisliu/hbase_namespace/tree/core_8408_exception_list Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
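The exception-list constraint described in the comment above can be sketched roughly as follows. The class and method names are hypothetical, not the actual patch:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: pre-namespace tables containing the '.' delimiter stay
// in the default namespace, and new tables/namespaces must not conflict with them.
public class NamespaceRules {
    private final Set<String> exceptionTables = new HashSet<>(); // e.g. legacy "ns1.foo"

    public void addExceptionTable(String name) { exceptionTables.add(name); }

    // A new namespace conflicts if some legacy table starts with "<ns>."
    public boolean namespaceConflicts(String ns) {
        for (String t : exceptionTables) {
            if (t.startsWith(ns + ".")) return true;
        }
        return false;
    }

    // A new fully-qualified table conflicts if it collides with a legacy name.
    public boolean tableConflicts(String fqTableName) {
        return exceptionTables.contains(fqTableName);
    }
}
```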
[jira] [Commented] (HBASE-8060) Num compacting KVs diverges from num compacted KVs over time
[ https://issues.apache.org/jira/browse/HBASE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689424#comment-13689424 ] Sergey Shelukhin commented on HBASE-8060: - The meaning of total compacting KVs is the number of KVs in the input for the last-started compaction (this historically assumes one compaction at a time per store), i.e. either the current compaction or the last finished one. It is an estimate of the number of KVs the compaction will see, used to track progress. The patch does change its meaning, in order to reconcile the (often incorrect) estimate with reality. Num compacting KVs diverges from num compacted KVs over time Key: HBASE-8060 URL: https://issues.apache.org/jira/browse/HBASE-8060 Project: HBase Issue Type: Bug Components: Compaction, UI Affects Versions: 0.94.6, 0.95.0, 0.95.2 Reporter: Andrew Purtell Assignee: Sergey Shelukhin Attachments: HBASE-8060-v0.patch, screenshot.png I have been running what amounts to an ingestion test for a day or so. This is an all-in-one cluster launched with './bin/hbase master start' from sources. In the RS stats on the master UI, the num compacting KVs has diverged from num compacted KVs even though compaction has been completed from the perspective of selection and no compaction tasks are running on the RS. I think this could be confusing -- is compaction happening or not? Or maybe I'm misunderstanding what this is supposed to show? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689435#comment-13689435 ] Lars Hofhansl commented on HBASE-8721: -- KEEP_DELETED_CELLS would still work fine, but its main goal is to allow correct point-in-time queries, which among other things is important for consistent backups. Regarding all the points above: let's please not go overboard. Now we're extending this to Puts as well, and are saying that a Put that hits the RegionServer later should be considered newer even if its TS is old; this opens another can of worms. It is unlikely that this will be changed, as you have to find committers to +1 this. All we got up to this point are a -1 unless it is configurable and a couple of -0s. Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims for the bug mentioned in http://hbase.apache.org/book.html 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even if it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do delete and put immediately after each other, and there is some chance they happen within the same millisecond. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
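The masking behavior quoted from the reference guide can be modeled in a few lines. This is a toy single-cell model, not HBase code; here majorCompact() drops both the masked cells and the tombstone:

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of "deletes mask puts": a tombstone at time T hides any put with
// timestamp <= T until the tombstone is dropped by a major compaction.
public class TombstoneModel {
    private final TreeMap<Long, String> puts = new TreeMap<>(); // ts -> value
    private long deleteWatermark = Long.MIN_VALUE;              // tombstone ts

    public void put(long ts, String value) { puts.put(ts, value); }

    public void deleteUpTo(long ts) { deleteWatermark = Math.max(deleteWatermark, ts); }

    public String get() {
        Map.Entry<Long, String> newest = puts.lastEntry();
        if (newest == null || newest.getKey() <= deleteWatermark) return null; // masked
        return newest.getValue();
    }

    public void majorCompact() {
        puts.headMap(deleteWatermark, true).clear(); // drop masked cells...
        deleteWatermark = Long.MIN_VALUE;            // ...and the tombstone itself
    }
}
```

Note how a put that arrives after the delete but carries the same timestamp reads back as null until the compaction runs, which is exactly the surprise the JIRA describes.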
[jira] [Commented] (HBASE-8771) ensure replication_scope's value is either local(0) or global(1)
[ https://issues.apache.org/jira/browse/HBASE-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689439#comment-13689439 ] Demai Ni commented on HBASE-8771: - [~ctrezzo], sorry that I didn't explain clearly the first time. Although setScope() is currently only called in the column descriptor constructor, it is a public method, so a user can do this:
{code}
...
HTableDescriptor ht = new HTableDescriptor("t3_dn");
HColumnDescriptor cfd = new HColumnDescriptor("cf1");
cfd.setScope(-1000);
ht.addFamily(cfd);
...
{code}
So if the checking is put inside the constructor (similar to the logic for minVersions and maxVersions), the above code will not be caught. ensure replication_scope's value is either local(0) or global(1) Key: HBASE-8771 URL: https://issues.apache.org/jira/browse/HBASE-8771 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.8 Reporter: Demai Ni Priority: Minor Fix For: 0.94.9 Attachments: HBASE-8771-0.94.8-v0.patch For replication_scope, only two values are meaningful:
{code}
public static final int REPLICATION_SCOPE_LOCAL = 0;
public static final int REPLICATION_SCOPE_GLOBAL = 1;
{code}
However, there is no checking for that, so currently a user can set it to any integer value, and all non-zero values will be treated as 1(GLOBAL). This jira is to add a check in HColumnDescriptor#setScope() so that only 0 and 1 will be accepted during create_table or alter_table. In the future, we can leverage replication_scope to store more info. For example: -1: a columnfam is replicated from another cluster in a MASTER_SLAVE setup (i.e. readonly); 2: a columnfam is set MASTER_MASTER. Probably a major improvement JIRA is needed for the future usage. It will be good to ensure the scope value at this moment. 
{code:title=Testing|borderStyle=solid}
hbase(main):002:0> create 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>2}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
hbase(main):004:0> alter 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>-1}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
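A standalone sketch of the kind of check this JIRA adds to HColumnDescriptor#setScope(). This is illustrative only; the real method sets a field on the descriptor rather than returning the value:

```java
// Sketch of the replication-scope validation described in the JIRA.
public class ScopeCheck {
    public static final int REPLICATION_SCOPE_LOCAL = 0;
    public static final int REPLICATION_SCOPE_GLOBAL = 1;

    public static int checkScope(int scope) {
        if (scope != REPLICATION_SCOPE_LOCAL && scope != REPLICATION_SCOPE_GLOBAL) {
            throw new IllegalArgumentException(
                "Replication Scope must be either 0(local) or 1(global)");
        }
        return scope;
    }

    // Helper for demonstration: does checkScope reject this value?
    public static boolean rejects(int scope) {
        try { checkScope(scope); return false; }
        catch (IllegalArgumentException e) { return true; }
    }
}
```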
[jira] [Commented] (HBASE-8753) Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp
[ https://issues.apache.org/jira/browse/HBASE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689441#comment-13689441 ] Lars Hofhansl commented on HBASE-8753: -- May you not run into this case in isDeleted:
{code}
int ret = Bytes.compareTo(deleteBuffer, deleteOffset, deleteLength,
    buffer, qualifierOffset, qualifierLength);
if (ret == 0) {
  ...
} else if (ret > 0) {
  ...
} else {
  throw new IllegalStateException("isDelete failed: deleteBuffer=" ...
{code}
In any case, we should just test it: write an HFile with a new RS, start an old RS, scan that file, and check that it works fine. Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp --- Key: HBASE-8753 URL: https://issues.apache.org/jira/browse/HBASE-8753 Project: HBase Issue Type: New Feature Components: Deletes Reporter: Feng Honghua Attachments: HBASE-8753-0.94-V0.patch, HBASE-8753-trunk-V0.patch In one of our production scenarios (Xiaomi message search), multiple cells are put in batch using the same timestamp with different column names under a specific column-family. After some time these cells also need to be deleted in batch, given a specific timestamp. But the column names are parsed tokens which can be arbitrary words, so such a batch delete is impossible without first retrieving all KVs from that CF to get the list of columns which have a KV with the given timestamp, and then issuing an individual deleteColumn for each column in that list. Though it's possible to do such a batch delete, its performance is poor, and customers also find their code quite clumsy: first retrieving and populating the column list, then issuing a deleteColumn for each column in it. This feature resolves this problem by introducing a new delete flag: DeleteFamilyVersion.
1) When you need to delete all KVs under a column-family with a given timestamp, just call Delete.deleteFamilyVersion(cfName, timestamp); only a DeleteFamilyVersion type KV is put to HBase (like DeleteFamily / DeleteColumn / Delete) without a read operation;
2) Like other delete types, DeleteFamilyVersion takes effect in get/scan/flush/compact operations; the ScanDeleteTracker now parses out and uses DeleteFamilyVersion to prevent all KVs under the specific CF which have the same timestamp as the DeleteFamilyVersion KV from popping up as part of a get/scan result (and likewise in flush/compact).
Our customers find this feature efficient, clean and easy-to-use, since it does its work without knowing the exact list of column names that need to be deleted. This feature has been running smoothly for a couple of months in our production clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
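The DeleteFamilyVersion semantics described above can be sketched with a toy tracker. The names here are illustrative; the real logic lives in ScanDeleteTracker:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy sketch of DeleteFamilyVersion: one marker per (family, timestamp)
// suppresses every column in that family carrying exactly that timestamp.
public class FamilyVersionTracker {
    private final Map<String, Set<Long>> deleted = new HashMap<>(); // family -> deleted timestamps

    public void deleteFamilyVersion(String family, long ts) {
        deleted.computeIfAbsent(family, f -> new HashSet<>()).add(ts);
    }

    // Would this cell be hidden from a get/scan (and dropped by flush/compact)?
    public boolean isDeleted(String family, String qualifier, long ts) {
        Set<Long> tss = deleted.get(family);
        return tss != null && tss.contains(ts); // any qualifier, exact timestamp
    }
}
```

The point of the feature shows up in the signature: isDeleted never consults the qualifier, so arbitrary token columns are covered by a single marker.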
[jira] [Commented] (HBASE-8060) Num compacting KVs diverges from num compacted KVs over time
[ https://issues.apache.org/jira/browse/HBASE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689442#comment-13689442 ] Lars Hofhansl commented on HBASE-8060: -- I see. So it does not really change the meaning - it's still the number of KVs compacted in the last compaction - it just corrects the value. Num compacting KVs diverges from num compacted KVs over time Key: HBASE-8060 URL: https://issues.apache.org/jira/browse/HBASE-8060 Project: HBase Issue Type: Bug Components: Compaction, UI Affects Versions: 0.94.6, 0.95.0, 0.95.2 Reporter: Andrew Purtell Assignee: Sergey Shelukhin Attachments: HBASE-8060-v0.patch, screenshot.png I have been running what amounts to an ingestion test for a day or so. This is an all-in-one cluster launched with './bin/hbase master start' from sources. In the RS stats on the master UI, the num compacting KVs has diverged from num compacted KVs even though compaction has been completed from the perspective of selection and no compaction tasks are running on the RS. I think this could be confusing -- is compaction happening or not? Or maybe I'm misunderstanding what this is supposed to show? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
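The reconciliation discussed in this thread can be sketched as follows. This is a simplified stand-in, not the real org.apache.hadoop.hbase.regionserver.compactions.CompactionProgress:

```java
// Sketch: "total compacting KVs" is an estimate taken when a compaction
// starts; when it finishes, snap the estimate to the KVs actually seen so the
// two UI counters cannot drift apart.
public class CompactionProgress {
    private long totalCompactingKVs;  // estimate for the last-started compaction
    private long currentCompactedKVs; // KVs processed so far

    public void start(long estimatedKVs) {
        totalCompactingKVs = estimatedKVs;
        currentCompactedKVs = 0;
    }

    public void progress(long kvs) { currentCompactedKVs += kvs; }

    public void complete() {
        // The estimate is often wrong (e.g. deletes, expired cells): reconcile.
        totalCompactingKVs = currentCompactedKVs;
    }

    public long getTotalCompactingKVs() { return totalCompactingKVs; }
    public long getCurrentCompactedKVs() { return currentCompactedKVs; }
}
```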
[jira] [Commented] (HBASE-8771) ensure replication_scope's value is either local(0) or global(1)
[ https://issues.apache.org/jira/browse/HBASE-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689449#comment-13689449 ] Demai Ni commented on HBASE-8771: - [~anoop.hbase], thanks for your comments. Since we plan to use values other than 0 or 1 in the future, it may be better to block other values now to avoid conflicts later. For example, today userA can use the values -1 and 2 for the scope, and the hbase code will treat them as '1' (global replication). Then, if a future JIRA gives the value '2' another meaning, userA will face unexpected replication behavior. With that, it is better to block values such as -1 and 2 earlier to reduce such potential issues. ensure replication_scope's value is either local(0) or global(1) Key: HBASE-8771 URL: https://issues.apache.org/jira/browse/HBASE-8771 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.8 Reporter: Demai Ni Priority: Minor Fix For: 0.94.9 Attachments: HBASE-8771-0.94.8-v0.patch For replication_scope, only two values are meaningful:
{code}
public static final int REPLICATION_SCOPE_LOCAL = 0;
public static final int REPLICATION_SCOPE_GLOBAL = 1;
{code}
However, there is no checking for that, so currently a user can set it to any integer value, and all non-zero values will be treated as 1(GLOBAL). This jira is to add a check in HColumnDescriptor#setScope() so that only 0 and 1 will be accepted during create_table or alter_table. In the future, we can leverage replication_scope to store more info. For example: -1: a columnfam is replicated from another cluster in a MASTER_SLAVE setup (i.e. readonly); 2: a columnfam is set MASTER_MASTER. Probably a major improvement JIRA is needed for the future usage. It will be good to ensure the scope value at this moment.
{code:title=Testing|borderStyle=solid}
hbase(main):002:0> create 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>2}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
hbase(main):004:0> alter 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>-1}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689454#comment-13689454 ] Hamed Madani commented on HBASE-8774: - 'true' is there because the rest of the thrift and thrift2 generated code uses that format. For example
{code}
boolean this_present_columns = true && this.isSetColumns();
boolean that_present_columns = true && that.isSetColumns();
{code}
The HBASE-6073 patch is missing the modification to thrift2.generated.TScan.java and thrift2.generated.TGet.java. Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: New Feature Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Attachments: HBASE_8774.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689468#comment-13689468 ] Francis Liu commented on HBASE-8015: Oops sorry I guess I'm talking about two scripts. One to check if some surprising migration needs to be done and provide links/options. And another that does the actual migration. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689480#comment-13689480 ] Jean-Marc Spaggiari commented on HBASE-6295:
||Test||Trunk||Nic||0.95||
|org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest|761449.8|738362.4|754100|
|org.apache.hadoop.hbase.PerformanceEvaluation$RandomScanWithRange100Test|21858.7|22356.7|22400.7|
|org.apache.hadoop.hbase.PerformanceEvaluation$RandomSeekScanTest|13.6|138179.3|134186.7|
|org.apache.hadoop.hbase.PerformanceEvaluation$RandomWriteTest|114272.9|76990.3|114798.1|
|org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest|77144.275|24582.425|79107.25|
So Trunk and 0.95 are consistent, while Nic's version shows a nice improvement on the write operations (both Random and Sequential), a very small degradation on SeekScan, and also a small improvement on RandomRead. Do you need the IntegrationTestBigLinkedList results for the 3 releases too? Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today the batch algo is:
{noformat}
for Operation o: List<Op> {
  add o to todolist
  if todolist >= maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
We could:
- create immediately the final object instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location
It would be:
{noformat}
for Operation o: List<Op> {
  get location
  add o to location.todolist
  if (location.todolist > maxLocationSize)
    send location.todolist to region server
    clear location.todolist
  // don't wait, continue the loop
}
send remaining
wait
{noformat}
It's not trivial to write once you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
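The per-location variant from the {noformat} pseudocode could look like this in plain Java. It is illustrative only, not the actual HBase client code, and it omits the error management the comment warns about:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Sketch of per-location batching: operations are grouped by region location
// as they arrive, and a location's buffer is sent as soon as it is full,
// instead of splitting one big list at the end.
public class PerLocationBatcher<Op> {
    private final int maxPerLocation;
    private final BiConsumer<String, List<Op>> sender; // location -> batch
    private final Map<String, List<Op>> buffers = new HashMap<>();

    public PerLocationBatcher(int maxPerLocation, BiConsumer<String, List<Op>> sender) {
        this.maxPerLocation = maxPerLocation;
        this.sender = sender;
    }

    public void add(String location, Op op) {
        List<Op> buf = buffers.computeIfAbsent(location, l -> new ArrayList<>());
        buf.add(op);
        if (buf.size() >= maxPerLocation) {   // send early; don't wait for the end
            sender.accept(location, new ArrayList<>(buf));
            buf.clear();
        }
    }

    public void flushRemaining() {            // the final "send remaining"
        buffers.forEach((loc, buf) -> {
            if (!buf.isEmpty()) sender.accept(loc, new ArrayList<>(buf));
        });
        buffers.clear();
    }
}
```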
[jira] [Created] (HBASE-8775) Throttle online schema changes.
Shane Hogan created HBASE-8775: -- Summary: Throttle online schema changes. Key: HBASE-8775 URL: https://issues.apache.org/jira/browse/HBASE-8775 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.89-fb Reporter: Shane Hogan Priority: Minor Fix For: 0.89-fb Throttle the open and close of the regions after an online schema change -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689492#comment-13689492 ] Jean-Marc Spaggiari commented on HBASE-8755: Sure! Let me prepare that. I will read this JIRA from the beginning and try to start the tests today. A new write thread model for HLog to improve the overall HBase write throughput --- Key: HBASE-8755 URL: https://issues.apache.org/jira/browse/HBASE-8755 Project: HBase Issue Type: Improvement Components: wal Reporter: Feng Honghua Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch In the current write model, each write handler thread (executing put()) individually goes through a full 'append (hlog local buffer) => HLog writer append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy race conditions on updateLock and flushLock. The only optimization -- checking the current syncTillHere txid in the expectation that another thread will help write/sync its own txid to hdfs, and omitting the write/sync -- actually helps much less than expected. Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence files, and the prototype implementation shows a 4X improvement in throughput (from 17000 to 7+). I applied this new write thread model to HLog, and the performance test in our test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 7 for 5 RS); the 1 RS write throughput (1K row-size) even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's write throughput then was 31002). I can provide the detailed performance test results if anyone is interested.
The change for the new write thread model is as below:
1. All put handler threads append their edits to HLog's local pending buffer; (each notifies the AsyncWriter thread that there are new edits in the local buffer)
2. All put handler threads wait in the HLog.syncer() function for the underlying threads to finish the sync that contains their txid;
3. A single AsyncWriter thread is responsible for retrieving all the buffered edits in HLog's local pending buffer and writing them to hdfs (hlog.writer.append); (it notifies the AsyncFlusher thread that there are new writes to hdfs that need a sync)
4. A single AsyncFlusher thread is responsible for issuing a sync to hdfs to persist the writes made by the AsyncWriter; (it notifies the AsyncNotifier thread that the sync watermark has increased)
5. A single AsyncNotifier thread is responsible for notifying all pending put handler threads which are waiting in the HLog.syncer() function;
6. No LogSyncer thread any more (since the AsyncWriter/AsyncFlusher threads always do the same job it did)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
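A heavily simplified sketch of the model described in points 1-5, collapsing AsyncWriter/AsyncFlusher/AsyncNotifier into one background thread. Everything here is illustrative, not the patch:

```java
import java.util.ArrayList;
import java.util.List;

// Handlers append edits to a local pending buffer and wait on a sync
// watermark; a single background thread drains the buffer, "writes and
// syncs" it, and notifies waiters.
public class MiniAsyncLog {
    private final List<String> pending = new ArrayList<>(); // local edit buffer
    private final List<String> durable = new ArrayList<>(); // "on hdfs and synced"
    private long lastTxid = 0;   // txid handed to the most recent append
    private long syncedTxid = 0; // watermark: all txids <= this are durable
    private volatile boolean running = true;

    public synchronized long append(String edit) {
        pending.add(edit);
        notifyAll();             // wake the writer thread
        return ++lastTxid;
    }

    public synchronized void syncer(long txid) throws InterruptedException {
        while (syncedTxid < txid) wait(); // block until our edit is durable
    }

    public void runWriter() throws InterruptedException { // background thread body
        while (running) {
            List<String> batch;
            synchronized (this) {
                while (running && pending.isEmpty()) wait();
                if (!running) return;
                batch = new ArrayList<>(pending);
                pending.clear();
            }
            // (real code: writer.append(batch) then writer.sync())
            synchronized (this) {
                durable.addAll(batch);
                syncedTxid += batch.size(); // one txid per edit in this sketch
                notifyAll();                // wake handlers waiting in syncer()
            }
        }
    }

    public synchronized void shutdown() { running = false; notifyAll(); }
    public synchronized int durableCount() { return durable.size(); }

    // Self-contained demo: two appends, wait for the second txid to be synced.
    public static int demo() {
        MiniAsyncLog log = new MiniAsyncLog();
        Thread writer = new Thread(() -> {
            try { log.runWriter(); } catch (InterruptedException e) { }
        });
        writer.start();
        try {
            log.append("edit-1");
            long t2 = log.append("edit-2");
            log.syncer(t2);   // returns only once both edits are durable
            log.shutdown();
            writer.join();
        } catch (InterruptedException e) {
            return -1;
        }
        return log.durableCount();
    }
}
```

The payoff claimed above comes from batching: many handler threads block cheaply on the watermark while one thread does the append/sync work, instead of every handler contending for the write path.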
[jira] [Commented] (HBASE-8771) ensure replication_scope's value is either local(0) or global(1)
[ https://issues.apache.org/jira/browse/HBASE-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689489#comment-13689489 ] Demai Ni commented on HBASE-8771: - [~ctrezzo], another interesting finding while I was playing with different approaches (shell vs constructor). Using maxVersions as an example, which has code like the below to check the value:
{code}
if (maxVersions <= 0) {
  // TODO: Allow maxVersion of 0 to be the way you say "Keep all versions".
  // Until there is support, consider 0 or < 0 -- a configuration error.
  throw new IllegalArgumentException("Maximum versions must be positive");
}
{code}
The above code can catch the illegal arg only when the user calls the HColumnDescriptor constructor, but it won't work in the hbase shell or when calling setMaxVersions() directly.
{code:title=set Max Version = -1 in Shell. No Error thrown because the shell calls setMaxVersions directly}
hbase(main):016:0> create 't5_dn',{NAME=>'cf1',VERSIONS=>-1}
0 row(s) in 1.0420 seconds
hbase(main):017:0> put 't5_dn','row1','cf1:q1','row1cf1_v1'
0 row(s) in 0.0700 seconds
hbase(main):018:0> scan 't5_dn'
ROW COLUMN+CELL
0 row(s) in 0.0090 seconds
hbase(main):019:0> describe 't5_dn'
DESCRIPTION ENABLED
't5_dn', {NAME => 'cf1', VERSIONS => '-1',...}
{code}
{code:title=set Max Version = -999 through constructor. Error caught inside}
HTableDescriptor ht = new HTableDescriptor("t3_dn");
HColumnDescriptor cfd = new HColumnDescriptor(Bytes.toBytes("cf1"), -999, "NONE", false, false, 100, "NONE");
...
Exception in thread "main" java.lang.IllegalArgumentException: Maximum versions must be positive
  at org.apache.hadoop.hbase.HColumnDescriptor.<init>(HColumnDescriptor.java:386)
  at org.apache.hadoop.hbase.HColumnDescriptor.<init>(HColumnDescriptor.java:334)
  at org.apache.hadoop.hbase.HColumnDescriptor.<init>(HColumnDescriptor.java:302)
  at CreateTable_version.main(CreateTable_version.java:23)
{code}
ensure replication_scope's value is either local(0) or global(1) Key: HBASE-8771 URL: https://issues.apache.org/jira/browse/HBASE-8771 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.8 Reporter: Demai Ni Priority: Minor Fix For: 0.94.9 Attachments: HBASE-8771-0.94.8-v0.patch For replication_scope, only two values are meaningful:
{code}
public static final int REPLICATION_SCOPE_LOCAL = 0;
public static final int REPLICATION_SCOPE_GLOBAL = 1;
{code}
However, there is no checking for that, so currently a user can set it to any integer value, and all non-zero values will be treated as 1(GLOBAL). This jira is to add a check in HColumnDescriptor#setScope() so that only 0 and 1 will be accepted during create_table or alter_table. In the future, we can leverage replication_scope to store more info. For example: -1: a columnfam is replicated from another cluster in a MASTER_SLAVE setup (i.e. readonly); 2: a columnfam is set MASTER_MASTER. Probably a major improvement JIRA is needed for the future usage. It will be good to ensure the scope value at this moment.
{code:title=Testing|borderStyle=solid}
hbase(main):002:0> create 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>2}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
hbase(main):004:0> alter 't1_dn',{NAME=>'cf1',REPLICATION_SCOPE=>-1}
ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global)
...
{code}
-- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
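The gap Demai and Chris discuss above -- a range check that fires in the constructor but not in the shell or via the setter -- can be sketched in a few lines. This is an illustrative model, not HBase's actual HColumnDescriptor: routing the constructor through the setters makes both the maxVersions check and the proposed replication-scope check apply on every code path. Class and method names here are hypothetical.

```java
// Minimal sketch (not HBase code): constructor delegates to setters so the
// validation runs whether the descriptor is built directly, from the shell,
// or via a later setter call.
public class ColumnFamilySketch {
    public static final int REPLICATION_SCOPE_LOCAL = 0;
    public static final int REPLICATION_SCOPE_GLOBAL = 1;

    private int maxVersions;
    private int scope;

    public ColumnFamilySketch(int maxVersions, int scope) {
        // Delegating means there is a single place where each rule lives.
        setMaxVersions(maxVersions);
        setScope(scope);
    }

    public void setMaxVersions(int maxVersions) {
        if (maxVersions <= 0) {
            throw new IllegalArgumentException("Maximum versions must be positive");
        }
        this.maxVersions = maxVersions;
    }

    public void setScope(int scope) {
        if (scope != REPLICATION_SCOPE_LOCAL && scope != REPLICATION_SCOPE_GLOBAL) {
            throw new IllegalArgumentException(
                "Replication Scope must be either 0(local) or 1(global)");
        }
        this.scope = scope;
    }

    public int getMaxVersions() { return maxVersions; }
    public int getScope() { return scope; }
}
```

With this shape, the shell path (which calls setters directly) and the constructor path reject VERSIONS => -1 and REPLICATION_SCOPE => 2 identically.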
[jira] [Commented] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689495#comment-13689495 ] Sergey Shelukhin commented on HBASE-8627: - lgtm HBCK can not fix meta not assigned issue Key: HBASE-8627 URL: https://issues.apache.org/jira/browse/HBASE-8627 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-8627_Trunk.patch, HBASE-8627_Trunk-V2.patch When the meta table region is not assigned to any RS, an HBCK run will get an exception. I can see code added in checkMetaRegion() to solve this issue but it won't work. It still refers to the ROOT region! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689501#comment-13689501 ] Sergey Shelukhin commented on HBASE-8015: - {code}exceptionNS.add(tableName.getNamespaceAsString()); {code} What is the current thinking on dots in namespaces and names? Presumably one table could prevent the creation of multiple namespaces if dots are allowed in namespace name, which I thought they are Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-8662: --- Status: Open (was: Patch Available) [rest] support impersonation Key: HBASE-8662 URL: https://issues.apache.org/jira/browse/HBASE-8662 Project: HBase Issue Type: Sub-task Components: REST, security Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0 Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-8721. --- Resolution: Won't Fix bq. It is unlikely that this will be changed as you have to find committers to +1 this. All we got up to this point are a -1 unless it is configurable and a couple of -0s. Agreed, resolved as WONTFIX. Interested parties are encouraged to go to the followups HBASE-8763 and HBASE-8770 Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims for the bug mentioned in http://hbase.apache.org/book.html 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even if it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do delete and put immediately after each other, and there is some chance they happen within the same millisecond. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
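The masking behavior HBASE-8721 describes can be reduced to a toy model: a delete tombstone at time T hides every put with a timestamp <= T, regardless of wall-clock ordering, until a major compaction collects the tombstone. This is a simplified sketch of the semantics as stated in the book excerpt, not HBase's actual read path.

```java
// Toy model of HBase delete-tombstone visibility (illustrative only):
// a DeleteFamily marker at deleteTs masks any put with putTs <= deleteTs,
// even if the put was issued *after* the delete.
public class TombstoneModel {
    // Visibility of a put while the tombstone is still present.
    public static boolean isVisible(long putTs, long deleteTs) {
        return putTs > deleteTs;
    }

    // After a major compaction the tombstone is purged, so the put shows
    // again (this is the "starts working again" effect from the book).
    public static boolean isVisibleAfterMajorCompaction(long putTs) {
        return true;
    }
}
```

The same-millisecond race in the description is exactly the putTs == deleteTs case, which the model (like HBase) treats as masked.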
[jira] [Updated] (HBASE-3149) Make flush decisions per column family
[ https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-3149: --- Assignee: (was: Himanshu Vashishtha) I figured that it increases MTTR time. I will probably look into it after we have fixed the MTTR issues of late. Un-assigning it in the meanwhile. Make flush decisions per column family -- Key: HBASE-3149 URL: https://issues.apache.org/jira/browse/HBASE-3149 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Karthik Ranganathan Priority: Critical Fix For: 0.92.3 Today, the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689512#comment-13689512 ] stack commented on HBASE-8015: -- [~toffer] There will be a migration evaluation script that will look for the presence of stuff like hfile v1s -- they must be compacted away before you can upgrade -- and this same step could check table names and, if any w/ a dot are found, list options. This script would be run against a 0.94 install before shutting down for upgrade (Yes, two scripts: a checker, and then a doer). Francis, we should still do the Elliott suggestion even if dot, right? The dot would be for 'external' tools or a useful facility in the shell, but we want namespaces to be first class in the API too. Did you get my review comments up on rb, Francis? On dots in namespace, no, if it simplifies. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8759) Family Delete Markers not getting purged after major compaction
[ https://issues.apache.org/jira/browse/HBASE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689513#comment-13689513 ] James Taylor commented on HBASE-8759: - Thanks, @larsh. Now aren't you supposed to be on vacation! Go drink a liter or two of beer! :-) Family Delete Markers not getting purged after major compaction --- Key: HBASE-8759 URL: https://issues.apache.org/jira/browse/HBASE-8759 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.94.7 Reporter: Mujtaba Chohan Priority: Minor On a table with VERSIONS => '1', KEEP_DELETED_CELLS => 'true', Family Delete Markers do not get purged after a put/delete/major compaction cycle (they keep on incrementing after every put/delete/major compaction). Following is the raw scan output after 10 iterations of put/delete/major compaction. ROW COLUMN+CELL A column=CF:, timestamp=1371512706683, type=DeleteFamily A column=CF:, timestamp=1371512706394, type=DeleteFamily A column=CF:, timestamp=1371512706054, type=DeleteFamily A column=CF:, timestamp=1371512705763, type=DeleteFamily A column=CF:, timestamp=1371512705457, type=DeleteFamily A column=CF:, timestamp=1371512705149, type=DeleteFamily A column=CF:, timestamp=1371512704836, type=DeleteFamily A column=CF:, timestamp=1371512704518, type=DeleteFamily A column=CF:, timestamp=1371512704162, type=DeleteFamily A column=CF:, timestamp=1371512703779, type=DeleteFamily A column=CF:COL, timestamp=1371512706682, value=X [~lhofhansl] Code to repro this issue: http://phoenix-bin.github.io/client/code/delete.java -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3149) Make flush decisions per column family
[ https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3149: - Fix Version/s: (was: 0.92.3) Make flush decisions per column family -- Key: HBASE-3149 URL: https://issues.apache.org/jira/browse/HBASE-3149 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Karthik Ranganathan Priority: Critical Today, the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689519#comment-13689519 ] Nicolas Liochon commented on HBASE-6295: Can I do 2097152 / 79 = 26500 to compare with the performance tests previously described in http://www.spaggiari.org/media/blogs/hbase/pictures/performances_20130321.pdf? Because the performances were better previously (~35k rows / second). Same for 2097152 / 114 = 18396 vs. ~30k Or is it calculated differently? Anyway, thanks a lot for all these great tests. I will commit tomorrow morning my time if there is no objection. Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today the batch algo is: {noformat} for Operation o: List<Op> { add o to todolist if todolist >= maxsize or o last in list split todolist per location send split lists to region servers clear todolist wait } {noformat} We could: - create immediately the final object instead of an intermediate array - split per location immediately - instead of sending when the list as a whole is full, send it when there is enough data for a single location It would be: {noformat} for Operation o: List<Op> { get location add o to location.todolist if (location.todolist >= maxLocationSize) send location.todolist to region server clear location.todolist // don't wait, continue the loop } send remaining wait {noformat} It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. 
It's interesting mainly for 'big' writes -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
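The proposed per-location batching from the pseudocode above can be sketched concretely. This is an illustrative model, not the actual HBase client (class and method names here are invented): operations are grouped by region location as they arrive, and a location's list is flushed as soon as it fills, instead of splitting one big list at the end.

```java
import java.util.*;
import java.util.function.Function;

// Sketch of the per-location batching idea: keep one todolist per region
// location, flush a location the moment it is full, never block the loop.
public class PerLocationBatcher {
    private final Map<String, List<String>> perLocation = new HashMap<>();
    private final List<List<String>> sentBatches = new ArrayList<>();
    private final int maxLocationSize;
    private final Function<String, String> locator; // operation -> location

    public PerLocationBatcher(int maxLocationSize, Function<String, String> locator) {
        this.maxLocationSize = maxLocationSize;
        this.locator = locator;
    }

    public void add(String op) {
        String loc = locator.apply(op);                       // get location
        List<String> todo =
            perLocation.computeIfAbsent(loc, k -> new ArrayList<>());
        todo.add(op);                                         // add to location.todolist
        if (todo.size() >= maxLocationSize) {
            sentBatches.add(new ArrayList<>(todo));           // "send to region server"
            todo.clear();                                     // don't wait, keep looping
        }
    }

    public void flushRemaining() {                            // "send remaining"
        for (List<String> todo : perLocation.values()) {
            if (!todo.isEmpty()) sentBatches.add(new ArrayList<>(todo));
        }
        perLocation.clear();
    }

    public int batchesSent() { return sentBatches.size(); }
}
```

Error management (sharing a retry list with newly added operations) is deliberately omitted; as the comment notes, that is the hard part.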
[jira] [Commented] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689521#comment-13689521 ] Jimmy Xiang commented on HBASE-8662: The AuthFilter in your patch looks very familiar. Got it from Oozie? [rest] support impersonation Key: HBASE-8662 URL: https://issues.apache.org/jira/browse/HBASE-8662 Project: HBase Issue Type: Sub-task Components: REST, security Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0 Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689521#comment-13689521 ] Jimmy Xiang edited comment on HBASE-8662 at 6/20/13 6:59 PM: - The AuthFilter in your patch looks very familiar. Got it from Oozie? Do we need the optionsServlet? was (Author: jxiang): The AuthFilter in your patch looks very familiar. Got it from Oozie? [rest] support impersonation Key: HBASE-8662 URL: https://issues.apache.org/jira/browse/HBASE-8662 Project: HBase Issue Type: Sub-task Components: REST, security Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0 Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689578#comment-13689578 ] Jean-Marc Spaggiari commented on HBASE-6295: It's the time for x lines; depending on the test, it's not the same number of lines. For RandomReadTest you need to divide by 1048576. For RandomScanWithRange100Test you need to divide by 4096. For RandomSeekScanTest you need to divide by 40960. For RandomWriteTest you need to divide by 1048576. For SequentialWriteTest you need to divide by 1048576. This is the number of lines per ms. So multiply by 1000 to have the same result. Some are rows/minute, so just adjust that. So if you want to compare, here are the numbers in the same format as the PDF that I usually produce: ||Test||Trunk||Nic||0.95|| |org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest|1377.08|1420.14|1390.50| |org.apache.hadoop.hbase.PerformanceEvaluation$RandomScanWithRange100Test|11243.12|10992.68|10971.09| |org.apache.hadoop.hbase.PerformanceEvaluation$RandomSeekScanTest|304.66|296.43|305.25| |org.apache.hadoop.hbase.PerformanceEvaluation$RandomWriteTest|9176.07|13619.59|9134.09| |org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest|13592.40|42655.52|13255.12| I already noticed the RandomWriteTest impact compared to the 0.94 branch and 0.95... I will re-run the 0.94 tests to make sure, but overall, I really think 0.95 is not doing as well as 0.94 for the RandomWriteTest. 
Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today the batch algo is: {noformat} for Operation o: List<Op> { add o to todolist if todolist >= maxsize or o last in list split todolist per location send split lists to region servers clear todolist wait } {noformat} We could: - create immediately the final object instead of an intermediate array - split per location immediately - instead of sending when the list as a whole is full, send it when there is enough data for a single location It would be: {noformat} for Operation o: List<Op> { get location add o to location.todolist if (location.todolist >= maxLocationSize) send location.todolist to region server clear location.todolist // don't wait, continue the loop } send remaining wait {noformat} It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
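The conversion Jean-Marc and Nicolas are doing above -- a fixed row count divided by the reported elapsed seconds -- can be written out explicitly. The row counts are taken from the comments; reading the reported figure as elapsed seconds for a fixed number of rows is my interpretation of the exchange.

```java
// Convert a PerformanceEvaluation result into rows/second, per the recipe
// discussed above: throughput = rowCount / elapsedSeconds.
public class ThroughputConverter {
    public static long rowsPerSecond(long rowCount, double elapsedSeconds) {
        return Math.round(rowCount / elapsedSeconds);
    }
}
```

For example, 2097152 rows in 79 s is roughly the ~26.5k rows/second Nicolas quotes, and 2097152 rows in 114 s is roughly 18.4k.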
[jira] [Commented] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689623#comment-13689623 ] Francis Liu commented on HBASE-8662: Yep. OptionServlet, what for? [rest] support impersonation Key: HBASE-8662 URL: https://issues.apache.org/jira/browse/HBASE-8662 Project: HBase Issue Type: Sub-task Components: REST, security Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0 Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689631#comment-13689631 ] Francis Liu commented on HBASE-8015: {quote} Francis, we should still do the Elliott suggestion even if dot, right? The dot would be for 'external' tools or a useful facility in shell but we want namespaces to be first class in API too. {quote} The approach I proposed earlier would avoid having to do all the api stuff as part of the first namespace checkin, as well as make use of '.' as a delimiter. The surprises are as I mentioned. We can incrementally add the apis. Sounds like we are going with overloading all the existing apis to take a namespace parameter. If so, what would be the behavior when using the old api? Will it always reference the default namespace, or will we support fully qualified table names? For some reason I'm not getting any jira or RB emails. Will take a look. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
[ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689717#comment-13689717 ] Sergey Shelukhin commented on HBASE-8721: - btw, HBase does support point version deletes as far as I see. So a specific version can be deleted if desired. Should we add APIs to delete the latest version? We can even add an API to delete all existing versions; it won't be very efficient with many versions (scan or get + a bunch of deletes on the server side), but it will work without changing internals Deletes can mask puts that happen after the delete -- Key: HBASE-8721 URL: https://issues.apache.org/jira/browse/HBASE-8721 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Feng Honghua Attachments: HBASE-8721-0.94-V0.patch this fix aims for the bug mentioned in http://hbase.apache.org/book.html 5.8.2.1: Deletes mask puts, even puts that happened after the delete was entered. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything <= T. After this you do a new put with a timestamp <= T. This put, even if it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do delete and put immediately after each other, and there is some chance they happen within the same millisecond. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-1177) Delay when client is located on the same node as the regionserver
[ https://issues.apache.org/jira/browse/HBASE-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-1177. -- Resolution: Invalid Resolving as no longer valid. Looks like Nagle's, anyways. Delay when client is located on the same node as the regionserver - Key: HBASE-1177 URL: https://issues.apache.org/jira/browse/HBASE-1177 Project: HBase Issue Type: Bug Components: Performance Affects Versions: 0.19.0 Environment: Linux 2.6.25 x86_64 Reporter: Jonathan Gray Labels: noob Attachments: Contribution of getClosest to getRow time.jpg, Contribution of next to getRow time.jpg, Contribution of seekTo to getClosest time.jpg, Elapsed time of RowResults.readFields.jpg, getRow + round-trip vs # columns.jpg, getRow times.jpg, ReadDelayTest.java, RowResults.readFields zoomed.jpg, screenshot-1.jpg, screenshot-2.jpg, screenshot-3.jpg, screenshot-4.jpg, zoom of columns vs round-trip blowup.jpg During testing of HBASE-80, we uncovered a strange 40ms delay for random reads. We ran a series of tests and found that it only happens when the client is on the same node as the RS and for a certain range of payloads (not specifically related to number of columns or size of them, only total payload). It appears to be precisely 40ms every time. Unsure if this is particular to our architecture, but it does happen on all nodes we've tried. Issue completely goes away with very large payloads or moving the client. Will post a test program tomorrow if anyone can test on a different architecture. Making a blocker for 0.20. Since this happens when you have an MR task running local to the RS, and this is what we try to do, might also consider making this a blocker for 0.19.1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4755) HBase based block placement in DFS
[ https://issues.apache.org/jira/browse/HBASE-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689754#comment-13689754 ] Devaraj Das commented on HBASE-4755: [~jiangbinglover], yes HBase would need to periodically refresh the mappings, and also when compactions happen, the data would be rewritten in the three current nodes. I need to implement the balancer in FavoredNodeLoadBalancer (balanceCluster method). I should have something shortly. HBase based block placement in DFS -- Key: HBASE-4755 URL: https://issues.apache.org/jira/browse/HBASE-4755 Project: HBase Issue Type: New Feature Affects Versions: 0.94.0 Reporter: Karthik Ranganathan Assignee: Christopher Gist Priority: Critical Attachments: 4755-wip-1.patch, hbase-4755-notes.txt The feature as-is is only useful for HBase clusters that care about data locality on regionservers, but this feature can also enable a lot of nice features down the road. The basic idea is as follows: instead of letting HDFS determine where to replicate data (r=3) by placing blocks on various regions, it is better to let HBase do so by providing hints to HDFS through the DFS client. That way instead of replicating data at a block level, we can replicate data at a per-region level (each region owned by a primary, a secondary and a tertiary regionserver). 
This is better for 2 things: - Can make region failover faster on clusters which benefit from data affinity - On large clusters with a random block placement policy, this helps reduce the probability of data loss The algo is as follows: - Each region in META will have 3 columns which are the preferred regionservers for that region (primary, secondary and tertiary) - Preferred assignment can be controlled by a config knob - Upon cluster start, HMaster will enter a mapping from each region to 3 regionservers (random hash, could use current locality, etc) - The load balancer would assign out regions preferring region assignments to primary over secondary over tertiary over any other node - Periodically (say weekly, configurable) the HMaster would run a locality check and make sure the map it has for region to regionservers is optimal. Down the road, this can be enhanced to control region placement in the following cases: - Mixed hardware SKU where some regionservers can hold fewer regions - Load balancing across tables where we don't want multiple regions of a table to get assigned to the same regionservers - Multi-tenancy, where we can restrict the assignment of the regions of some table to a subset of regionservers, so an abusive app cannot take down the whole HBase cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
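The "random hash" mapping step from the algorithm above can be sketched as follows. This is purely illustrative and not the FavoredNodeLoadBalancer implementation: it derives a stable primary/secondary/tertiary triple for a region by hashing the region name into the server list, so the same region always maps to the same three distinct servers.

```java
import java.util.*;

// Hypothetical sketch of the HMaster's region -> 3 regionservers mapping
// (primary, secondary, tertiary) via a hash of the region name.
public class FavoredNodesSketch {
    public static List<String> favoredNodes(String regionName, List<String> servers) {
        if (servers.size() < 3) {
            throw new IllegalArgumentException("need at least 3 regionservers");
        }
        // Stable starting slot from the region name.
        int h = Math.floorMod(regionName.hashCode(), servers.size());
        // Three distinct consecutive slots, wrapping around the list.
        return Arrays.asList(
            servers.get(h),
            servers.get((h + 1) % servers.size()),
            servers.get((h + 2) % servers.size()));
    }
}
```

A real balancer would also fold in current locality and rack awareness, as the comment thread suggests; the point here is only the deterministic, per-region triple.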
[jira] [Created] (HBASE-8776) port HBASE-8723 to 0.94
Sergey Shelukhin created HBASE-8776: --- Summary: port HBASE-8723 to 0.94 Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-8776: Fix Version/s: 0.94.9 port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.94.9 Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-8776: Attachment: HBASE-8776-v0.patch I am increasing retry count less aggressively than original; this should be more than enough to ride over server failure given the default negotiated ZK timeout of 40s. port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-8776: Status: Patch Available (was: Open) port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-8776: Affects Version/s: 0.94.8 port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689781#comment-13689781 ] Sergey Shelukhin commented on HBASE-8776: - [~lhofhansl] are you ok with this change to client retries? port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.94.9 Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689783#comment-13689783 ] Hadoop QA commented on HBASE-8776: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12588947/HBASE-8776-v0.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6086//console This message is automatically generated. port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.94.9 Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8777) HBase client should determine the destination server after retry time
Sergey Shelukhin created HBASE-8777: --- Summary: HBase client should determine the destination server after retry time Key: HBASE-8777 URL: https://issues.apache.org/jira/browse/HBASE-8777 Project: HBase Issue Type: Improvement Components: Client Reporter: Sergey Shelukhin HBase currently determines which server to go to, then creates delayed callable with pre-determined server and goes there. For later 16-32-... second retries this approach is suboptimal, the cluster could have seen massive changes in the meantime, so retry might be completely useless. We should re-locate regions after the delay, at least for longer retries. Given how grouping is currently done it would be a bit of a refactoring. The effect of this is alleviated (to a degree) on trunk by server-based retries (if we fail going to the pre-delay server after delay and then determine the server has changed, we will go to the new server immediately, so we only lose the failed round-trip time); on 94, if the region is opened on some other server during the delay, we'd go to the old one, fail, then find out it's on different server, wait a bunch more time because it's a late-stage retry and THEN go to the new one, as far as I see.
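The proposed change can be sketched as follows. This is a minimal illustration, not the real 0.94 client code: `LocationResolver`, `locateRegion`, and `RelocatingRetry` are hypothetical names standing in for the client internals.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for the client's region lookup.
interface LocationResolver {
    String locateRegion(byte[] row);
}

// Retry task that waits out the backoff *first*, then resolves the
// region location, so a region that moved during the delay is found
// immediately instead of after one more failed round trip.
class RelocatingRetry implements Callable<String> {
    private final LocationResolver resolver;
    private final byte[] row;
    private final long delayMs;

    RelocatingRetry(LocationResolver resolver, byte[] row, long delayMs) {
        this.resolver = resolver;
        this.row = row;
        this.delayMs = delayMs;
    }

    @Override
    public String call() throws Exception {
        Thread.sleep(delayMs);              // back off first...
        return resolver.locateRegion(row);  // ...then pick the (possibly new) server
    }
}
```

The point of the ordering is exactly the one made in the description: any work done before the sleep is working with stale cluster state.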
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689801#comment-13689801 ] Lars Hofhansl commented on HBASE-8776: -- Don't the current defaults already add up to 47? 1+1+1+2+2+4+4+8+8+16 = 47. 10 seems good enough, unless I am missing something. Will check the original jira tomorrow.
[jira] [Resolved] (HBASE-6620) Test org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor.testPoolBehavior flaps in autobuilds.
[ https://issues.apache.org/jira/browse/HBASE-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-6620. -- Resolution: Cannot Reproduce We have not seen this in a long time. Closing. Can open a new one if we see it again. Test org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor.testPoolBehavior flaps in autobuilds. --- Key: HBASE-6620 URL: https://issues.apache.org/jira/browse/HBASE-6620 Project: HBase Issue Type: Bug Components: Client Reporter: Sameer Vaishampayan Test flaps in autobuilds with assertion failure. org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor.testPoolBehavior Failing for the past 1 build (Since #2602 ) Took 3 ms. Error Message expected:<3> but was:<4> Stacktrace java.lang.AssertionError: expected:<3> but was:<4> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hbase.client.TestFromClientSide.testPoolBehavior(TestFromClientSide.java:4334) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:24) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662)
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689812#comment-13689812 ] Enis Soztutar commented on HBASE-8015: -- One problem with option 4 is that we want to pay the price of migration only once, between 0.94 and 0.96. If we do that, then it means we have to carry the exception-tables code in all the releases going forward. Option 1 is better than this, I think? Note that surprise #1 also applies here as well. bq. Sounds like we are going with overloading all the existing apis to take a namespace parameter. If so what would be the behavior when using the old api? Will it always reference default namespace or will we support fully qualified table names? It should use the default ns. I think the idea is that there will not be a public-facing thing called a fully qualified table name in Elliot's approach. Although internally we will need one, hence my tendency to go with option 2 over 3 (see my comment above): namespace,table seems good enough for me. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf
[jira] [Commented] (HBASE-8777) HBase client should determine the destination server after retry time
[ https://issues.apache.org/jira/browse/HBASE-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689819#comment-13689819 ] Nicolas Liochon commented on HBASE-8777: It's actually implemented this way in 6295.
[jira] [Updated] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
[ https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Latham updated HBASE-8778: --- Attachment: HBASE-8778-0.94.5.patch One solution is to instead keep the table descriptor files in a subdirectory of the table directory so that only that subdirectory needs a scan. The attached patch is one from 0.94.5 that implements this scheme. In order to be applicable in a rolling restart scenario, the new descriptor is written to both the table directory and the subdirectory. Readers first read the subdirectory, then fall back to the table directory. In order to be robust against failures or races, a lock file is used in the subdirectory during writes. The patch also refactors the FSTableDescriptors class to require a Configuration (to determine lock wait duration) as well as updates so that it more uniformly enforces the fsreadonly flag (RegionServers never do writes) and sticks with using instance methods rather than static methods. We are proceeding with this and hope to roll it out to our cluster. Once the writers (HBase Master, tools like hbck, merge, compact) are upgraded to this patch, old writers should not be used. I would love to hear the opinion of the HBase community regarding this issue. Some questions:
- Is it worth fixing? (I strongly believe so as it has a big impact on MTTR for large clusters)
- What's the best approach to fixing? Some other possibilities:
  - Using a lock file and well known table descriptor file rather than sequence ids
  - Relying on more descriptor caching rather than hitting hdfs on every region assignment (as bulk assignment already does)
  - Move table descriptors to a different location in hdfs (single location for all tables?)
  - Move table descriptors out of hdfs to ZK
- How and when can we migrate to that approach?
- For the patch above, once the cluster has been upgraded and the location of the descriptor files updated to have a copy in the subdirectory, it would be easy to have the next version use only those files.
- Alternatively, for the singularity there could be a one-time piece of migration code that just moves the files there.
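The sequence-id selection that readers of the descriptor files rely on can be sketched as below. This is an illustration only: the zero-padded `.tableinfo.` filename format is an assumption based on the scheme described in this issue, and `currentDescriptor` is a made-up name, not the real FSTableDescriptors API.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Optional;

public class TableInfoFiles {
    static final String TABLEINFO_PREFIX = ".tableinfo.";

    // Parse the sequence number out of a name like ".tableinfo.0000000003".
    static long sequenceOf(String fileName) {
        return Long.parseLong(fileName.substring(TABLEINFO_PREFIX.length()));
    }

    // Pick the descriptor file with the largest sequence number from a
    // directory listing; region directories and other entries are skipped.
    // With the patch, the listing scanned here is the small subdirectory,
    // not the full table directory.
    static Optional<String> currentDescriptor(String[] dirListing) {
        return Arrays.stream(dirListing)
                .filter(name -> name.startsWith(TABLEINFO_PREFIX))
                .max(Comparator.comparingLong(TableInfoFiles::sequenceOf));
    }
}
```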
[jira] [Created] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
Dave Latham created HBASE-8778: -- Summary: Region assigments scan table directory making them slow for huge tables Key: HBASE-8778 URL: https://issues.apache.org/jira/browse/HBASE-8778 Project: HBase Issue Type: Improvement Reporter: Dave Latham Attachments: HBASE-8778-0.94.5.patch On a table with 130k regions it takes about 3 seconds for a region server to open a region once it has been assigned. Watching the threads for a region server running 0.94.5 that is opening many such regions shows the thread opening the region in code like this: {noformat} PRI IPC Server handler 4 on 60020 daemon prio=10 tid=0x2aaac07e9000 nid=0x6566 runnable [0x4c46d000] java.lang.Thread.State: RUNNABLE at java.lang.String.indexOf(String.java:1521) at java.net.URI$Parser.scan(URI.java:2912) at java.net.URI$Parser.parse(URI.java:3004) at java.net.URI.<init>(URI.java:736) at org.apache.hadoop.fs.Path.initialize(Path.java:145) at org.apache.hadoop.fs.Path.<init>(Path.java:126) at org.apache.hadoop.fs.Path.<init>(Path.java:50) at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:215) at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:252) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:311) at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:159) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:842) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:867) at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1168) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:269) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:255) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoModtime(FSTableDescriptors.java:368) at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:155) at 
org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126) at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2834) at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2807) at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426) {noformat} To open the region, the region server first loads the latest HTableDescriptor. Since HBASE-4553, HTableDescriptors are stored in the file system at /hbase/tableDir/.tableinfo.sequenceNum. The file with the largest sequenceNum is the current descriptor. This is done so that the current descriptor is updated atomically. However, since the filename is not known in advance, FSTableDescriptors has to do a FileSystem.listStatus operation, which has to list all files in the directory to find it. The directory also contains all the region directories, so in our case it has to load 130k FileStatus objects. Even using a globStatus matching function still transfers all the objects to the client before performing the pattern matching. Furthermore, HDFS uses a default of transferring 1000 directory entries in each RPC call, so it requires 130 roundtrips to the namenode to fetch all the directory entries. Consequently, reassigning all the regions of a table (or a constant fraction thereof) requires time proportional to the square of the number of regions. In our case, if a region server fails with 200 such regions, it takes 10+ minutes for them all to be reassigned, after the zk expiration and log splitting.
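The arithmetic in that description can be checked with a back-of-envelope sketch. The 1000-entries-per-RPC figure is the HDFS default mentioned above; everything else follows from ceiling division.

```java
public class AssignmentCost {
    // Ceiling division: namenode round trips needed to page through
    // a directory listing at `entriesPerRpc` entries per call.
    static long rpcsPerListing(long dirEntries, long entriesPerRpc) {
        return (dirEntries + entriesPerRpc - 1) / entriesPerRpc;
    }

    // Every region open lists the whole table directory (roughly one
    // entry per region), so reassigning all regions costs a number of
    // RPCs that grows quadratically with the region count.
    static long rpcsToAssign(long regions, long entriesPerRpc) {
        return regions * rpcsPerListing(regions, entriesPerRpc);
    }
}
```

For the 130k-region table above, each open costs 130 listing RPCs, and a full reassignment would issue on the order of 130k × 130 ≈ 17 million namenode round trips.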
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689856#comment-13689856 ] stack commented on HBASE-8015: -- bq. If so what would be the behavior when using the old api? Will it always reference default namespace or will we support fully qualified table names? I think the old API will be against default NS. The FQTN (Fully Qualified Table Name) would be an internal or something that could be passed to external tools (command-line, shell).
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689861#comment-13689861 ] Sergey Shelukhin commented on HBASE-8776: - there's only one 8, and 32. The problem is that we determine the server before delay, so recovery has to happen before the delay for last retry (I filed a JIRA for that). 1+1+1+2+2+4+4+8+16 = 39. Recovery after zk timeout is also not instant.
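The two sums in this thread are easy to check against the client's backoff table. The multipliers below are taken as described in this thread for the 0.94-era client (a single 8 before 16) — treat the exact table as an assumption, not a quote of HConstants.

```java
public class RetryBackoff {
    // Backoff multipliers, in units of hbase.client.pause, as
    // described in this thread (assumed; note only one 8 before 16).
    static final int[] RETRY_BACKOFF = {1, 1, 1, 2, 2, 4, 4, 8, 16, 32};

    // Total sleep across `sleeps` pauses; 10 attempts means 9 sleeps
    // between them. The last multiplier repeats if we run off the end.
    static long totalPause(int sleeps) {
        long total = 0;
        for (int i = 0; i < sleeps; i++) {
            total += RETRY_BACKOFF[Math.min(i, RETRY_BACKOFF.length - 1)];
        }
        return total;
    }
}
```

With a 1-second base pause, 9 sleeps give 1+1+1+2+2+4+4+8+16 = 39 seconds, matching Sergey's correction of the 47 figure.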
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689867#comment-13689867 ] Sergey Shelukhin commented on HBASE-6295: - Hmm, I just noticed this test removed usage of errorsByServer.calculateBackoffTime. Can it please be put back? I have to withdraw my +1... :( Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch today batch algo is:
{noformat}
for Operation o : ListOp {
  add o to todolist
  if todolist > maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
We could:
- create immediately the final object instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location
It would be:
{noformat}
for Operation o : ListOp {
  get location
  add o to location.todolist
  if location.todolist > maxLocationSize
    send location.todolist to region server
    clear location.todolist
  // don't wait, continue the loop
}
send remaining
wait
{noformat}
It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes.
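The second pseudocode block above can be turned into a runnable sketch. Names here (`locate`, `sentBatches`, the string-encoded flush log) are illustrative bookkeeping, not the real client API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Per-location batching: flush a location's buffer as soon as it is
// full instead of waiting for the global todolist to fill up.
public class PerLocationBatcher<Op> {
    private final int maxPerLocation;
    private final Function<Op, String> locate;           // op -> server name
    private final List<String> sent = new ArrayList<>(); // records each flush
    private final Map<String, List<Op>> buffers = new HashMap<>();

    PerLocationBatcher(int maxPerLocation, Function<Op, String> locate) {
        this.maxPerLocation = maxPerLocation;
        this.locate = locate;
    }

    void add(Op op) {
        String loc = locate.apply(op);                   // get location
        List<Op> buf = buffers.computeIfAbsent(loc, k -> new ArrayList<>());
        buf.add(op);                                     // add o to location.todolist
        if (buf.size() >= maxPerLocation) {
            flush(loc);                                  // send without waiting
        }
    }

    void finish() {                                      // send remaining, then wait
        for (String loc : new ArrayList<>(buffers.keySet())) {
            flush(loc);
        }
    }

    private void flush(String loc) {
        List<Op> buf = buffers.remove(loc);
        if (buf != null && !buf.isEmpty()) {
            sent.add(loc + ":" + buf.size());            // stand-in for the RPC
        }
    }

    List<String> sentBatches() { return sent; }
}
```

The error-management caveat from the description still applies: retried operations would have to be merged back into the live per-location buffers.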
[jira] [Commented] (HBASE-8777) HBase client should determine the destination server after retry time
[ https://issues.apache.org/jira/browse/HBASE-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689869#comment-13689869 ] Sergey Shelukhin commented on HBASE-8777: - not in 94 though?
[jira] [Commented] (HBASE-8777) HBase client should determine the destination server after retry time
[ https://issues.apache.org/jira/browse/HBASE-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689875#comment-13689875 ] Nicolas Liochon commented on HBASE-8777: I won't dare backporting 6295 to 0.95 :-) But iirc in 0.94 we were doing the split after the sleep (it may have changed, I haven't looked for a while)