https://issues.apache.org/jira/browse/HBASE-7813
On Mon, Feb 11, 2013 at 8:44 AM, Varun Sharma <[email protected]> wrote:

I think I found a bug in the BulkDeleteEndpoint which is causing me to lose entire rows, even with COLUMN deletes. I filed a JIRA for it and can upload a patch.

On Mon, Feb 11, 2013 at 7:36 AM, Varun Sharma <[email protected]> wrote:

No. The endpoint executes with normal QoS, but it initiates a scan which seems to execute on high QoS, judging by the handlers. I am not totally sure, though; maybe that region server was hosting the .META. table and those were actually scan.next operations for the META table. So I will need to confirm this.

Varun

On Mon, Feb 11, 2013 at 4:50 AM, Anoop Sam John <[email protected]> wrote:

You mean the endpoint is getting executed with high QoS? Did you check with some logs?

-Anoop-
________________________________________
From: Varun Sharma [[email protected]]
Sent: Monday, February 11, 2013 4:05 AM
To: [email protected]; lars hofhansl
Subject: Re: Get on a row with multiple columns

Back to the BulkDeleteEndpoint: I got it to work, but why are the scanner.next() calls executing on the priority handler queue?

Varun

On Sat, Feb 9, 2013 at 8:46 AM, lars hofhansl <[email protected]> wrote:

The answer is "probably" :)
It's disabled in 0.96 by default. Check out HBASE-7008 (https://issues.apache.org/jira/browse/HBASE-7008) and the discussion there.

Also check out the discussion in HBASE-5943 and HADOOP-8069 (https://issues.apache.org/jira/browse/HADOOP-8069).

-- Lars

________________________________
From: Jean-Marc Spaggiari <[email protected]>
To: [email protected]
Sent: Saturday, February 9, 2013 5:02 AM
Subject: Re: Get on a row with multiple columns

Lars, should we always consider disabling Nagle? What's the downside?
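JM's question is about Nagle's algorithm, which at the socket level corresponds to the TCP_NODELAY option: with Nagle enabled, small writes are buffered so they can be coalesced into fewer packets; disabling it sends small requests immediately at the cost of more packets on the wire. A stand-alone JDK sketch of the option itself (illustration only, not HBase code):

```java
import java.net.Socket;
import java.net.SocketException;

// JDK-level illustration of what the HBase tcpnodelay settings control:
// TCP_NODELAY disables Nagle's algorithm on a socket, so small writes go
// out immediately instead of being buffered to coalesce packets. The
// trade-off behind JM's question: lower latency per small request, but
// potentially many more small packets on the wire.
public class NagleDemo {
    static boolean nagleDisabled(Socket s) throws SocketException {
        return s.getTcpNoDelay();
    }

    public static void main(String[] args) throws Exception {
        try (Socket s = new Socket()) {          // unconnected; options are still settable
            System.out.println(nagleDisabled(s)); // false: Nagle is on by default
            s.setTcpNoDelay(true);                // what tcpnodelay=true asks the RPC layer for
            System.out.println(nagleDisabled(s)); // true: Nagle disabled
        }
    }
}
```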
2013/2/9, Varun Sharma <[email protected]>:

Yeah, I meant true...

On Sat, Feb 9, 2013 at 12:17 AM, lars hofhansl <[email protected]> wrote:

Should be set to true. If tcpnodelay is set to true, Nagle's is disabled.

-- Lars

________________________________
From: Varun Sharma <[email protected]>
To: [email protected]; lars hofhansl <[email protected]>
Sent: Saturday, February 9, 2013 12:11 AM
Subject: Re: Get on a row with multiple columns

Okay, I did my research - these need to be set to false. I agree.

On Sat, Feb 9, 2013 at 12:05 AM, Varun Sharma <[email protected]> wrote:

I have ipc.client.tcpnodelay and ipc.server.tcpnodelay set to false, and the HBase one, hbase.ipc.client.tcpnodelay, set to true. Do these induce network latency?

On Fri, Feb 8, 2013 at 11:57 PM, lars hofhansl <[email protected]> wrote:

Sorry, I meant set these two config parameters to true (not false as I state below).

----- Original Message -----
From: lars hofhansl <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Friday, February 8, 2013 11:41 PM
Subject: Re: Get on a row with multiple columns

Only somewhat related: seeing the magic 40ms random read time there. Did you disable Nagle's? (Set hbase.ipc.client.tcpnodelay and ipc.server.tcpnodelay to false in hbase-site.xml.)
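Putting lars's correction together (the parameters must be true for Nagle's to be disabled), the resulting hbase-site.xml fragment would look like this. This is a sketch based only on the property names given in the thread (0.94-era names):

```xml
<!-- hbase-site.xml: disable Nagle's algorithm on HBase RPC sockets.
     Property names as given in the thread; values per lars's correction:
     tcpnodelay=true means Nagle's is disabled. -->
<property>
  <name>hbase.ipc.client.tcpnodelay</name>
  <value>true</value>
</property>
<property>
  <name>ipc.server.tcpnodelay</name>
  <value>true</value>
</property>
```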
________________________________
From: Varun Sharma <[email protected]>
To: [email protected]; lars hofhansl <[email protected]>
Sent: Friday, February 8, 2013 10:45 PM
Subject: Re: Get on a row with multiple columns

The use case is like your Twitter feed: tweets from people you follow. When someone unfollows, you need to delete a bunch of his tweets from the following feed. So it's frequent, and we are essentially running into some extreme corner cases like the one above. We need high write throughput for this, since when someone tweets, we need to fan the tweet out to all the followers. We need the ability to do fast deletes (unfollow) and fast adds (follow), and also to do fast random gets when a real user loads the feed. I doubt we will be able to play much with the schema here, since we need to support a bunch of use cases.

@lars: It does not take 30 seconds to place 300 delete markers. It takes 30 seconds to first find which of those 300 pins are in the set of columns present - this invokes 300 gets - and then place the appropriate delete markers. Note that we can have tens of thousands of columns in a single row, so a single get is not cheap.

If we were to just place delete markers, that would be very fast. But when we started doing that, our random read performance suffered because of too many delete markers. The 90th percentile on random reads shot up from 40 milliseconds to 150 milliseconds, which is not acceptable for our use case.

Thanks
Varun

On Fri, Feb 8, 2013 at 10:33 PM, lars hofhansl <[email protected]> wrote:

Can you organize your columns and then delete by column family?

deleteColumn without specifying a TS is expensive, since HBase first has to figure out what the latest TS is.

Should be better in 0.94.1 or later, since deletes are batched like Puts (still need to retrieve the latest version, though).

In 0.94.3 or later you can also use the BulkDeleteEndpoint, which basically lets you specify a scan condition and then place a specific delete marker for all KVs encountered.

If you wanted to get really fancy, you could hook a coprocessor up to the compaction process and simply filter out all KVs you no longer want (without ever placing any delete markers).

Are you saying it takes 15 seconds to place 300 version delete markers?!

-- Lars

________________________________
From: Varun Sharma <[email protected]>
To: [email protected]
Sent: Friday, February 8, 2013 10:05 PM
Subject: Re: Get on a row with multiple columns

We are given a set of 300 columns to delete. I tested two cases:

1) deleteColumns() - with the 's'

This function simply adds delete markers for 300 columns; in our case, typically only a fraction of these columns are actually present - 10. After starting to use deleteColumns, we started seeing a drop in cluster-wide random read performance - the 90th percentile latency worsened, as did the 99th - probably because of having to traverse delete markers. I attribute this to a profusion of delete markers in the cluster. Major compactions slowed down by almost 50 percent, probably because of having to clean out significantly more delete markers.

2) deleteColumn()

Ended up with intolerable 15-second calls, which clogged all the handlers, making the cluster pretty much unresponsive.

On Fri, Feb 8, 2013 at 9:55 PM, Ted Yu <[email protected]> wrote:

For the 300 column deletes, can you show us how the Delete(s) are constructed?

Do you use this method?

public Delete deleteColumns(byte [] family, byte [] qualifier) {

Thanks

On Fri, Feb 8, 2013 at 9:44 PM, Varun Sharma <[email protected]> wrote:

So a Get call with multiple columns on a single row should be much faster than independent Get(s) on each of those columns for that row. I am basically seeing severely poor performance (~15 seconds) for certain deleteColumn() calls, and I see that there is a prepareDeleteTimestamps() function in HRegion.java which first tries to locate the column by doing individual gets on each column you want to delete (I am doing 300 column deletes). Now, I think this should ideally be one get call with the batch of 300 columns, so that one scan can retrieve the columns, and the columns that are found are then deleted.

Before I try this fix, I wanted to get an opinion on whether it will make a difference to batch the get(), and it seems from your answer that it should.

On Fri, Feb 8, 2013 at 9:34 PM, lars hofhansl <[email protected]> wrote:

Everything is stored as a KeyValue in HBase. The Key part of a KeyValue contains the row key, column family, column name, and timestamp, in that order. Each column family has its own store and store files.

So in a nutshell, a get is executed by starting a scan at the row key (which is a prefix of the key) in each store (CF) and then scanning forward in each store until the next row key is reached. (In reality it is a bit more complicated due to multiple versions, skipping columns, etc.)

-- Lars
________________________________
From: Varun Sharma <[email protected]>
To: [email protected]
Sent: Friday, February 8, 2013 9:22 PM
Subject: Re: Get on a row with multiple columns

Sorry, I was a little unclear with my question.
Let's say you have:

Get get = new Get(row);
get.addColumn("1");
get.addColumn("2");
...

When HBase internally executes the batch get, it will seek to column "1"; now, since data is lexicographically sorted, it does not need to seek from the beginning to get to "2" - it can continue seeking forward, since column "2" will always be after column "1". I want to know whether this is how a multi-column get on a row works or not.

Thanks
Varun

On Fri, Feb 8, 2013 at 9:08 PM, Marcos Ortiz <[email protected]> wrote:

Like Ishan said, a get gives you an instance of the Result class. The utility methods that you can use are:

byte[] getValue(byte[] family, byte[] qualifier)
byte[] value()
byte[] getRow()
int size()
boolean isEmpty()
KeyValue[] raw() # Like Ishan said, all data here is sorted
List<KeyValue> list()

On 02/08/2013 11:29 PM, Ishan Chhabra wrote:

Based on what I read in Lars' book, a get will return a Result, which is internally a KeyValue[]. This KeyValue[] is sorted by the key, and you access this array using the raw or list methods on the Result object.

On Fri, Feb 8, 2013 at 5:40 PM, Varun Sharma <[email protected]> wrote:

+user

On Fri, Feb 8, 2013 at 5:38 PM, Varun Sharma <[email protected]> wrote:

Hi,

When I do a Get on a row with multiple column qualifiers, do we sort the column qualifiers and make use of the sorted order when we get the results?

Thanks
Varun

--
Marcos Ortiz Valmaseda,
Product Manager && Data Scientist at UCI
Blog: http://marcosluis2186.posterous.com
Twitter: @marcosluis2186 <http://twitter.com/marcosluis2186>
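The single forward pass lars describes (and Varun asks about) can be illustrated with a stand-alone sketch. This is plain JDK code, not the actual HBase scanner: a row's columns live in a map sorted by qualifier, and one pass serves several requested qualifiers without ever seeking backward.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: columns in a row are stored sorted by qualifier, so a
// Get carrying several (sorted) qualifiers can be served by one forward
// scan, instead of one scan per qualifier.
public class SortedGetDemo {
    // One forward pass over the sorted columns; 'wanted' must be sorted too.
    static List<String> multiGet(TreeMap<String, String> row, List<String> wanted) {
        List<String> found = new ArrayList<>();
        Iterator<Map.Entry<String, String>> scan = row.entrySet().iterator();
        Map.Entry<String, String> cur = scan.hasNext() ? scan.next() : null;
        int i = 0;
        while (cur != null && i < wanted.size()) {
            int cmp = cur.getKey().compareTo(wanted.get(i));
            if (cmp < 0) {                  // scanner is behind: keep moving forward
                cur = scan.hasNext() ? scan.next() : null;
            } else if (cmp == 0) {          // hit: emit and advance both sides
                found.add(cur.getValue());
                i++;
                cur = scan.hasNext() ? scan.next() : null;
            } else {                        // column absent: skip the request, no rewind
                i++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        TreeMap<String, String> row = new TreeMap<>();
        row.put("c1", "v1");
        row.put("c3", "v3");
        row.put("c5", "v5");
        // Request "c1", "c2", "c3": "c2" is simply skipped, never re-scanned.
        System.out.println(multiGet(row, List.of("c1", "c2", "c3"))); // [v1, v3]
    }
}
```

Absent qualifiers are skipped without restarting the scan, which is why one multi-column Get on a row can be much cheaper than N single-column Gets, exactly the intuition behind Varun's question.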
