Hi Lars,

I upgraded both Hadoop (to 1.0.3) and HBase (to 0.94.1). I have been
testing and there is no sign of that bug, so your doubts were right: it
was the getLastIncrement hack.
Thanks a lot for helping,
Mohamed

On Thu, Sep 20, 2012 at 5:37 PM, lars hofhansl <[email protected]> wrote:

> There are two clues:
>
> 1. "stem~transport/f1:w:transporte/-9223372036854775808/Minimum/vlen=0"
> is of type "Minimum". A KV of that type should never be application
> visible!
>
> 2. getLastIncrement. getLastIncrement was a "hack" in 0.92 and before
> to improve Increment performance (it basically looks into the memstore
> first, and only if it does not find the right KV there does it look
> into the HFiles). It was removed in 0.94 (because of LazySeek this
> optimization is no longer necessary, and because it was a hack, didn't
> deal correctly with deletes, etc.).
>
> This is probably another problem with the getLastIncrement hack, which
> somehow sees an internal KV (of type Minimum or Maximum).
>
> Still interested in the test code. Even more interested in whether you
> can still reproduce this with 0.94 (which I doubt).
>
> -- Lars
>
> ________________________________
> From: Mohamed Ibrahim <[email protected]>
> To: [email protected]; lars hofhansl <[email protected]>
> Sent: Thursday, September 20, 2012 2:17 PM
> Subject: Re: java.io.IOException key k1 followed by a smaller key k2
>
> Hi Lars,
>
> I'm back on this bug, trying to write a piece of code to reproduce it.
> Looking at the exception again, it seems that the row keys are equal:
>
> Sun Sep 16 22:28:51 EDT 2012,
> org.apache.hadoop.hbase.client.HTable$8@28381fce, java.io.IOException:
> java.io.IOException: Key
> stem~transport/f1:w:transporte/-9223372036854775808/Minimum/vlen=0
> followed by a smaller key
> stem~transport/f1:w:transportation/1347480815026/Put/vlen=8 in cf f1
>
> So both row keys are stem~transport. One of them has vlen=0, and the
> negative number appears to be a timestamp. I can't see a row with that
> key and that qualifier when I run a scan from the HBase shell. Maybe it
> is something stuck in memory that was deleted?
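[The ordering invariant behind the "followed by a smaller key" error can
be illustrated with a self-contained sketch in plain Java. This is not
the HBase API: `SimpleKey`, `compareKeys`, and `findViolation` are
simplified stand-ins for `KeyValue` and its comparator, keeping only
the row, qualifier, and timestamp components that matter in this thread.]

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for HBase's KeyValue.
final class SimpleKey {
    final String row, qualifier;
    final long timestamp;
    SimpleKey(String row, String qualifier, long timestamp) {
        this.row = row; this.qualifier = qualifier; this.timestamp = timestamp;
    }
    @Override public String toString() {
        return row + "/" + qualifier + "/" + timestamp;
    }
}

public class SortCheckSketch {
    // Rows and qualifiers sort ascending; timestamps sort descending
    // (newest first), mirroring HBase's key ordering.
    static int compareKeys(SimpleKey a, SimpleKey b) {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) return c;
        return Long.compare(b.timestamp, a.timestamp); // descending
    }

    // The invariant the scanner enforces: each key must compare >= its
    // predecessor; otherwise "Key k1 followed by a smaller key k2".
    static String findViolation(List<SimpleKey> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (compareKeys(keys.get(i), keys.get(i - 1)) < 0) {
                return "Key " + keys.get(i - 1)
                     + " followed by a smaller key " + keys.get(i);
            }
        }
        return null; // properly sorted
    }

    public static void main(String[] args) {
        // The two keys from the exception: "transportation" sorts before
        // "transporte" ('a' < 'e' after the shared "transport" prefix),
        // so the internal Minimum marker is followed by a smaller key.
        List<SimpleKey> keys = Arrays.asList(
            new SimpleKey("stem~transport", "transporte", Long.MIN_VALUE),
            new SimpleKey("stem~transport", "transportation", 1347480815026L));
        System.out.println(findViolation(keys));
    }
}
```

[This also shows why the exception is plausible here: the qualifiers
differ even though the row keys match, and the second qualifier really
does sort lower, so the internal Minimum KV should never have been
emitted to the scanner in the first place.]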
> I looked at line 3660 of HRegion, which is supposed to create the
> scanner that finds the last counter value. It seems to inspect the
> memstore as well.
>
> Here is the link:
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.92.1/org/apache/hadoop/hbase/regionserver/HRegion.java/
>
> Also, here is the full stack dump again for reference:
>
> Sun Sep 16 22:28:51 EDT 2012,
> org.apache.hadoop.hbase.client.HTable$8@28381fce, java.io.IOException:
> java.io.IOException: Key
> stem~transport/f1:w:transporte/-9223372036854775808/Minimum/vlen=0
> followed by a smaller key
> stem~transport/f1:w:transportation/1347480815026/Put/vlen=8 in cf f1
>     at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:266)
>     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:138)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:2945)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:2901)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:2918)
>     at org.apache.hadoop.hbase.regionserver.HRegion.getLastIncrement(HRegion.java:3660)
>     at org.apache.hadoop.hbase.regionserver.HRegion.incrementColumnValue(HRegion.java:3844)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.incrementColumnValue(HRegionServer.java:2982)
>     at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:616)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
>
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
>     at org.apache.hadoop.hbase.client.HTable.incrementColumnValue(HTable.java:870)
>     at org.apache.hadoop.hbase.client.HTable.incrementColumnValue(HTable.java:850)
>
> Mohamed
>
> On Mon, Sep 17, 2012 at 1:00 PM, Mohamed Ibrahim <[email protected]> wrote:
>
> > Hi Lars,
> >
> > That's exactly what I will be working on. I will update you as soon
> > as I have that code. I'm travelling today, so it might be towards the
> > end of the week.
> >
> > Thanks for the help,
> > Mohamed
> >
> > On Mon, Sep 17, 2012 at 12:06 AM, lars hofhansl <[email protected]> wrote:
> >
> >> It would be good to track this down. Is there any way you can share
> >> the tools you use to load the data?
> >> Is it easy for you to reproduce this problem?
> >>
> >> It's possible (but not likely) that there is a bug in Hadoop 1.0.
> >> You should use 1.0.3.
> >>
> >> -- Lars
> >>
> >> ________________________________
> >> From: Mohamed Ibrahim <[email protected]>
> >> To: [email protected]; lars hofhansl <[email protected]>; Stack <[email protected]>
> >> Sent: Sunday, September 16, 2012 7:13 PM
> >> Subject: Re: java.io.IOException key k1 followed by a smaller key k2
> >>
> >> Hello Lars / Stack,
> >>
> >> Thank you for responding.
> >>
> >> The date on the files is March 9th, 2012. The cluster has been up
> >> since then; I restarted HBase and Hadoop once. I only have a single
> >> node that I run my tests on. I'm currently running 0.92.1 on Hadoop
> >> 1.0. I hope I'm using the correct mix.
> >>
> >> I'm not using any external tools other than the HBase Java API, and
> >> I inspect the data using the shell. I ran my program, and the
> >> following day I found the stack dump on the console. I checked by
> >> scanning the table that had the exception for the smaller key k1,
> >> with limit 2, from the shell, and no exceptions were thrown. I can
> >> also see that the following key is larger than the first one, so
> >> nothing is wrong there.
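[Lars's description earlier in the thread of the getLastIncrement
shortcut (look in the memstore first, and fall back to the HFiles only
on a miss) can be sketched as a toy two-tier store in plain Java. All
names here are invented for illustration; this is not the real HRegion
code.]

```java
import java.util.HashMap;
import java.util.Map;

// Toy two-tier counter store illustrating the memstore-first shortcut.
public class LastIncrementSketch {
    final Map<String, Long> memstore = new HashMap<>(); // recent writes
    final Map<String, Long> hfiles = new HashMap<>();   // flushed data

    // Fast path: skip the merged scan over all tiers when the memstore
    // already has the counter. The real hack was removed in 0.94 partly
    // because a shortcut like this cannot honor delete markers that
    // should mask the on-disk value.
    long getLastValue(String key) {
        Long v = memstore.get(key);
        if (v != null) return v;
        return hfiles.getOrDefault(key, 0L);
    }

    long increment(String key, long amount) {
        long next = getLastValue(key) + amount;
        memstore.put(key, next); // the new version lands in the memstore
        return next;
    }

    public static void main(String[] args) {
        LastIncrementSketch store = new LastIncrementSketch();
        store.hfiles.put("stem~transport", 5L);  // old flushed value
        System.out.println(store.increment("stem~transport", 1)); // 6
        System.out.println(store.increment("stem~transport", 1)); // 7
    }
}
```

[The first increment misses the memstore and reads the flushed value;
the second hits the memstore copy. The 0.94 LazySeek work made this
special-casing unnecessary, so increments go through the normal scan
path instead.]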
> >>
> >> I faced the same exception before, and it happened more frequently
> >> when I did a lot of Puts and Deletes using HTable.batch. Essentially
> >> I was updating several inverted indexes, each in its own table, from
> >> the data of rows as they get inserted, and instead of doing single
> >> Deletes and Puts on the indexes I used batch. batch improved the
> >> performance but threw this exception more often. I stopped using
> >> batch and now do single Puts and Deletes. The exception is rarely
> >> thrown now, but it still shows up sporadically.
> >>
> >> I will read about the hfile tool, thanks for the pointers. I will
> >> also try to figure out a set of steps that reproduce the exception,
> >> so it is more helpful. I will also try 0.94.1 with batch and see if
> >> it happens again; I will let you know and will file a bug if I can
> >> reproduce it consistently.
> >>
> >> Thank you,
> >> Mohamed Ibrahim
> >>
> >> On Sun, Sep 16, 2012 at 7:33 PM, lars hofhansl <[email protected]> wrote:
> >>
> >>> Hmm... HBASE-6579 gets rid of that check, because we thought it was
> >>> no longer necessary.
> >>> Do you remember what you did leading up to this?
> >>> Did you write these HFiles with some other tool? Did you do a bulk
> >>> import, etc.?
> >>>
> >>> -- Lars
> >>>
> >>> ________________________________
> >>> From: Mohamed Ibrahim <[email protected]>
> >>> To: [email protected]
> >>> Sent: Sunday, September 16, 2012 5:59 AM
> >>> Subject: java.io.IOException key k1 followed by a smaller key k2
> >>>
> >>> Hello All,
> >>>
> >>> I am using HBase 0.92.1 on Hadoop 1. I am getting these exceptions,
> >>> which seem to mean that an HBase file is not sorted in order, so
> >>> that when the scanner goes through it, it finds a smaller key after
> >>> its current one.
> >>>
> >>> Is that related to https://issues.apache.org/jira/browse/HBASE-6579 ?
> >>>
> >>> It looks like upgrading to 0.94.1 (current stable) won't fix the
> >>> issue. Any recommendations?
> >>>
> >>> Here is the stack dump:
> >>>     at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:266)
> >>>     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:138)
> >>>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:2945)
> >>>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:2901)
> >>>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:2918)
> >>>     at org.apache.hadoop.hbase.regionserver.HRegion.getLastIncrement(HRegion.java:3660)
> >>>     at org.apache.hadoop.hbase.regionserver.HRegion.incrementColumnValue(HRegion.java:3844)
> >>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.incrementColumnValue(HRegionServer.java:2982)
> >>>     at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
> >>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>     at java.lang.reflect.Method.invoke(Method.java:616)
> >>>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
> >>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
> >>>
> >>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
> >>>     at org.apache.hadoop.hbase.client.HTable.incrementColumnValue(HTable.java:870)
> >>>     at org.apache.hadoop.hbase.client.HTable.incrementColumnValue(HTable.java:850)
> >>>     <Rest of the stack is inside my code>
> >>>
> >>> Thanks,
> >>> Mohamed Ibrahim
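[The indexing workload Mohamed describes earlier in the thread, per-row
Deletes and Puts against several inverted-index tables, optionally
grouped via HTable.batch, can be modeled with a toy in-memory index.
This uses no HBase API; the class, the string-encoded mutations, and
`updateRow` are all invented for illustration.]

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy in-memory inverted index: term -> set of row keys containing it.
public class InvertedIndexSketch {
    final Map<String, Set<String>> index = new HashMap<>();

    // Build the batch of mutations for one updated row: one "delete"
    // per term that disappeared, one "put" per term that was added.
    // In the real workload each of these would be an HBase Delete or
    // Put, applied either one by one or together in a single batch.
    List<String> updateRow(String rowKey, Set<String> oldTerms, Set<String> newTerms) {
        List<String> mutations = new ArrayList<>();
        for (String t : oldTerms) {
            if (!newTerms.contains(t)) {
                Set<String> posting = index.get(t);
                if (posting != null) posting.remove(rowKey);
                mutations.add("DELETE " + t + "/" + rowKey);
            }
        }
        for (String t : newTerms) {
            if (!oldTerms.contains(t)) {
                index.computeIfAbsent(t, k -> new HashSet<>()).add(rowKey);
                mutations.add("PUT " + t + "/" + rowKey);
            }
        }
        return mutations;
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        Set<String> none = new HashSet<>();
        Set<String> terms = new HashSet<>();
        terms.add("transporte");
        terms.add("transportation");
        // First insert of the row produces two PUT mutations.
        List<String> batch = idx.updateRow("doc1", none, terms);
        System.out.println(batch.size() + " mutations");
    }
}
```

[The design question in the thread is only about how these mutations
are submitted: sending the whole list in one HTable.batch call was
faster for Mohamed but surfaced the exception more often than issuing
the same Deletes and Puts individually.]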
