I understand the weirdness :) I realized it may not even be helpful to use addFamily or addColumn, since row keys are sorted and the start/stop rows already provide good filtering. Nonetheless, it is an issue with HBase.
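As an aside on relying on start/stop rows: since row keys sort bytewise, all rows sharing a prefix can be bounded by using the prefix as the start row and the prefix with its last byte incremented as the exclusive stop row. Below is a minimal sketch in plain Java (no HBase dependency). The class name `PrefixScanBounds` and the helper `stopRowForPrefix` are hypothetical names made up for illustration, and the sample key `TYPE1|TV|` is borrowed from the shell example quoted further down. It also assumes that an empty stop row is treated as "no upper bound".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the HBase API.
final class PrefixScanBounds {

    // Returns the smallest key that sorts after every key starting with
    // the given prefix, or an empty array if no such key exists.
    public static byte[] stopRowForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so skip past them.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            // Assumption: an empty stop row means "scan to the end".
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte by one
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "TYPE1|TV|".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(start);
        // '|' is 0x7C, so the incremented byte is 0x7D, i.e. '}'.
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // TYPE1|TV}
    }
}
```

The two arrays could then be handed to the `new Scan(startRow, stopRow)` constructor mentioned in the quoted thread. Compared with a hand-picked stop suffix such as "|A", incrementing the last byte does not depend on which characters follow the prefix.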
I was able to get the scanners to perform a little better by removing a loop that I knew only ever held a single entry (since HBase would otherwise have to check the dataset for the second next() call). Performance is still not great when I need to loop over about 1000 entries and change the start/stop row on the Scan object each time (unless the data is already cached).

I had another question: what kind of caching improvements have been made in .20? When I reuse an existing start/stop row, I see much better performance (3X-4X).

Also, what else can I do to improve performance with tall tables? I have 2 tall tables with 2 scanner loops based on start/stop row.

Thanks for all the help here :)

Ryan Rawson wrote:
>
> Hey,
>
> That is a very weird problem you are seeing. I have not seen it in any of my own (somewhat) limited excursions. For now obviously addFamily() is the way to go, but it would be good to get at why this is happening.
>
> Would you be able to minimize it to a reasonable table and possibly provide us with some example data files? Or perhaps a specific repro case?
>
> thanks for putting up with weirdness :-) For what it's worth, addFamily() scans are that much more inefficient depending on the volume of data returned.
>
> On Mon, Jun 29, 2009 at 3:10 PM, llpind<[email protected]> wrote:
>>
>> I got the same exception when I increased to 120000.
>>
>> If I simply remove addColumn and do an addFamily, it works fine (it just takes a long time, since I have lots of records I'm looping over). Anyone know what the problem is?
>>
>> I was trying to increase performance by adding the type as a column family, therefore allowing me to filter beforehand.
>>
>> Thanks.
>>
>> stack-3 wrote:
>>>
>>> So, you are spending > 60 seconds in the client before going back to the regionserver? If so, can you change this? Or up the lease period on scanners > 60 seconds.
>>> St.Ack
>>>
>>> On Mon, Jun 29, 2009 at 1:26 PM, llpind <[email protected]> wrote:
>>>
>>>> Both are defaults (60 second lease period, and 1 scanner caching). Yeah, it is taking longer than 60 seconds.
>>>>
>>>> stack-3 wrote:
>>>> >
>>>> > Try upping timeout on scanners?
>>>> >
>>>> > Is your scanner spending longer than hbase.regionserver.lease.period -- 60 seconds -- in map before it goes back to the server?
>>>> >
>>>> > What do you have for this value: hbase.client.scanner.caching?
>>>> >
>>>> > Is it 1 or something else? If > 1, then you'll be in client while N are processed; there'll be no trip back to server to renew server-side lease.
>>>> >
>>>> > St.Ack
>>>> >
>>>> > On Fri, Jun 26, 2009 at 2:49 PM, llpind <[email protected]> wrote:
>>>> >
>>>> >> Tried it like this:
>>>> >>
>>>> >> Scan linkScan = new Scan (Bytes.toBytes(e + "|"), Bytes.toBytes(e + "|A"));
>>>> >> linkScan.addColumn(Bytes.toBytes("type"), Bytes.toBytes("ELECTRONICS"));
>>>> >> ResultScanner scanner = tblEntity.getScanner(linkScan);
>>>> >>
>>>> >> for (Result linkRowResult : scanner ) {
>>>> >>     String row = Bytes.toString(linkRowResult.getRow());
>>>> >> }
>>>> >>
>>>> >> Same exception. I would like to mention there is an outer loop around the block of code above changing variable 'e' to something else. Also, I checked the type:ELECTRONICS column, and it isn't that sparse. As you can imagine, an electronics department has a lot of items in it (in relation to other departments), but a given item (row, e.g. Samsung blah) is sparse. Meaning given all the filters it may be a sparse dataset. If it's because it's sparse, why would it work when I don't use addColumn, but use addFamily instead?
>>>> >>
>>>> >> Ryan Rawson wrote:
>>>> >> >
>>>> >> > Can you try with the constructor:
>>>> >> > Scan scanSpec = new Scan(startRow, stopRow);
>>>> >> >
>>>> >> > thanks,
>>>> >> > -ryan
>>>> >> >
>>>> >> > On Fri, Jun 26, 2009 at 2:18 PM, llpind<[email protected]> wrote:
>>>> >> >>
>>>> >> >> It works fine in shell when I do:
>>>> >> >>
>>>> >> >> scan 'tblStore', {COLUMNS =>'type:ELECTRONICS', STARTROW => 'TYPE1|TV|', STOPROW => 'TYPE1|TV|A'}
>>>> >> >>
>>>> >> >> Am I doing something wrong in the API call?
>>>> >> >>
>>>> >> >> llpind wrote:
>>>> >> >>>
>>>> >> >>> Yeah I do have a type:ELECTRONIC column. I have a lot of data in the tall table, so it may be sparse. I'm giving it filters like start/stop row key and column family/qualifier. This should still work IMO. What else could be causing this?
>>>> >> >>>
>>>> >> >>> Ryan Rawson wrote:
>>>> >> >>>>
>>>> >> >>>> are you sure you have that column in your data? If you are searching for a column that doesn't exist or is very very very sparse, the scanner will spend a lot of time searching only to find nothing, thus ending up with these kinds of exceptions....
>>>> >> >>>>
>>>> >> >>>> On Fri, Jun 26, 2009 at 12:35 PM, llpind<[email protected]> wrote:
>>>> >> >>>>>
>>>> >> >>>>> This exception does not happen if I remove the addColumn, and leave only addFamily (linkScan.addFamily(Bytes.toBytes("type"))).
>>>> >> >>>>>
>>>> >> >>>>> also, forgot I also have start and stop rows set:
>>>> >> >>>>>
>>>> >> >>>>> Scan linkScan = new Scan();
>>>> >> >>>>> linkScan.addColumn(Bytes.toBytes("type"), Bytes.toBytes("ELECTRONICS"));
>>>> >> >>>>> linkScan.setStartRow (Bytes.toBytes(e + "|"));
>>>> >> >>>>> linkScan.setStopRow (Bytes.toBytes(e + " "));
>>>> >> >>>>> ResultScanner scanner = tblEntity.getScanner(linkScan);
>>>> >> >>>>> for (Result linkRowResult : scanner ) {
>>>> >> >>>>>     String row = Bytes.toString(linkRowResult.getRow());
>>>> >> >>>>> }
>>>> >> >>>>>
>>>> >> >>>>> llpind wrote:
>>>> >> >>>>>>
>>>> >> >>>>>> Hey,
>>>> >> >>>>>>
>>>> >> >>>>>> I'm doing the following to get a scanner on a tall table:
>>>> >> >>>>>>
>>>> >> >>>>>> Scan linkScan = new Scan();
>>>> >> >>>>>> linkScan.addColumn(Bytes.toBytes("type"), Bytes.toBytes("ELECTRONICS"));
>>>> >> >>>>>> ResultScanner scanner = tblEntity.getScanner(linkScan);
>>>> >> >>>>>> for (Result linkRowResult : scanner ) {
>>>> >> >>>>>>     String row = Bytes.toString(linkRowResult.getRow());
>>>> >> >>>>>> }
>>>> >> >>>>>>
>>>> >> >>>>>> ================================================================
>>>> >> >>>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>> >> >>>>>> Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: -2823498412219891315
>>>> >> >>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.close(HRegionServer.java:1894)
>>>> >> >>>>>>     at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>>>> >> >>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>> >> >>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>> >> >>>>>>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:643)
>>>> >> >>>>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:913)
>>>> >> >>>>>>
>>>> >> >>>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>> >> >>>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>>> >> >>>>>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>>> >> >>>>>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>>> >> >>>>>>     at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
>>>> >> >>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:928)
>>>> >> >>>>>>     at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1764)
>>>> >> >>>>>>     at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1859)
>>>> >> >>>>>>     at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1915)
>>>> >> >>>>>>     ... 8 more
>>>> >> >>>>>>
>>>> >> >>>>>
>>>> >> >>>>> --
>>>> >> >>>>> View this message in context: http://www.nabble.com/Scanner-exceptions-in-.20-tp24225950p24226108.html
>>>> >> >>>>> Sent from the HBase User mailing list archive at Nabble.com.

--
View this message in context: http://www.nabble.com/Scanner-exceptions-in-.20-tp24225950p24275481.html
Sent from the HBase User mailing list archive at Nabble.com.
