In both cases you are doing a full table scan? Try from the shell with DEBUG enabled. You'll see the regions being loaded. That may help you narrow in on the problem region, or at least on the problem regionserver.
St.Ack

On Wed, Oct 21, 2009 at 7:19 AM, Ananth T. Sarathy <[email protected]> wrote:

> Anyone have any further thoughts on this?
> Ananth T Sarathy
>
>
> On Tue, Oct 20, 2009 at 6:37 PM, Ananth T. Sarathy <[email protected]> wrote:
>
> > Well that's not the case. Every Row has that column. In fact the second
> > snippet I sent is with a column with many fewer rows (1k vs 25k) but comes
> > back pretty quickly.
> >
> > By forever, I mean I have watched my logs do nothing for a half hour before
> > giving up.
> >
> >
> > Ananth T Sarathy
> >
> >
> >
> > On Tue, Oct 20, 2009 at 5:03 PM, Ryan Rawson <[email protected]> wrote:
> >
> >> If you are asking for a column that is very sparse and doesn't exist,
> >> it will cause HBase to read through the entire region to find 100
> >> matching rows. This could take a while, you said 'forever', but could
> >> you quantify that?
> >>
> >> On Tue, Oct 20, 2009 at 1:58 PM, Jean-Daniel Cryans <[email protected]>
> >> wrote:
> >> > Scanner pre-fetching is always faster, so something must be wrong with
> >> > your region server. Check the logs, top, etc
> >> >
> >> > WRT row size, it's pretty much a matter of how many bytes you have
> >> > in each column and sum them up (plus some overhead with the keys).
> >> >
> >> > You want filters, check the filter package in the javadoc.
> >> >
> >> > J-D
> >> >
> >> > On Tue, Oct 20, 2009 at 1:52 PM, Ananth T. Sarathy
> >> > <[email protected]> wrote:
> >> >> Ok, but how come
> >> >> when I run a similar call (with fewer returned rows, 1000 vs 25k in the
> >> >> previous one) it runs through the iterator very quickly? (See Below)
> >> >>
> >> >> Also, how do I determine the row size? It's just text data, and really not
> >> >> much.
> >> >>
> >> >> Finally, is there a way to query for rows that do not have a column? (Ie all
> >> >> rows without Files:path1)
> >> >>
> >> >>        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
> >> >>                "GS_Applications");
> >> >>
> >> >>        String[] columns = { "Files:path1" };
> >> >>        log.info("Getting all Rows with Files");
> >> >>        Scanner s = htdmni.getScannerForAllRows(columns);
> >> >>        log.info("Got all Rows with Files");
> >> >>
> >> >>        Iterator<RowResult> iter = s.iterator();
> >> >>        out
> >> >>                .write("Application_Full_Name,Version,Application_installer_name,Operating System, Application_platform ,Application_sub_category,md5Hash,Sha1Hash,Sha256Hash,filepath,fileName,modified,size,operation\n");
> >> >>        out.write("<BR>");
> >> >>        while (iter.hasNext())
> >> >>        {
> >> >>
> >> >> Ananth T Sarathy
> >> >>
> >> >>
> >> >> On Tue, Oct 20, 2009 at 4:44 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >> >>
> >> >>> If you have a very slow data source (S3), then it fetches 100 rows
> >> >>> before coming back to your client with all of them and that can take a
> >> >>> lot of time. Also make sure that 100 of your rows can fit in a region
> >> >>> server's memory. How big is each row?
> >> >>>
> >> >>> J-D
> >> >>>
> >> >>> On Tue, Oct 20, 2009 at 1:32 PM, Ananth T. Sarathy
> >> >>> <[email protected]> wrote:
> >> >>> > I am running this code where
> >> >>> >
> >> >>> > getScannerForAllRows(columns) just does return table.getScanner(columns);
> >> >>> >
> >> >>> > and the table has setScannerCaching(100);
> >> >>> >
> >> >>> > But it spins forever after getting the iterator. Why would that be? How can
> >> >>> > I speed it up?
> >> >>> >
> >> >>> >        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
> >> >>> >                "GS_Applications");
> >> >>> >
> >> >>> >        String[] columns = { "Files:Name" };
> >> >>> >        log.info("Getting all Rows with Files");
> >> >>> >        Scanner s = htdmni.getScannerForAllRows(columns);
> >> >>> >        log.info("Got all Rows with Files");
> >> >>> >        log.info("Getting Iterator");
> >> >>> >
> >> >>> >        Iterator<RowResult> iter = s.iterator();
> >> >>> >        log.info("Got Iterator");
> >> >>> >
> >> >>> >        while (iter.hasNext())
> >> >>> >        {
> >> >>> >            log.info("Getting next Row");
> >> >>> >            RowResult rr = iter.next();
> >> >>> >
> >> >>> >
> >> >>> > Ananth T Sarathy
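
As a rough illustration of J-D's point about row size (sum the bytes in each column, plus some overhead for the keys), here is a minimal sketch in plain Java. It does not touch the HBase client at all. The class name, the helper method, and the per-cell overhead constant are hypothetical placeholders, and the idea that the row key is counted once per cell is an assumption, so treat the result as a ballpark figure and not as HBase's own accounting.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HBase: back-of-the-envelope row size.
class RowSizeSketch {
    // ASSUMPTION: fixed bookkeeping bytes per cell (lengths, timestamp, type).
    // This is a placeholder value; measure against your own version.
    static final int ASSUMED_CELL_OVERHEAD_BYTES = 20;

    // Sums column-name bytes and value bytes for every cell in one row.
    // ASSUMPTION: the row key is stored with every cell, so it is counted
    // once per cell rather than once per row.
    static long estimateRowBytes(String rowKey, Map<String, String> cells) {
        long keyBytes = rowKey.getBytes(StandardCharsets.UTF_8).length;
        long total = 0;
        for (Map.Entry<String, String> cell : cells.entrySet()) {
            long columnBytes = cell.getKey().getBytes(StandardCharsets.UTF_8).length;
            long valueBytes = cell.getValue().getBytes(StandardCharsets.UTF_8).length;
            total += keyBytes + columnBytes + valueBytes + ASSUMED_CELL_OVERHEAD_BYTES;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up sample row using column names from the thread.
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("Files:path1", "/tmp/a");
        cells.put("Files:Name", "a.txt");

        long perRow = estimateRowBytes("app1", cells);
        // With scanner caching set to 100, one fetch holds about 100 rows.
        System.out.println("Estimated bytes per row: " + perRow);
        System.out.println("Estimated bytes per 100-row batch: " + (perRow * 100));
    }
}
```

Multiplying the per-row estimate by the scanner caching value (100 in this thread) gives a rough size for each batch the region server has to assemble before it answers the client.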
