Anyone have any further thoughts on this?

Ananth T Sarathy

On Tue, Oct 20, 2009 at 6:37 PM, Ananth T. Sarathy <[email protected]> wrote:

> Well that's not the case. Every Row has that column. In fact the second
> snippet I sent is with a column with many fewer rows (1k vs 25k) but comes
> back pretty quickly.
>
> By forever, I mean I have watched my logs do nothing for a half hour before
> giving up.
>
> Ananth T Sarathy
>
> On Tue, Oct 20, 2009 at 5:03 PM, Ryan Rawson <[email protected]> wrote:
>
>> If you are asking for a column that is very sparse and doesn't exist,
>> it will cause HBase to read through the entire region to find 100
>> matching rows. This could take a while, you said 'forever', but could
>> you quantify that?
>>
>> On Tue, Oct 20, 2009 at 1:58 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> > Scanner pre-fetching is always faster, so something must be wrong with
>> > your region server. Check the logs, top, etc.
>> >
>> > WRT row size, it's pretty much a matter of how many bytes you have
>> > in each column and sum them up (plus some overhead with the keys).
>> >
>> > You want filters, check the filter package in the javadoc.
>> >
>> > J-D
>> >
>> > On Tue, Oct 20, 2009 at 1:52 PM, Ananth T. Sarathy
>> > <[email protected]> wrote:
>> >> Ok, but how come
>> >> when I run a similar call (with fewer returned rows, 1000 vs 25k in the
>> >> previous one) it runs through the iterator very quickly? (See Below)
>> >>
>> >> Also, how do I determine the row size? It's just text data, and really
>> >> not much.
>> >>
>> >> Finally, is there a way to query for rows that do not have a column?
>> >> (i.e. all rows without Files:path1)
>> >>
>> >>        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
>> >>                "GS_Applications");
>> >>
>> >>        String[] columns = { "Files:path1" };
>> >>        log.info("Getting all Rows with Files");
>> >>        Scanner s = htdmni.getScannerForAllRows(columns);
>> >>        log.info("Got all Rows with Files");
>> >>
>> >>        Iterator<RowResult> iter = s.iterator();
>> >>        out
>> >>                .write("Application_Full_Name,Version,Application_installer_name,Operating
>> >> System, Application_platform
>> >> ,Application_sub_category,md5Hash,Sha1Hash,Sha256Hash,filepath,fileName,modified,size,operation\n");
>> >>        out.write("<BR>");
>> >>        while (iter.hasNext())
>> >>        {
>> >>
>> >> Ananth T Sarathy
>> >>
>> >> On Tue, Oct 20, 2009 at 4:44 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >>
>> >>> If you have a very slow data source (S3), then it fetches 100 rows
>> >>> before coming back to your client with all of them and that can take a
>> >>> lot of time. Also make sure that 100 of your rows can fit in a region
>> >>> server's memory. How big is each row?
>> >>>
>> >>> J-D
>> >>>
>> >>> On Tue, Oct 20, 2009 at 1:32 PM, Ananth T. Sarathy
>> >>> <[email protected]> wrote:
>> >>> > I am running this code where
>> >>> >
>> >>> > getScannerForAllRows(columns) just does return table.getScanner(columns);
>> >>> >
>> >>> > and the table has setScannerCaching(100);
>> >>> >
>> >>> > But it spins forever after getting the iterator. Why would that be?
>> >>> > How can I speed it up?
>> >>> >
>> >>> >        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
>> >>> >                "GS_Applications");
>> >>> >
>> >>> >        String[] columns = { "Files:Name" };
>> >>> >        log.info("Getting all Rows with Files");
>> >>> >        Scanner s = htdmni.getScannerForAllRows(columns);
>> >>> >        log.info("Got all Rows with Files");
>> >>> >        log.info("Getting Iterator");
>> >>> >
>> >>> >        Iterator<RowResult> iter = s.iterator();
>> >>> >        log.info("Got Iterator");
>> >>> >
>> >>> >        while (iter.hasNext())
>> >>> >        {
>> >>> >            log.info("Getting next Row");
>> >>> >            RowResult rr = iter.next();
>> >>> >
>> >>> > Ananth T Sarathy
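
To make the "read through the entire region to find 100 matching rows" point from the thread concrete, here is a toy model in plain Java. It is not the HBase client API: the class ScanBatchSketch, its methods, and the boolean-array "table" are all invented for illustration. It only counts how many rows have to be examined before the first batch of 100 matching rows can be handed back, for a dense column versus a sparse one. Note that the reply quoted above says every row has the column, so this may not explain the slowness reported in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model of batch pre-fetching; hypothetical names, NOT the HBase API.
 * A "table" is a boolean array: true means the row contains the requested column.
 */
public class ScanBatchSketch {

    /** One pre-fetched batch: the matching row indexes plus the rows examined. */
    public static final class Batch {
        public final List<Integer> matchingRows = new ArrayList<>();
        public int rowsRead = 0;
    }

    /**
     * Examines rows from 'start' until 'caching' matching rows are collected
     * or the table ends. Nothing is returned to the caller before that point.
     */
    public static Batch fetchBatch(boolean[] hasColumn, int start, int caching) {
        Batch batch = new Batch();
        for (int row = start;
             row < hasColumn.length && batch.matchingRows.size() < caching;
             row++) {
            batch.rowsRead++;
            if (hasColumn[row]) {
                batch.matchingRows.add(row);
            }
        }
        return batch;
    }

    /** Builds a table in which every 'every'-th row (starting at row 0) has the column. */
    public static boolean[] table(int totalRows, int every) {
        boolean[] rows = new boolean[totalRows];
        for (int i = 0; i < totalRows; i += every) {
            rows[i] = true;
        }
        return rows;
    }

    public static void main(String[] args) {
        int caching = 100; // same batch size as setScannerCaching(100) in the thread
        Batch dense = fetchBatch(table(100_000, 1), 0, caching);
        Batch sparse = fetchBatch(table(100_000, 500), 0, caching);
        System.out.println("dense column:  rows examined = " + dense.rowsRead);
        System.out.println("sparse column: rows examined = " + sparse.rowsRead);
    }
}
```

In this model the dense case fills the batch after examining 100 rows, while the sparse case (one matching row in 500) has to examine tens of thousands of rows before the first next() could return. If the rows sit on a slow store, that whole wait happens before the client sees anything.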
