Well, that's not the case. Every row has that column. In fact, the second snippet I sent scans a column with far fewer rows (1k vs. 25k), yet it comes back pretty quickly.
By forever, I mean I have watched my logs do nothing for half an hour before giving up.

Ananth T Sarathy


On Tue, Oct 20, 2009 at 5:03 PM, Ryan Rawson <[email protected]> wrote:

> If you are asking for a column that is very sparse and doesn't exist,
> it will cause HBase to read through the entire region to find 100
> matching rows. This could take a while, you said 'forever', but could
> you quantify that?
>
> On Tue, Oct 20, 2009 at 1:58 PM, Jean-Daniel Cryans <[email protected]> wrote:
> > Scanner pre-fetching is always faster, so something must be wrong with
> > your region server. Check the logs, top, etc.
> >
> > WRT row size, it's pretty much a matter of how many bytes you have
> > in each column and sum them up (plus some overhead with the keys).
> >
> > You want filters, check the filter package in the javadoc.
> >
> > J-D
> >
> > On Tue, Oct 20, 2009 at 1:52 PM, Ananth T. Sarathy
> > <[email protected]> wrote:
> >> Ok, but how come
> >> when I run a similar call (with fewer returned rows, 1000 vs 25k in the
> >> previous one) it runs through the iterator very quickly? (See below)
> >>
> >> Also, how do I determine the row size? It's just text data, and really
> >> not much.
> >>
> >> Finally, is there a way to query for rows that do not have a column?
> >> (i.e. all rows without Files:path1)
> >>
> >> HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
> >>     "GS_Applications");
> >>
> >> String[] columns = { "Files:path1" };
> >> log.info("Getting all Rows with Files");
> >> Scanner s = htdmni.getScannerForAllRows(columns);
> >> log.info("Got all Rows with Files");
> >>
> >> Iterator<RowResult> iter = s.iterator();
> >> out.write("Application_Full_Name,Version,Application_installer_name,Operating System, Application_platform ,Application_sub_category,md5Hash,Sha1Hash,Sha256Hash,filepath,fileName,modified,size,operation\n");
> >> out.write("<BR>");
> >> while (iter.hasNext())
> >> {
> >>
> >> Ananth T Sarathy
> >>
> >>
> >> On Tue, Oct 20, 2009 at 4:44 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >>
> >>> If you have a very slow data source (S3), then it fetches 100 rows
> >>> before coming back to your client with all of them and that can take a
> >>> lot of time. Also make sure that 100 of your rows can fit in a region
> >>> server's memory. How big is each row?
> >>>
> >>> J-D
> >>>
> >>> On Tue, Oct 20, 2009 at 1:32 PM, Ananth T. Sarathy
> >>> <[email protected]> wrote:
> >>> > I am running this code where
> >>> >
> >>> > getScannerForAllRows(columns) just does return table.getScanner(columns);
> >>> >
> >>> > and the table has setScannerCaching(100);
> >>> >
> >>> > But it spins forever after getting the iterator. Why would that be? How
> >>> > can I speed it up?
> >>> >
> >>> > HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
> >>> >     "GS_Applications");
> >>> >
> >>> > String[] columns = { "Files:Name" };
> >>> > log.info("Getting all Rows with Files");
> >>> > Scanner s = htdmni.getScannerForAllRows(columns);
> >>> > log.info("Got all Rows with Files");
> >>> > log.info("Getting Iterator");
> >>> >
> >>> > Iterator<RowResult> iter = s.iterator();
> >>> > log.info("Got Iterator");
> >>> >
> >>> > while (iter.hasNext())
> >>> > {
> >>> >     log.info("Getting next Row");
> >>> >     RowResult rr = iter.next();
> >>> >
> >>> >
> >>> > Ananth T Sarathy
> >>> >
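
To put a rough number on J-D's description of row size (sum the bytes in each column, plus some overhead for the keys), here is a minimal sketch in plain Java. It does not call HBase at all. The class name, the sample row key and cell values, and the 20-byte per-cell overhead are assumptions made up for illustration, not measured figures; the sketch also assumes the row key is stored once per cell.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class RowSizeEstimate {
    // ASSUMPTION: placeholder for per-cell bookkeeping bytes (timestamp, lengths, type).
    // This is an illustrative value, not a figure taken from HBase.
    static final int ASSUMED_PER_CELL_OVERHEAD = 20;

    // Sums, for every cell in the row: row key bytes + column name bytes
    // + value bytes + a fixed per-cell overhead.
    static long estimateRowBytes(String rowKey, Map<String, String> cells, int perCellOverhead) {
        int keyBytes = rowKey.getBytes(StandardCharsets.UTF_8).length;
        long total = 0;
        for (Map.Entry<String, String> cell : cells.entrySet()) {
            int columnBytes = cell.getKey().getBytes(StandardCharsets.UTF_8).length;
            int valueBytes = cell.getValue().getBytes(StandardCharsets.UTF_8).length;
            total += keyBytes + columnBytes + valueBytes + perCellOverhead;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical row: the key and values are invented sample data.
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("Files:Name", "setup.exe");
        cells.put("Files:path1", "/apps/setup.exe");

        long perRow = estimateRowBytes("app-0001", cells, ASSUMED_PER_CELL_OVERHEAD);
        System.out.println("Estimated bytes per row: " + perRow);
        // With setScannerCaching(100), one fetch holds roughly 100 rows at a time.
        System.out.println("Estimated bytes per 100-row batch: " + perRow * 100);
    }
}
```

Multiplying the per-row estimate by the scanner caching value gives a ballpark for how much one fetch has to hold, which is the quantity J-D asks about when he says 100 rows should fit in a region server's memory.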
