Well, that's not the case. Every row has that column. In fact, the second
snippet I sent uses a column that is in far fewer rows (1k vs 25k), but it
comes back pretty quickly.

By forever, I mean I have watched my logs do nothing for half an hour before
giving up.
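
To put rough numbers on the point Ryan makes in the quoted message below
(with caching set to 100, nothing comes back until 100 matching rows have
been found), here is a toy model in plain Java. It is only a sketch and does
not touch HBase: the class, the method names, and the 1-in-250 density are
made up for illustration, and it assumes the server reads rows strictly in
order.

```java
/** Toy model of a pre-fetching scanner. Hypothetical names; not HBase code. */
class ScanBatchSketch {

    /**
     * Returns how many rows have to be read before the first batch of
     * `caching` matching rows is full (or the table runs out of rows).
     * hasColumn[i] says whether row i contains the requested column.
     */
    static int rowsReadForFirstBatch(boolean[] hasColumn, int caching) {
        int matched = 0;
        int read = 0;
        for (boolean present : hasColumn) {
            read++;
            if (present) {
                matched++;
                if (matched == caching) {
                    break; // batch is full, it can go back to the client
                }
            }
        }
        return read;
    }

    /** Builds a table of `total` rows where every `every`-th row has the column. */
    static boolean[] tableWithColumnEvery(int total, int every) {
        boolean[] rows = new boolean[total];
        for (int i = 0; i < total; i++) {
            rows[i] = (i % every == 0);
        }
        return rows;
    }

    public static void main(String[] args) {
        boolean[] dense = tableWithColumnEvery(25000, 1);    // every row has the column
        boolean[] sparse = tableWithColumnEvery(25000, 250); // 1 row in 250 has it
        System.out.println("dense:  " + rowsReadForFirstBatch(dense, 100));
        System.out.println("sparse: " + rowsReadForFirstBatch(sparse, 100));
    }
}
```

In this model the dense table fills its first batch after 100 rows, while the
sparse one has to read almost the whole table first. It says nothing about
why a dense column would still be slow.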


Ananth T Sarathy


On Tue, Oct 20, 2009 at 5:03 PM, Ryan Rawson <[email protected]> wrote:

> If you are asking for a column that is very sparse and doesn't exist,
> it will cause HBase to read through the entire region to find 100
> matching rows. This could take a while, you said 'forever', but could
> you quantify that?
>
> On Tue, Oct 20, 2009 at 1:58 PM, Jean-Daniel Cryans <[email protected]>
> wrote:
> > Scanner pre-fetching is always faster, so something must be wrong with
> > your region server. Check the logs, top, etc
> >
> > WRT row size, it's pretty much a matter of how many bytes you have
> > in each column and sum them up (plus some overhead with the keys).
> >
> > You want filters, check the filter package in the javadoc.
> >
> > J-D
> >
> > On Tue, Oct 20, 2009 at 1:52 PM, Ananth T. Sarathy
> > <[email protected]> wrote:
> >> OK, but how come when I run a similar call (with fewer returned rows,
> >> 1000 vs 25k in the previous one), it runs through the iterator very
> >> quickly? (See below.)
> >>
> >> Also, how do I determine the row size? It's just text data, and really
> >> not much.
> >>
> >> Finally, is there a way to query for rows that do not have a column?
> >> (i.e., all rows without Files:path1)
> >>
> >>        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
> >>                "GS_Applications");
> >>
> >>        String[] columns = { "Files:path1" };
> >>        log.info("Getting all Rows with Files");
> >>        Scanner s = htdmni.getScannerForAllRows(columns);
> >>        log.info("Got all Rows with Files");
> >>
> >>        Iterator<RowResult> iter = s.iterator();
> >>        out.write("Application_Full_Name,Version,Application_installer_name,"
> >>                + "Operating System, Application_platform ,"
> >>                + "Application_sub_category,md5Hash,Sha1Hash,Sha256Hash,"
> >>                + "filepath,fileName,modified,size,operation\n");
> >>        out.write("<BR>");
> >>        while (iter.hasNext())
> >>        {
> >>
> >> Ananth T Sarathy
> >>
> >>
> >> On Tue, Oct 20, 2009 at 4:44 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >>
> >>> If you have a very slow data source (S3), then it fetches 100 rows
> >>> before coming back to your client with all of them and that can take a
> >>> lot of time. Also make sure that 100 of your rows can fit in a region
> >>> server's memory. How big is each row?
> >>>
> >>> J-D
> >>>
> >>> On Tue, Oct 20, 2009 at 1:32 PM, Ananth T. Sarathy
> >>> <[email protected]> wrote:
> >>> > I am running this code where
> >>> >
> >>> > getScannerForAllRows(columns) just does return table.getScanner(columns);
> >>> >
> >>> > and the table has setScannerCaching(100);
> >>> >
> >>> > But it spins forever after getting the iterator. Why would that be?
> >>> > How can I speed it up?
> >>> >
> >>> >        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
> >>> >                "GS_Applications");
> >>> >
> >>> >        String[] columns = { "Files:Name" };
> >>> >        log.info("Getting all Rows with Files");
> >>> >        Scanner s = htdmni.getScannerForAllRows(columns);
> >>> >        log.info("Got all Rows with Files");
> >>> >        log.info("Getting Iterator");
> >>> >
> >>> >        Iterator<RowResult> iter = s.iterator();
> >>> >        log.info("Got Iterator");
> >>> >
> >>> >        while (iter.hasNext())
> >>> >        {
> >>> >            log.info("Getting next Row");
> >>> >            RowResult rr = iter.next();
> >>> >
> >>> >
> >>> > Ananth T Sarathy
> >>> >
> >>>
> >>
> >
>
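
On the row size question, J-D's rule of thumb above (sum the bytes in each
column, plus some overhead for the keys) can be sketched in plain Java as
follows. This is a rough estimate, not an HBase API: the class and method are
hypothetical, the model assumes each stored cell repeats the row key and the
column name, and the per-cell overhead is a number you supply yourself rather
than a constant taken from HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Back-of-the-envelope row size estimate. Hypothetical helper; not HBase code. */
class RowSizeSketch {

    /**
     * Sums, for every cell in the row, the bytes of the row key, the column
     * name, and the value, plus a caller-supplied per-cell overhead.
     */
    static long estimateRowBytes(String rowKey, Map<String, String> cells,
                                 int perCellOverhead) {
        long keyBytes = rowKey.getBytes(StandardCharsets.UTF_8).length;
        long total = 0;
        for (Map.Entry<String, String> cell : cells.entrySet()) {
            long columnBytes = cell.getKey().getBytes(StandardCharsets.UTF_8).length;
            long valueBytes = cell.getValue().getBytes(StandardCharsets.UTF_8).length;
            total += keyBytes + columnBytes + valueBytes + perCellOverhead;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up row key and values, purely for illustration.
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("Files:path1", "/opt/app/bin");
        cells.put("Files:Name", "setup.exe");
        System.out.println(estimateRowBytes("app-0001", cells, 0) + " bytes");
    }
}
```

Multiplying the result by the scanner caching value (100 in this thread)
gives a rough idea of how much data one batch has to hold.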
