As I said yesterday, you can check the logs, top, etc. One particular
thing of interest would be a jstack of your region server's process
while it's scanning and not returning.

J-D

On Wed, Oct 21, 2009 at 7:19 AM, Ananth T. Sarathy
<[email protected]> wrote:
> Anyone have any further thoughts on this?
> Ananth T Sarathy
>
>
> On Tue, Oct 20, 2009 at 6:37 PM, Ananth T. Sarathy <[email protected]> wrote:
>
>> Well, that's not the case. Every row has that column. In fact, the second
>> snippet I sent uses a column with far fewer rows (1k vs 25k) but comes
>> back pretty quickly.
>>
>> By forever, I mean I have watched my logs do nothing for half an hour before
>> giving up.
>>
>>
>> Ananth T Sarathy
>>
>>
>>
>> On Tue, Oct 20, 2009 at 5:03 PM, Ryan Rawson <[email protected]> wrote:
>>
>>> If you are asking for a column that is very sparse or doesn't exist,
>>> it will cause HBase to read through the entire region to find 100
>>> matching rows. This could take a while. You said 'forever', but could
>>> you quantify that?
>>>
>>> On Tue, Oct 20, 2009 at 1:58 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>> > Scanner pre-fetching is always faster, so something must be wrong with
>>> > your region server. Check the logs, top, etc.
>>> >
>>> > WRT row size, it's pretty much a matter of summing the bytes you have
>>> > in each column (plus some overhead for the keys).
>>> >
>>> > You want filters; check the filter package in the javadoc.
>>> >
>>> > J-D
>>> >
>>> > On Tue, Oct 20, 2009 at 1:52 PM, Ananth T. Sarathy
>>> > <[email protected]> wrote:
>>> >> Ok, but how come when I run a similar call (with fewer returned rows,
>>> >> 1000 vs 25k in the previous one) it runs through the iterator very
>>> >> quickly? (See below.)
>>> >>
>>> >> Also, how do I determine the row size? It's just text data, and really
>>> >> not much.
>>> >>
>>> >> Finally, is there a way to query for rows that do not have a column?
>>> >> (I.e., all rows without Files:path1)
>>> >>
>>> >>        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
>>> >>                "GS_Applications");
>>> >>
>>> >>        String[] columns = { "Files:path1" };
>>> >>        log.info("Getting all Rows with Files");
>>> >>        Scanner s = htdmni.getScannerForAllRows(columns);
>>> >>        log.info("Got all Rows with Files");
>>> >>
>>> >>        Iterator<RowResult> iter = s.iterator();
>>> >>        out.write("Application_Full_Name,Version,Application_installer_name,"
>>> >>                + "Operating System, Application_platform ,Application_sub_category,"
>>> >>                + "md5Hash,Sha1Hash,Sha256Hash,filepath,fileName,modified,size,operation\n");
>>> >>        out.write("<BR>");
>>> >>        while (iter.hasNext())
>>> >>        {
>>> >>
>>> >> Ananth T Sarathy
>>> >>
>>> >>
>>> >> On Tue, Oct 20, 2009 at 4:44 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>> >>
>>> >>> If you have a very slow data source (S3), then it fetches 100 rows
>>> >>> before coming back to your client with all of them, and that can take a
>>> >>> lot of time. Also make sure that 100 of your rows can fit in a region
>>> >>> server's memory. How big is each row?
>>> >>>
>>> >>> J-D
>>> >>>
>>> >>> On Tue, Oct 20, 2009 at 1:32 PM, Ananth T. Sarathy
>>> >>> <[email protected]> wrote:
>>> >>> > I am running this code where
>>> >>> >
>>> >>> > getScannerForAllRows(columns) just does return table.getScanner(columns);
>>> >>> >
>>> >>> > and the table has setScannerCaching(100);
>>> >>> >
>>> >>> > But it spins forever after getting the iterator. Why would that be? How
>>> >>> > can I speed it up?
>>> >>> >
>>> >>> >        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
>>> >>> >                "GS_Applications");
>>> >>> >
>>> >>> >        String[] columns = { "Files:Name" };
>>> >>> >        log.info("Getting all Rows with Files");
>>> >>> >        Scanner s = htdmni.getScannerForAllRows(columns);
>>> >>> >        log.info("Got all Rows with Files");
>>> >>> >        log.info("Getting Iterator");
>>> >>> >
>>> >>> >        Iterator<RowResult> iter = s.iterator();
>>> >>> >        log.info("Got Iterator");
>>> >>> >
>>> >>> >        while (iter.hasNext())
>>> >>> >        {
>>> >>> >            log.info("Getting next Row");
>>> >>> >            RowResult rr = iter.next();
>>> >>> >
>>> >>> >
>>> >>> > Ananth T Sarathy
>>> >>> >
>>> >>>
>>> >>
>>> >
>>>
>>
>>
>
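
For reference, the row-size rule of thumb quoted above (sum the bytes in
each column, plus some overhead for the keys) and the "can 100 rows fit in
memory" check could be sketched roughly as below. This is plain Java with no
HBase dependency. The class name, the method names, the sample values, and the
20-byte per-cell overhead are all assumptions made for illustration, not
figures or calls taken from HBase. It also assumes the row key is stored once
per cell, which is why the key length is added for every column.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for back-of-the-envelope sizing; not part of the HBase API.
final class RowSizeEstimate {

    // Assumed per-cell bookkeeping (timestamp, type, length fields).
    // A rough guess for illustration, not a number taken from HBase.
    static final int ASSUMED_CELL_OVERHEAD_BYTES = 20;

    /** Estimates one row: for every cell, row key + column name + value + overhead. */
    static long estimateRowBytes(String rowKey, Map<String, String> columns) {
        long keyLen = rowKey.getBytes(StandardCharsets.UTF_8).length;
        long total = 0;
        for (Map.Entry<String, String> cell : columns.entrySet()) {
            total += keyLen
                    + cell.getKey().getBytes(StandardCharsets.UTF_8).length
                    + cell.getValue().getBytes(StandardCharsets.UTF_8).length
                    + ASSUMED_CELL_OVERHEAD_BYTES;
        }
        return total;
    }

    /** Bytes held for one scanner batch, e.g. with setScannerCaching(100). */
    static long estimateBatchBytes(long rowBytes, int scannerCaching) {
        return rowBytes * scannerCaching;
    }

    public static void main(String[] args) {
        // Sample row with made-up values for the two columns named in the thread.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("Files:Name", "setup.exe");
        columns.put("Files:path1", "/apps/setup.exe");

        long rowBytes = estimateRowBytes("app-0001", columns);
        System.out.println("row ~" + rowBytes + " bytes, batch of 100 ~"
                + estimateBatchBytes(rowBytes, 100) + " bytes");
    }
}
```

With small text rows like these, a batch of 100 is tiny, which would point
away from memory pressure and toward the region server or the underlying
storage as the place to look.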
