You'll need to add the '-np' (no pagination) option to the scan command as well.

On 10/11/2013 03:05 PM, Jared Winick wrote:
After following the commands Eric lists to set the iterator on that table, instead of running 'egrep' in the shell, you could run this from the Linux command line:

accumulo shell -u username -p password -e "scan -t foo" | wc -l
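
If the output contains lines other than entries, or the iterator has not been set on the table, a small filter could count distinct row IDs directly. This is only a sketch: the function name count_rows is made up for illustration, and it assumes every entry line has the "row cf:cq [vis] value" layout shown in Eric's transcript and that row IDs contain no whitespace.

```shell
# count_rows (hypothetical helper): read Accumulo shell scan output on stdin
# and print the number of distinct row IDs.
# Assumption: entry lines look like "row cf:cq [vis]    value" and the
# row ID (first field) contains no whitespace.
count_rows() {
  # Keep only lines whose second field looks like "cf:cq", emit the row ID,
  # de-duplicate, and count. tr strips the padding some wc versions add.
  awk 'NF >= 2 && $2 ~ /:/ { print $1 }' | sort -u | wc -l | tr -d ' '
}

# Usage, with the same placeholder credentials and table as above:
#   accumulo shell -u username -p password -e "scan -t foo -np" | count_rows
```

Like the plain scan, this streams every entry back to the client, so it is only practical for modest test data sets.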


On Fri, Oct 11, 2013 at 11:42 AM, Eric Newton <[email protected]> wrote:

    You can stack a counting Combiner over the FirstEntryInRowIterator and
    batch scan the table. If it's just a test data set with under a
    billion rows, you can just count the result set coming out of the
    FirstEntryInRowIterator.  You'll be I/O bound at the client, but it
    will work.

    This does it with the shell, but the output is kinda voluminous:

    root@test> createtable foo
    root@test foo> insert row1 cf col1 value
    root@test foo> insert row1 cf col2 value
    root@test foo> insert row1 cf col999 value
    root@test foo> insert row2 cf col1 value
    root@test foo> scan
    row1 cf:col1 []    value
    row1 cf:col2 []    value
    row1 cf:col999 []    value
    row2 cf:col1 []    value
    root@test foo> setiter -class org.apache.accumulo.core.iterators.FirstEntryInRowIterator -p 99 -scan
    Only allows iteration over the first entry per row
    ----------> set FirstEntryInRowIterator parameter scansBeforeSeek, Number of scans to try before seeking [10]: 10
    root@test foo> egrep .*
    row1 cf:col1 []    value
    row2 cf:col1 []    value


    On Fri, Oct 11, 2013 at 10:53 AM, Terry P. <[email protected]> wrote:
    > Hi guys,
    > I'm still a bit of a newbie as I'm more of an admin than a developer, and
    > now that formal testing has begun, I have testers asking me how to get a
    > total count of records in Accumulo for verification purposes after test
    > ingests have been run.
    >
    > In our case when I say "records" I mean the number of distinct rowkeys, not
    > the total number of entries.
    >
    > Is there any way to do this using just the Accumulo shell, maybe by writing
    > an aggregator or other class that can be run from within the Accumulo shell?
    >
    > Many thanks in advance,
    > Terry
    >
    >
    > On Tue, Jan 22, 2013 at 6:03 PM, Terry P. <[email protected]> wrote:
    >>
    >> Greetings everyone,
    >> I want to simply get the total count of rows in a table using the accumulo
    >> shell.  I'm very new to Accumulo so I apologize if it's a newbie question.
    >>
    >> I'm prototyping with the accumulo shell, and love how it can ingest
    >> records using execfile, so I've used Python to generate a lot of test data.
    >> For some test cases in this sprint I need to verify the rows loaded match
    >> what's expected, hence the reason I need to get the total rows in a table.
    >>
    >> I'd bet there is some way to use setiter or setscaniter with the -agg
    >> option, but I can't figure it out.
    >>
    >> Any help would be greatly appreciated.
    >>
    >> Best regards,
    >> Terry
    >
    >


