Arshak,

Looks like that was a bug in 1.5.0 that was fixed in 1.5.1.

https://issues.apache.org/jira/browse/ACCUMULO-1571
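
Until you can upgrade, the usage output you pasted suggests 1.5.0 registers the flag under the misspelled name --historgram. A stopgap along these lines might work. That is an assumption read off the usage text, not something checked against a 1.5.0 install, and RFILE is just a variable name introduced here:

```shell
PACKAGE=org.apache.accumulo.core.file.rfile
RFILE=/accumulo/tables/53/t-0003371/A0003jbg.rf

# Assumption: 1.5.0 accepts the misspelled flag that its own usage text lists.
# Run from the Accumulo install directory; skipped if bin/accumulo is absent.
if [ -x bin/accumulo ]; then
  bin/accumulo "$PACKAGE.PrintInfo" --historgram "$RFILE"
fi
```

On 1.5.1 or later the correctly spelled --histogram should be the one to use.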

On 4/8/14, 7:24 PM, Arshak Navruzyan wrote:
I am trying to print out the histogram with that command, but I get the
usage message instead. The --dump option works fine. I'm on Accumulo
1.5.0.

PACKAGE=org.apache.accumulo.core.file.rfile
bin/accumulo $PACKAGE.PrintInfo --histogram /accumulo/tables/53/t-0003371/A0003jbg.rf

Usage: org.apache.accumulo.core.file.rfile.PrintInfo [options] <file> { <file> ... }

   Options:
     -d, --dump
        dump the key/value pairs
        Default: false
     -h, -?, --help, -help
        Default: false
         --historgram
        print a histogram of the key-value sizes
        Default: false


Unknown option: --histogram



On Sat, Feb 22, 2014 at 8:47 AM, Mike Drob wrote:

    There's not a single good way that I am aware of, but there are a
    couple ways that will get you close.

    First, you can use the SortedKeyIterator to truncate values and
    potentially save yourself a lot of data transfer.
    Second, each RFile header block will track the columns contained, up
    to 1000 (possibly configurable). Check out PrintInfo[1].
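
    To see what the header tracks, running PrintInfo with no options on a
    file should print its locality groups and the columns recorded for
    each (column families, as far as I know, not qualifiers). A rough
    sketch, assuming a hypothetical file path and that you run it from
    the Accumulo install directory:

    ```shell
    PACKAGE=org.apache.accumulo.core.file.rfile
    RFILE=/path/to/some/file.rf   # hypothetical path; substitute a real RFile

    # With no options, PrintInfo prints only the file's header metadata.
    # Skipped when bin/accumulo is not present in the current directory.
    if [ -x bin/accumulo ]; then
      bin/accumulo "$PACKAGE.PrintInfo" "$RFILE"
    fi
    ```

    You would still need to run this over every file in the table and
    merge the results yourself.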

    Mike

    [1]: https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/file/rfile/PrintInfo.java


    On Sat, Feb 22, 2014 at 11:25 AM, Arshak Navruzyan wrote:

        I don't know the inner workings of RFiles well enough, but I was
        wondering if there is a faster way to get a unique list of
        columns in Accumulo (short of doing a full MapReduce). Is there
        some way to skip past all the values and just get to the next
        column?

        Thanks


