Depending on the amount of data, you could run a scan -c in the shell for
the colfams you want, use awk to pull out the colqual, and dump that to a
file. Afterwards, you could sort and uniq the file.
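A rough sketch of that pipeline, in case it helps. It assumes the shell's default output format (row, then colfam:colqual, then visibility, then value) and that the row IDs and qualifiers contain no whitespace. The function name unique_qualifiers, the user, the table name, and the output file are all made up for illustration, so adjust them to your setup.

```shell
# Read Accumulo shell scan output on stdin and print each distinct
# column qualifier once. Assumes lines look like:
#   123 cityofbirth:LosAngeles []    somevalue
# i.e. the second whitespace-separated field is "colfam:colqual".
unique_qualifiers() {
  # Strip everything up to and including the first ':' (the colfam),
  # leaving the colqual, then sort and de-duplicate in one step.
  awk 'NF >= 2 { cq = $2; sub(/^[^:]*:/, "", cq); print cq }' | LC_ALL=C sort -u
}

# Hypothetical invocation (user and table names are placeholders):
#   accumulo shell -u myuser -e "scan -t mytable -c cityofbirth -np" \
#     | unique_qualifiers > cities.txt
```

The -np flag turns off pagination so the scan does not stop and wait for a keypress. sort -u does the same job as sort followed by uniq. If the shell prints anything other than entries (a banner, for instance), you would want to filter that out before the awk step.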
The MR example would be pretty simple too -- same idea as above. Very
similar to your run-of-the-mill wordcount. AccumuloInputFormat will let
you just fetch the colfams you're interested in.
map:
  foreach Key in colfams:
    emit colqual
reduce:
  emit one instance of each colqual.
On 1/14/14, 6:06 PM, Dickson, Matt MR wrote:
*UNOFFICIAL*
Just for simplicity, this is a one-off request for management, so I was
hoping to just scan via the shell and output to a file.
If I need to do it via a mr job I can do it that way and would be keen
to hear any suggestions.
------------------------------------------------------------------------
*From:* David Medinets [mailto:[email protected]]
*Sent:* Wednesday, 15 January 2014 09:36
*To:* accumulo-user
*Subject:* Re: List of unique qualifiers [SEC=UNOFFICIAL]
Why the restriction to the shell environment? A nice map-reduce job
would be ideal for this task.
On Tue, Jan 14, 2014 at 5:30 PM, Dickson, Matt MR
<[email protected]> wrote:
*UNOFFICIAL*
Hi,
I need to extract a list of unique qualifier values on a table from
the Accumulo shell. For every column there is a column family that
identifies a specific qualifier, e.g. 'cityofbirth'. I would like to
get a unique list of all cities that are listed in the qualifier
against 'cityofbirth' for all rows.
E.g., if I had a table with:
Rowid  Family       Qual
123    cityofbirth  LosAngeles
133    cityofbirth  Brisbane
222    cityofbirth  London
124    cityofbirth  London
124    cityofbirth  London
I want a list that is just:
LosAngeles
London
Brisbane
Any suggestions on how to achieve this from the shell would be great.
Thanks in advance.
Matt