Matt,
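This should help. It restricts the input to one column family, so only
matching entries reach the mappers:
// A null qualifier fetches every column in the family. The family must match
// the stored bytes exactly (the sample data below uses 'cityofbirth').
Collection<Pair<Text,Text>> cols =
    Collections.singleton(new Pair<Text,Text>(new Text("cityOfBirth"), null));
AccumuloInputFormat.fetchColumns(job, cols);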
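For reference, once the input is limited to one family, the remaining work is
just de-duplicating qualifiers. Below is a minimal plain-Java sketch of that
step, with no Accumulo or Hadoop dependency, run against the sample rows from
the original question. The names UniqueQualifiers, Entry, sampleEntries and
uniqueQualifiers are hypothetical and only there for illustration; this is not
the UniqueColumns example itself.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UniqueQualifiers {

    /** One table entry: row ID, column family, column qualifier (illustrative type). */
    static final class Entry {
        final String row;
        final String family;
        final String qualifier;

        Entry(String row, String family, String qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    /** Returns the distinct qualifiers stored under the given family, in first-seen order. */
    static Set<String> uniqueQualifiers(List<Entry> entries, String family) {
        Set<String> unique = new LinkedHashSet<>();
        for (Entry e : entries) {
            // Exact, case-sensitive comparison: "cityOfBirth" does not match "cityofbirth".
            if (e.family.equals(family)) {
                unique.add(e.qualifier);
            }
        }
        return unique;
    }

    /** The sample rows from the original question, including the duplicate row 124. */
    static List<Entry> sampleEntries() {
        return Arrays.asList(
            new Entry("123", "cityofbirth", "LosAngeles"),
            new Entry("133", "cityofbirth", "Brisbane"),
            new Entry("222", "cityofbirth", "London"),
            new Entry("124", "cityofbirth", "London"),
            new Entry("124", "cityofbirth", "London"));
    }

    public static void main(String[] args) {
        // Prints LosAngeles, Brisbane, London (one per line).
        for (String qualifier : uniqueQualifiers(sampleEntries(), "cityofbirth")) {
            System.out.println(qualifier);
        }
    }
}
```

Note that asking for "cityOfBirth" against this sample data returns nothing,
which is why the family passed to fetchColumns needs to match the table exactly.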
On Wed, Jan 15, 2014 at 7:29 PM, Dickson, Matt MR <
[email protected]> wrote:
> *UNOFFICIAL*
> Thanks Keith. I've run a simple MR job based on the UniqueColumns
> example, but due to the size of the table it is taking a very long time.
> Is it possible to pre-filter the data that goes to the MR job based on
> column family, e.g. only run the MR job on columns with a column family of
> 'cityofbirth'? I am currently going through every column in the table and
> checking the column family in the mapper, which is slow.
>
>
>
> ------------------------------
> *From:* Keith Turner [mailto:[email protected]]
> *Sent:* Wednesday, 15 January 2014 12:06
> *To:* [email protected]
>
> *Subject:* Re: List of unique qualifiers [SEC=UNOFFICIAL]
>
>
>
>
> On Tue, Jan 14, 2014 at 6:06 PM, Dickson, Matt MR <
> [email protected]> wrote:
>
>> *UNOFFICIAL*
>> Just for simplicity: this is a one-off request for management, so I was
>> hoping to just scan via the shell and output to a file.
>>
>> If I need to do it via an MR job I can do it that way, and would be keen to
>> hear any suggestions.
>>
>
> You could modify the following example in 1.4 to suit your needs.
>
>
> src/examples/simple/src/main/java/org/apache/accumulo/examples/simple/mapreduce/UniqueColumns.java
>
>
>>
>> ------------------------------
>> *From:* David Medinets [mailto:[email protected]]
>> *Sent:* Wednesday, 15 January 2014 09:36
>> *To:* accumulo-user
>> *Subject:* Re: List of unique qualifiers [SEC=UNOFFICIAL]
>>
>> Why the restriction to the shell environment? A nice map-reduce job
>> would be ideal for this task.
>>
>>
>> On Tue, Jan 14, 2014 at 5:30 PM, Dickson, Matt MR <
>> [email protected]> wrote:
>>
>>> *UNOFFICIAL*
>>> Hi,
>>>
>>> I need to extract a list of unique qualifier values on a table from the
>>> Accumulo shell. For every column there is a column family that identifies
>>> a specific qualifier, e.g. 'cityofbirth'. I would like to get a unique list
>>> of all cities that are listed in the qualifier against 'cityofbirth' for
>>> all rows.
>>>
>>> E.g., if I had a table with:
>>>
>>> Rowid  Family       Qual
>>> 123    cityofbirth  LosAngeles
>>> 133    cityofbirth  Brisbane
>>> 222    cityofbirth  London
>>> 124    cityofbirth  London
>>> 124    cityofbirth  London
>>>
>>> I want a list that is just:
>>> LosAngeles
>>> London
>>> Brisbane
>>>
>>> Any suggestions on how to achieve this from the shell would be great.
>>>
>>> Thanks in advance.
>>> Matt