Clone the table, take the cloned table offline, use it for the map-reduce job, then delete it. All of this work can be done through the Java API, which is nice if you'll be running the job more than once.
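A rough sketch of that sequence, in case it helps. To keep it self-contained, TableOps below is a hypothetical stand-in that mirrors the clone, offline and delete methods on Accumulo's TableOperations (what you get from connector.tableOperations()), and JobRunner is a hypothetical hook where you would configure and submit the real job. The table names are made up. In the real job you would also need to tell AccumuloInputFormat to scan the offline table; I believe that is setOfflineTableScan(job, true) in 1.5 and a differently named setter in 1.4, so check the javadoc for your version. Depending on the version you may also have to wait for the clone to finish going offline before submitting.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

final class OfflineCloneJob {

    /** Hypothetical stand-in mirroring the three TableOperations calls used here. */
    interface TableOps {
        void clone(String src, String dst, boolean flush,
                   Map<String, String> propsToSet, Set<String> propsToExclude) throws Exception;
        void offline(String table) throws Exception;
        void delete(String table) throws Exception;
    }

    /** Hypothetical hook: configure and run the map-reduce job against the given table. */
    interface JobRunner {
        void run(String offlineTable) throws Exception;
    }

    static void runOnOfflineClone(TableOps ops, String source, String cloneName,
                                  JobRunner job) throws Exception {
        // flush=true so recent in-memory writes make it into the clone
        ops.clone(source, cloneName, true,
                Collections.<String, String>emptyMap(), Collections.<String>emptySet());
        try {
            ops.offline(cloneName);   // the job then reads the clone's files directly
            job.run(cloneName);
        } finally {
            ops.delete(cloneName);    // clean up the clone even if the job fails
        }
    }

    public static void main(String[] args) throws Exception {
        // Printing stub so the order of calls is visible without a cluster.
        TableOps printing = new TableOps() {
            public void clone(String src, String dst, boolean flush,
                              Map<String, String> set, Set<String> exclude) {
                System.out.println("clone " + src + " -> " + dst);
            }
            public void offline(String table) { System.out.println("offline " + table); }
            public void delete(String table) { System.out.println("delete " + table); }
        };
        runOnOfflineClone(printing, "people", "people_mr_clone",
                t -> System.out.println("run job on " + t));
    }
}
```

Putting the delete in a finally block means a failed job does not leave clones lying around, which matters if this gets run repeatedly.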

On Wed, Jan 15, 2014 at 8:27 PM, Corey Nolet <[email protected]> wrote:

> Matt,
>
> This should help:
>
> Collection<Pair<Text,Text>> cols = Collections.singleton(new
> Pair<Text,Text>(new Text("cityOfBirth"), null));
> AccumuloInputFormat.fetchColumns(job, cols);
>
>
> On Wed, Jan 15, 2014 at 7:29 PM, Dickson, Matt MR <
> [email protected]> wrote:
>
>> *UNOFFICIAL*
>> Thanks Keith. I've run a simple mr job based on the UniqueColumns
>> example, but due to the size of the table this is taking a very long time.
>> Is it possible to pre-filter the data that goes to the MR job based on
>> family, eg only run the MR job on columns with a specific column family of
>> 'cityofbirth'? I am currently going through every column in the table and
>> checking the column family in the mapper ... slow.
>>
>> ------------------------------
>> *From:* Keith Turner [mailto:[email protected]]
>> *Sent:* Wednesday, 15 January 2014 12:06
>> *To:* [email protected]
>> *Subject:* Re: List of unique qualifiers [SEC=UNOFFICIAL]
>>
>> On Tue, Jan 14, 2014 at 6:06 PM, Dickson, Matt MR <
>> [email protected]> wrote:
>>
>>> *UNOFFICIAL*
>>> Just for simplicity, this is a one of request for managment so I was
>>> hoping to just scan via the shell and output to a file.
>>>
>>> If I need to do it via a mr job I can do it that way and would be keen
>>> to hear any suggestions.
>>>
>>
>> You could modify the following example in 1.4 to suit your needs.
>>
>> src/examples/simple/src/main/java/org/apache/accumulo/examples/simple/mapreduce/UniqueColumns.java
>>
>>> ------------------------------
>>> *From:* David Medinets [mailto:[email protected]]
>>> *Sent:* Wednesday, 15 January 2014 09:36
>>> *To:* accumulo-user
>>> *Subject:* Re: List of unique qualifiers [SEC=UNOFFICIAL]
>>>
>>> Why the restriction to the shell environment? A nice map-reduce job
>>> would be ideal for this task.
>>>
>>> On Tue, Jan 14, 2014 at 5:30 PM, Dickson, Matt MR <
>>> [email protected]> wrote:
>>>
>>>> *UNOFFICIAL*
>>>> Hi,
>>>>
>>>> I need to extract a list of unique qualifier values on a table from the
>>>> Accumulo shell. For every column there is a column family that identifies
>>>> a specific qualifer, eg 'cityofbirth'. I would like to get a unique list
>>>> of all cities that are a listed in the qualifier against 'cityofbirth' for
>>>> all rows.
>>>>
>>>> eg, If I had a table with
>>>>
>>>> Rowid  Family       Qual
>>>> 123    cityofbirth  LosAngeles
>>>> 133    cityofbirth  Brisbane
>>>> 222    cityofbirth  London
>>>> 124    cityofbirth  London
>>>> 124    cityofbirth  London
>>>>
>>>> I want a list that is just;
>>>> LosAngeles
>>>> London
>>>> Brisbane
>>>>
>>>> Any suggestions on how to achieve this from the shell would great.
>>>>
>>>> Thanks in advance.
>>>> Matt
