You will have to write your own InputFormat class that parses your file and passes records to your mapper.
-Eric

On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis wrote:

> Looking at the BulkIngestExample, it uses GenerateTestData and creates a
> .txt file which contains Key: Value pairs, and correct me if I am wrong but
> each new line is a new row, right?
>
> I need to know how to have family and qualifiers also. In other words,
>
> 1) Do I set up a .txt file that can be converted into an Accumulo RFile
> using AccumuloFileOutputFormat, which can then be imported into my table?
>
> 2) If yes, what is the format of the .txt file?
>
>
> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton wrote:
>
>> Your data needs to be in the RFile format, and more importantly it needs
>> to be sorted.
>>
>> It's handy to use a Map/Reduce job to convert/sort your data. See the
>> BulkIngestExample.
>>
>> -Eric
>>
>>
>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis wrote:
>>
>>> I have some data in a text file in the following format.
>>>
>>> rowid1 columnFamily1 colQualifier1 value
>>> rowid1 columnFamily1 colQualifier2 value
>>> rowid1 columnFamily2 colQualifier1 value
>>> rowid2 columnFamily1 colQualifier1 value
>>> rowid3 columnFamily1 colQualifier1 value
>>>
>>> I want to import this data into a table in Accumulo. My end goal is to
>>> understand how to use the BulkImport feature in Accumulo. I tried to log in
>>> to the Accumulo shell as root and then run:
>>>
>>> #table mytable
>>> #importdirectory /home/inputDir /home/failureDir true
>>>
>>> but it didn't work. My data file was saved as data.txt in
>>> /home/inputDir. I tried to create the dir/file structure in HDFS and Linux
>>> but neither worked. When trying locally, it keeps complaining about
>>> failureDir not existing.
>>> ...
>>> java.io.FileNotFoundException: File does not exist: failures
>>>
>>> When trying with files on HDFS, I get no error on the console but the
>>> logger had the following messages:
>>> ...
>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt does not
>>> have a valid extension, ignoring
>>>
>>> or,
>>>
>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt is not a
>>> map file, ignoring
>>>
>>>
>>> Suggestions? Am I not setting up the job right? Thank you for help in
>>> advance.
>>>
>>>
>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis wrote:
>>>
>>>> I have some data in a text file in the following format:
>>>>
>>>> rowid1 columnFamily colQualifier value
>>>> rowid1 columnFamily colQualifier value
>>>> rowid1 columnFamily colQualifier value
>>>>
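
For the four-column text format quoted in this thread (row ID, column family, column qualifier, value), the parse-and-sort step that has to happen before any RFile is written might look roughly like the sketch below. It is plain Java with no Hadoop or Accumulo dependencies, so it only illustrates the logic; in a real job this would sit inside the Map/Reduce code and emit Accumulo Key/Value objects instead. The class name BulkLineSketch and its Entry type are hypothetical, and the sketch assumes whitespace-separated fields, ASCII data, and a value that is simply the rest of the line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration only; not part of Accumulo or Hadoop.
final class BulkLineSketch {

    /** One parsed input line: row ID, column family, column qualifier, value. */
    static final class Entry {
        final String row;
        final String family;
        final String qualifier;
        final String value;

        Entry(String row, String family, String qualifier, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }

        @Override
        public String toString() {
            return row + " " + family + ":" + qualifier + " -> " + value;
        }
    }

    /**
     * Splits a line into exactly four fields. Assumption: fields are separated
     * by whitespace and the value is everything after the third field.
     */
    static Entry parseLine(String line) {
        String[] parts = line.trim().split("\\s+", 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields but got: " + line);
        }
        return new Entry(parts[0], parts[1], parts[2], parts[3]);
    }

    /**
     * Parses all non-blank lines and sorts them by row, then family, then
     * qualifier. For ASCII data this matches the leading part of Accumulo's
     * key ordering; visibility and timestamp are ignored in this sketch.
     */
    static List<Entry> parseAndSort(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            entries.add(parseLine(line));
        }
        entries.sort(Comparator.comparing((Entry e) -> e.row)
                .thenComparing(e -> e.family)
                .thenComparing(e -> e.qualifier));
        return entries;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "rowid2 columnFamily1 colQualifier1 value",
                "rowid1 columnFamily2 colQualifier1 value",
                "rowid1 columnFamily1 colQualifier2 value",
                "rowid1 columnFamily1 colQualifier1 value");
        for (Entry e : parseAndSort(lines)) {
            System.out.println(e);
        }
    }
}
```

In a Map/Reduce job the sort is normally done by the shuffle rather than in memory, which is why the BulkIngestExample approach scales to files that do not fit on one machine.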
