Excuse me, 'col3' sorts lexicographically *after* 'col16'.

On Thu, Dec 6, 2012 at 9:07 AM, William Slacum <[email protected]> wrote:
> 'col3' sorts lexicographically before 'col16'. You'll either need to
> encode your numerics or zero pad them.
>
> On Thu, Dec 6, 2012 at 9:03 AM, Andrew Catterall <[email protected]> wrote:
>
>> Hi,
>>
>> I am trying to run a bulk ingest to import data into Accumulo but it is
>> failing at the reduce task with the below error:
>>
>> java.lang.IllegalStateException: Keys appended out-of-order. New key
>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:col3 [myVis] 9223372036854775807 false, previous key
>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:col16 [myVis] 9223372036854775807 false
>>     at org.apache.accumulo.core.file.rfile.RFile$Writer.append(RFile.java:378)
>>
>> Could this be caused by the order in which the writes are being done?
>>
>> *-- Background*
>>
>> The input file is a tab separated file. A sample row would look like:
>>
>> Data1 Data2 Data3 Data4 Data5 … DataN
>>
>> The map parses the data, for each row, into a Map<String, String>. This
>> will contain the following:
>>
>> Col1 Data1
>> Col2 Data2
>> Col3 Data3
>> …
>> ColN DataN
>>
>> An outputKey is then generated for this row in the format
>> *client@timeStamp@randomUUID*
>>
>> Then for each entry in Map<String, String> an outputValue is generated in
>> the format *ColN|DataN*
>>
>> The outputKey and outputValue are written to Context.
>>
>> This completes successfully, however, the reduce task fails.
>>
>> My ReduceClass is as follows:
>>
>> public static class ReduceClass extends Reducer<Text,Text,Key,Value> {
>>
>>     public void reduce(Text key, Iterable<Text> keyValues, Context output)
>>             throws IOException, InterruptedException {
>>
>>         // for each value belonging to the key
>>         for (Text keyValue : keyValues) {
>>
>>             // split the keyValue into Col and Data
>>             String[] values = keyValue.toString().split("\\|");
>>
>>             // Generate key
>>             Key outputKey = new Key(key, new Text("foo"), new Text(values[0]), new Text("myVis"));
>>
>>             // Generate value
>>             Value outputValue = new Value(values[1].getBytes(), 0, values[1].length());
>>
>>             // Write to context
>>             output.write(outputKey, outputValue);
>>         }
>>     }
>> }
>>
>> *-- Expected output*
>>
>> I am expecting the contents of the Accumulo table to be as follows:
>>
>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col1 [myVis] Data1
>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col2 [myVis] Data2
>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col3 [myVis] Data3
>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col4 [myVis] Data4
>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col5 [myVis] Data5
>> …
>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:ColN [myVis] DataN
>>
>> Thanks,
>> Andrew
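
For anyone finding this thread later, here is a rough sketch of the zero-padding idea in plain Java, with no Accumulo or Hadoop dependency so it can be run on its own. The class QualifierOrder and the helpers zeroPad and sortedQualifiers are made-up names for illustration, not part of any library. It assumes ASCII column names (so String ordering matches the byte ordering the RFile writer checks) and assumes that Hadoop hands the values for a key to the reducer in no particular order, which is why the sketch sorts them in a TreeMap before anything would be written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

public class QualifierOrder {

    // Hypothetical helper: left-pads the trailing number of a column name
    // with zeros, e.g. ("col3", 4) -> "col0003". Names without a trailing
    // number are returned unchanged.
    static String zeroPad(String name, int width) {
        int i = name.length();
        while (i > 0 && Character.isDigit(name.charAt(i - 1))) {
            i--;
        }
        if (i == name.length()) {
            return name;
        }
        long n = Long.parseLong(name.substring(i));
        return name.substring(0, i) + String.format(Locale.ROOT, "%0" + width + "d", n);
    }

    // Hypothetical helper: takes "ColN|DataN" strings in arbitrary order and
    // returns the padded qualifiers in sorted order, which is the order a
    // reducer would have to emit them in. Assumes every input contains a '|'.
    static List<String> sortedQualifiers(List<String> colValuePairs, int width) {
        TreeMap<String, String> sorted = new TreeMap<>();
        for (String pair : colValuePairs) {
            String[] parts = pair.split("\\|", 2);
            sorted.put(zeroPad(parts[0], width), parts[1]);
        }
        return new ArrayList<>(sorted.keySet());
    }

    public static void main(String[] args) {
        // Unpadded: "col3" compares greater than "col16".
        System.out.println("col3".compareTo("col16") > 0);
        // Padded: numeric order and lexicographic order agree.
        System.out.println(sortedQualifiers(List.of("col16|b", "col3|a", "col1|c"), 4));
    }
}
```

In the reducer from the original message, the same pattern would mean collecting each (qualifier, value) pair into a sorted map first and calling output.write only while iterating that map. The padding width has to be at least as large as the longest column number, otherwise the numeric and lexicographic orders diverge again.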
