Hi, I am doing a bulk load into HBase in HFile format using
saveAsNewAPIHadoopFile. When I try to write, I get the following exception:

java.io.IOException: Added a key not lexically larger than previous.

Here is the code snippet:
// one HBase row key paired with one KeyValue (cell)
case class HBaseRow(rowKey: ImmutableBytesWritable, kv: KeyValue)

// read the Avro input and keep only the columns we need
val kAvroDF = sqlContext.read.format("com.databricks.spark.avro").load(args(0))
val kRDD = kAvroDF.select("seqid", "mi", "moc", "FID", "WID").rdd

// preparePUT builds a list of HBaseRow per record, sorted on KeyValue
val trRDD = kRDD.map(a => preparePUT(a(1).asInstanceOf[String],
                                     a(3).asInstanceOf[String]))

// flatten to (rowKey, KeyValue) pairs and write out the HFiles
val kvRDD = trRDD.flatMap(a => a).map(a => (a.rowKey, a.kv))
saveAsHFile(kvRDD, args(1))
preparePUT returns a list of HBaseRow(ImmutableBytesWritable, KeyValue)
sorted on KeyValue. I then flatMap that RDD into an
RDD[(ImmutableBytesWritable, KeyValue)] and pass it to saveAsHFile.
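In case it matters, saveAsHFile is a thin wrapper around
saveAsNewAPIHadoopFile, roughly like this (a rough sketch only; I am
assuming HFileOutputFormat2 here, and the actual table/compression
configuration is omitted):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.KeyValue
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
import org.apache.spark.rdd.RDD

// Sketch of saveAsHFile: write (rowKey, KeyValue) pairs as HFiles.
// A real version would also configure the output for the target table
// (e.g. via HFileOutputFormat2.configureIncrementalLoad).
def saveAsHFile(rdd: RDD[(ImmutableBytesWritable, KeyValue)],
                path: String): Unit = {
  val conf: Configuration = HBaseConfiguration.create()
  rdd.saveAsNewAPIHadoopFile(
    path,
    classOf[ImmutableBytesWritable],
    classOf[KeyValue],
    classOf[HFileOutputFormat2],
    conf)
}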
Does the flatMap operation on an RDD change the sorted order?
Can anyone tell me how to resolve this issue?
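To make the question concrete: my understanding is that the HFile writer
needs every KeyValue in total sorted order across the whole RDD (by row,
then family, then qualifier), not just within each list returned by
preparePUT. Would an explicit global sort like the one below be the right
fix? This is only a sketch of what I have in mind; the Tuple3 key is my
own construction, and the shuffle it triggers may need Kryo serialization
since ImmutableBytesWritable/KeyValue are not java.io.Serializable.

import org.apache.hadoop.hbase.CellUtil

// Key each cell by (row, family, qualifier). ImmutableBytesWritable
// compares lexicographically byte-by-byte, so the Tuple3 ordering
// should match what the HFile writer expects (ignoring timestamp/type
// ties for duplicate cells).
val globallySorted = kvRDD
  .map { case (_, kv) =>
    ((new ImmutableBytesWritable(CellUtil.cloneRow(kv)),
      new ImmutableBytesWritable(CellUtil.cloneFamily(kv)),
      new ImmutableBytesWritable(CellUtil.cloneQualifier(kv))),
     kv)
  }
  .sortByKey()  // global sort across partitions (triggers a shuffle)
  .map { case ((row, _, _), kv) => (row, kv) }

saveAsHFile(globallySorted, args(1))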
Thanks,
-Yeshwanth