Hi,
Here is my code:
import org.apache.hadoop.hbase.CellUtil
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.util.Bytes
// Converter trait as used by the Spark Python converter examples.
import org.apache.spark.api.python.Converter

/**
 * JF: Converts a Result object into a string that includes the column family
 * and qualifier names, e.g.
 * 'columnfamily1:columnqualifier1:value1;columnfamily2:columnqualifier2:value2'.
 * Cells are separated by ';'. Within a cell, the family, qualifier and value
 * are separated by ':'.
 * The row key is not needed here because it has already been converted by
 * ImmutableBytesWritableToStringConverter.
 */
class CustomHBaseResultToStringConverter extends Converter[Any, String] {
  override def convert(obj: Any): String = {
    val result = obj.asInstanceOf[Result]
    // One "family:qualifier:value" entry per cell, joined with ';'.
    result.rawCells().map { cell =>
      List(
        Bytes.toString(CellUtil.cloneFamily(cell)),
        Bytes.toString(CellUtil.cloneQualifier(cell)),
        Bytes.toString(CellUtil.cloneValue(cell))
      ).mkString(":")
    }.mkString(";")
  }
}
I recommend using different delimiters (in place of ":" and ";") if your
data can contain those characters, because otherwise the output string
cannot be split back into cells reliably. I am not a seasoned Scala
programmer, so there may be a more flexible solution, for example making
the delimiters configurable.
I will try to open a PR, probably later today.
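
To illustrate what configurable delimiters could look like, here is a rough
sketch. It is not the code I plan to submit. CellFormatter, fieldSep and
cellSep are hypothetical names, and the cells are passed as plain string
triples so the example runs without HBase on the classpath:

// Hypothetical helper, not part of Spark or HBase.
object CellFormatter {
  // Joins (family, qualifier, value) triples with caller-supplied delimiters.
  // The defaults reproduce the "cf:q:v;cf:q:v" layout of the converter above.
  def format(
      cells: Seq[(String, String, String)],
      fieldSep: String = ":",
      cellSep: String = ";"): String =
    cells
      .map { case (family, qualifier, value) =>
        Seq(family, qualifier, value).mkString(fieldSep)
      }
      .mkString(cellSep)

  def main(args: Array[String]): Unit = {
    // The first value contains ':' and the second contains ';', so the
    // default delimiters would be ambiguous for this data.
    val cells = Seq(("cf1", "q1", "10:30"), ("cf2", "q2", "a;b"))
    println(format(cells, fieldSep = "\t", cellSep = "|"))
  }
}

In the converter itself, the triples would come from the CellUtil calls
shown above. As far as I can tell, Spark creates the converter from its
class name alone, so the delimiters probably cannot be passed as
constructor arguments and would have to be supplied some other way.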