[ https://issues.apache.org/jira/browse/HBASE-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174376#comment-15174376 ]
Zhan Zhang commented on HBASE-14801:
------------------------------------
The purpose of this patch is to change the HBase catalog definition to be JSON
based. With this change, the catalog is more formalized, less error prone, and
easier to extend for future features, for example write support, customized
serdes, and Avro support.
For example, the following is the new format for the HBase catalog:
def writeCatalog = s"""{
|"table":{"namespace":"default", "name":"table1"},
|"rowkey":"key",
|"columns":{
|"col0":{"cf":"rowkey", "col":"key", "type":"string"},
|"col1":{"cf":"cf1", "col":"col1", "type":"string"},
|"col2":{"cf":"cf2", "col":"col2", "type":"double"},
|"col3":{"cf":"cf3", "col":"col3", "type":"float"},
|"col4":{"cf":"cf4", "col":"col4", "type":"int"},
|"col5":{"cf":"cf5", "col":"col5", "type":"bigint"}
|}
|}""".stripMargin
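Because the catalog is plain JSON, it could also be generated instead of
written by hand. The following is only a minimal sketch: `ColumnSpec` and
`buildCatalog` are hypothetical names that are not part of the connector, and
the helper assumes that no value contains characters that need JSON escaping.

```scala
// Hypothetical helper types; NOT part of the Spark-HBase connector.
case class ColumnSpec(name: String, cf: String, col: String, dataType: String)

// Builds a catalog string with the same layout as the example above.
// Assumption: values contain no quotes or backslashes (no JSON escaping is done).
def buildCatalog(namespace: String, table: String, rowkey: String,
                 columns: Seq[ColumnSpec]): String = {
  // One JSON entry per column, joined with commas.
  val cols = columns.map { c =>
    s""""${c.name}":{"cf":"${c.cf}", "col":"${c.col}", "type":"${c.dataType}"}"""
  }.mkString(",\n")
  s"""{
     |"table":{"namespace":"$namespace", "name":"$table"},
     |"rowkey":"$rowkey",
     |"columns":{
     |$cols
     |}
     |}""".stripMargin
}

// Usage: the first two columns of the example catalog.
val generated = buildCatalog("default", "table1", "key", Seq(
  ColumnSpec("col0", "rowkey", "key", "string"),
  ColumnSpec("col1", "cf1", "col1", "string")))
```

Generating the string this way keeps the braces balanced by construction,
which is the kind of error the JSON format is meant to make easier to avoid.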
Read:
def withCatalog(cat: String): DataFrame = {
  sqlContext
    .read
    .options(Map(HBaseTableCatalog.tableCatalog -> cat))
    .format("org.apache.hadoop.hbase.spark")
    .load()
}
val df = withCatalog(writeCatalog)
Write:
// newTable -> "5": create the HBase table with 5 regions before saving
sc.parallelize(data).toDF.write
  .options(Map(
    HBaseTableCatalog.tableCatalog -> writeCatalog,
    HBaseTableCatalog.newTable -> "5"))
  .format("org.apache.hadoop.hbase.spark")
  .save()
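The write example refers to a `data` collection that this comment does not
define. The following is a minimal sketch of what it might look like, assuming
a hypothetical case class `HBaseRecord` whose fields mirror the six catalog
columns. The class name, the `sampleRecord` helper, and the generated values
are illustrative and not taken from the patch.

```scala
// Hypothetical record type; field names and types mirror the catalog columns.
case class HBaseRecord(
  col0: String,  // row key, catalog type "string"
  col1: String,  // catalog type "string"
  col2: Double,  // catalog type "double"
  col3: Float,   // catalog type "float"
  col4: Int,     // catalog type "int"
  col5: Long)    // catalog type "bigint"

// Builds one record from an integer seed (illustrative values only).
def sampleRecord(i: Int): HBaseRecord =
  HBaseRecord(f"row$i%03d", s"value $i", i.toDouble, i.toFloat, i, i.toLong)

// 256 sample rows that could be passed to sc.parallelize(data).toDF
val data: Seq[HBaseRecord] = (0 to 255).map(sampleRecord)
```

Zero-padding the row key keeps the string keys in the same order as the
integer seeds, since HBase sorts row keys lexicographically.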
> Enhance the Spark-HBase connector catalog with json format
> ----------------------------------------------------------
>
> Key: HBASE-14801
> URL: https://issues.apache.org/jira/browse/HBASE-14801
> Project: HBase
> Issue Type: Sub-task
> Reporter: Zhan Zhang
> Assignee: Zhan Zhang
> Attachments: HBASE-14801-1.patch, HBASE-14801-2.patch,
> HBASE-14801-3.patch, HBASE-14801-4.patch, HBASE-14801-5.patch
>
>