[
https://issues.apache.org/jira/browse/HBASE-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016223#comment-13016223
]
stack commented on HBASE-451:
-----------------------------
Warning: this is a big one, Subbu.
So, I used to think that storing the schema in zk was the way to go, but Ryan
argues, correctly I believe, that zk should only carry transient data. If for
no other reason, that way we can copy the hdfs content alone and bring the data
up elsewhere under another cluster (otherwise, we'd have to copy both hdfs and
zk state to replicate cluster data elsewhere, which is a pain).

So, we could write the schema into a table or into hdfs. For a table, we could
add a new catalog table named schemas or, as I believe Andrew Purtell
suggested, put the schema into a new column family in the .META. table, in the
first region only. For hdfs, we could write a .tabledescriptor file, as we now
write the .regioninfo file under each region.

On startup, I'd think that we'd read hdfs or the schema table and then, per
table, add a znode up in zk. On each table edit, we'd update that table's
znode. All regionservers would be watching these zk table proxies and would
know to reread the schema when a watcher triggers.
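
To make that flow concrete, here is a rough sketch as a small in-memory Python
model. It is not HBase code and calls no real ZooKeeper or HBase API: every
class and method name below (DurableSchemaStore, TransientNotifier, Master,
RegionServer, alter_table and so on) is made up for illustration. The store
stands in for hdfs or a schema table, and the notifier stands in for the
per-table znodes. It assumes the master writes the durable copy first and only
then touches the znode.

```python
import copy


class DurableSchemaStore:
    """Hypothetical stand-in for hdfs (.tabledescriptor) or a schema table."""

    def __init__(self):
        self._schemas = {}

    def write(self, table, schema):
        self._schemas[table] = copy.deepcopy(schema)

    def read(self, table):
        return copy.deepcopy(self._schemas[table])

    def tables(self):
        return list(self._schemas)


class TransientNotifier:
    """Hypothetical stand-in for per-table znodes.

    It holds no schema, only a change counter and watchers, so losing it
    loses nothing that cannot be rebuilt from the durable store.
    Simplification: real ZooKeeper watches fire once and must be re-set;
    the callbacks here stay registered.
    """

    def __init__(self):
        self._versions = {}
        self._watchers = {}

    def create_node(self, table):
        self._versions.setdefault(table, 0)
        self._watchers.setdefault(table, [])

    def watch(self, table, callback):
        self.create_node(table)
        self._watchers[table].append(callback)

    def touch(self, table):
        self.create_node(table)
        self._versions[table] += 1
        for callback in list(self._watchers[table]):
            callback(table)


class Master:
    def __init__(self, store, notifier):
        self._store = store
        self._notifier = notifier

    def startup(self):
        # Rebuild the transient per-table nodes from durable state.
        for table in self._store.tables():
            self._notifier.create_node(table)

    def alter_table(self, table, schema):
        self._store.write(table, schema)  # durable copy first
        self._notifier.touch(table)       # then signal the watchers


class RegionServer:
    def __init__(self, store, notifier):
        self._store = store
        self.schemas = {}  # one cached schema per table, not per region
        self.rereads = 0
        # Tables created after this point are out of scope for the sketch.
        for table in store.tables():
            self.schemas[table] = store.read(table)
            notifier.watch(table, self._on_change)

    def _on_change(self, table):
        # Watch triggered: reread the schema from the durable store.
        self.schemas[table] = self._store.read(table)
        self.rereads += 1


store = DurableSchemaStore()
store.write("t1", {"families": ["info"]})
notifier = TransientNotifier()
master = Master(store, notifier)
master.startup()
regionserver = RegionServer(store, notifier)
master.alter_table("t1", {"families": ["info", "extra"]})
print(regionserver.schemas["t1"])
```

The point of the split is that the notifier can be thrown away and rebuilt
from the store at startup, which is what lets us copy hdfs alone to another
cluster.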
On JSON serializing, yeah, that'd be sweet but might be a bit much to bite off
as part of this issue. Maybe just go w/ Writables until it's all running? Then
open a new issue to add serialization of types to JSON (Todd mentions that if
we avro'd this stuff, we could make use of an avro-to-json gateway that avro
apparently has).
> Remove HTableDescriptor from HRegionInfo
> ----------------------------------------
>
> Key: HBASE-451
> URL: https://issues.apache.org/jira/browse/HBASE-451
> Project: HBase
> Issue Type: Improvement
> Components: master, regionserver
> Affects Versions: 0.2.0
> Reporter: Jim Kellerman
> Priority: Critical
> Fix For: 0.92.0
>
>
> There is an HRegionInfo for every region in HBase. Currently HRegionInfo also
> contains the HTableDescriptor (the schema). That means we store the schema n
> times where n is the number of regions in the table.
> Additionally, for every region of the same table that the region server has
> open, there is a copy of the schema. Thus it is stored in memory once for
> each open region.
> If HRegionInfo merely contained the table name the HTableDescriptor could be
> stored in a separate file and easily found.