I don't think this is a good example.
Look at the difference between the two physical schemas for the same
logical data model: a relational database uses relationship tables on
an RDBMS, while BigTable uses a list of column qualifiers.
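
As a rough sketch of that difference, here is the same one-to-many
relationship (a page and its outgoing links) stored both ways. This uses
plain Java collections, not the HBase or BigTable API. The "outlinks:"
family, the page_link table, and the sample URLs are all made up for
illustration and are not part of the schema quoted below.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: plain collections, not the HBase or BigTable API.
class SchemaSketch {

    // RDBMS style: a relationship table with one (source, target) row per link.
    static final class LinkRow {
        final String sourceUrl;
        final String targetUrl;

        LinkRow(String sourceUrl, String targetUrl) {
            this.sourceUrl = sourceUrl;
            this.targetUrl = targetUrl;
        }
    }

    // Roughly: SELECT target_url FROM page_link WHERE source_url = ?
    static List<String> outlinksFromRelationTable(List<LinkRow> pageLink, String sourceUrl) {
        List<String> targets = new ArrayList<>();
        for (LinkRow row : pageLink) {
            if (row.sourceUrl.equals(sourceUrl)) {
                targets.add(row.targetUrl);
            }
        }
        return targets;
    }

    // BigTable style: one row per page; every link is a column qualifier
    // under a hypothetical "outlinks:" family, so no second table is needed.
    static List<String> outlinksFromQualifiers(Map<String, TreeMap<String, String>> table,
                                               String rowKey) {
        String family = "outlinks:";
        List<String> targets = new ArrayList<>();
        TreeMap<String, String> row = table.get(rowKey);
        if (row == null) {
            return targets;
        }
        // Columns are kept sorted, so one family's qualifiers are contiguous.
        for (String column : row.tailMap(family).keySet()) {
            if (!column.startsWith(family)) {
                break;
            }
            targets.add(column.substring(family.length()));
        }
        return targets;
    }

    public static void main(String[] args) {
        String page = "http://example.com/";

        List<LinkRow> pageLink = new ArrayList<>();
        pageLink.add(new LinkRow(page, "http://example.com/a"));
        pageLink.add(new LinkRow(page, "http://example.com/b"));

        TreeMap<String, String> row = new TreeMap<>();
        row.put("title:", "Example");
        row.put("outlinks:http://example.com/a", "");
        row.put("outlinks:http://example.com/b", "");
        Map<String, TreeMap<String, String>> webContent = new TreeMap<>();
        webContent.put(page, row);

        System.out.println(outlinksFromRelationTable(pageLink, page));
        System.out.println(outlinksFromQualifiers(webContent, page));
    }
}
```

Both calls return the same two links. In the first case they come from
scanning a separate table; in the second they are read from the column
names of a single row.
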
On Fri, Mar 28, 2008 at 2:28 PM, Goel, Ankur <[EMAIL PROTECTED]> wrote:
> Hi Bryan,
> Here is the sample schema I have (looks closer to RDBMS, I know)
>
> TABLE: seed_list
>
> DESCRIPTION: Used to store seed URLs (both old and newly discovered).
> Initially populated with some seed URLs. The crawl controller
> picks up the seeds from this table that have status=0 (Not Visited)
> or status=2 (Visited, but ready for re-crawl) and feeds these seeds
> in batches to the different crawl engines that it knows about.
>
> SCHEMA: Column families below
>
> {"referer_id:", "100"}, // Integer here is Max_Length
> {"url:","1500"},
> {"site:","500"},
> {"last_crawl_date:", "1000"},
> {"next_crawl_date:", "1000"},
> {"create_date:","100"},
> {"status:","100"},
> {"strike:", "100"},
> {"language:","150"},
> {"topic:","500"},
> {"depth:","100000"}
>
> Common attributes are [max versions: 1, compression: NONE, in memory:
> false, block cache enabled: true, max length: 100, bloom filter: none]
>
>
> TABLE: web_content
>
> DESCRIPTION: Used to store information retrieved after crawling a URL.
> Each crawl engine provides information about the URL it crawled.
> This information is then stored in this table depending upon
> the profile settings (what should be stored?)
> SCHEMA: Column families below
>
> {"url:", "1500"},
> {"site:","500"},
> {"content_type:","100"},
> {"title:", "1000"},
> {"content:", Integer.MAX_VALUE + ""},
> {"parsed_text:",Integer.MAX_VALUE + ""},
> {"crawl_date:", "1000"},
> {"last_modified_date:","100"},
> {"http_headers:","10000"},
> {"content_length:","11"},
> {"outlinks_count:","100000"}
>
> Common attributes are [max versions: 1, compression: BLOCK, in memory:
> false, block cache enabled: true, max length: 100, bloom filter: none]
>
> Please feel free to suggest modifications/enhancements for a
> column-oriented design.
>
> Thanks
> -Ankur
>
>
> -----Original Message-----
> From: Bryan Duxbury [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 28, 2008 10:33 AM
> To: [email protected]
> Subject: HBase Sample Schemas
>
> All,
>
> One of the more common types of questions we get from people new to
> HBase is about the differences in the schema between HBase and
> relational databases. So that we can generate some good examples of
> RDBMS schemas and their counterparts as they might be represented in
> HBase, could you guys post some small (1-5 entities) schemas that you
> might be interested in using and a few sentences about how you'd like to
> consume them? We can then discuss possible options and see how things
> might look. This will also help Stack, Jim, and myself to notice
> interesting access patterns we might want to support.
>
> Thanks in advance,
>
> Bryan
>
--
B. Regards,
Edward J. Yoon