This can be an open-ended question, but I will try to explain how I see things.
Blur was designed to house many tables, all with potentially different schemas and types. These tables can be vastly different in size, whether in record count, record size, record grouping (aka Rows), or content.

As for the family concept inside of Blur, at this point it really isn't required. Originally it was in place to support the row query (aka join query); it made it easier to separate data types. One of the original use cases of Blur was to take a snowflake-type schema (https://en.wikipedia.org/wiki/Snowflake_schema) in an RDBMS, join it together through a series of MR jobs into a star-like schema (https://en.wikipedia.org/wiki/Star_schema), and then load that data into Blur to execute queries. Each of the data marts in the star schema naturally fit into a family in a single table in Blur. Then the rowid in Blur was used to join the records across families (data mart tables); there is a small sketch at the bottom of this message that illustrates the idea. So we used Blur's table as a collection of data mart tables and we called it a super mart :-). Most people coming from Lucene, Solr, or ES don't have this use case.

In the next couple of days I will be posting a roadmap for at least 0.3.0 and then 0.4.0. In those you will see that we are going to try to decouple the legacy data model (see above) from the index serving and execution engine in Blur. Basically it should allow people to still operate the legacy data model, use a pure document model, or create their own type of system.

Hope this answers your question.

Aaron

On Thu, Jul 2, 2015 at 11:49 AM, rahul challapalli <[email protected]> wrote:
> Hi,
>
> Since I am not an end-user, I never really understood the advantages of a
> table based data model in blur. What are its advantages with respect to
> something similar in solr or es?
>
> - Rahul
>
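To make the family/rowid idea above concrete, here is a rough conceptual sketch. The classes and names below are made up for illustration only (they are not the Blur Thrift API); they just show how one row, keyed by its rowid, holds records from several families, so a row (join) query can be answered inside a single row rather than across tables:

    // Conceptual sketch only -- made-up classes, not the actual Blur Thrift API.
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    class Record {
        final String recordId;
        final String family;               // e.g. "customer", "orders" (former data mart tables)
        final Map<String, String> columns; // column name -> value
        Record(String recordId, String family, Map<String, String> columns) {
            this.recordId = recordId;
            this.family = family;
            this.columns = columns;
        }
    }

    class Row {
        final String rowId;                // the join key shared by all records in the row
        final List<Record> records = new ArrayList<>();
        Row(String rowId) { this.rowId = rowId; }
    }

    public class FamilyJoinSketch {
        public static void main(String[] args) {
            // One row holding records from two families that used to be separate tables.
            Row row = new Row("customer-1234");
            row.records.add(new Record("r1", "customer",
                    Map.of("name", "Jane Doe", "state", "TX")));
            row.records.add(new Record("r2", "orders",
                    Map.of("orderId", "A-99", "total", "42.50")));
            row.records.add(new Record("r3", "orders",
                    Map.of("orderId", "B-07", "total", "13.75")));

            // A row (join) query such as "customers in TX with an order over $40"
            // is answered entirely within this one row.
            boolean inTexas = row.records.stream().anyMatch(r ->
                    r.family.equals("customer") && "TX".equals(r.columns.get("state")));
            boolean bigOrder = row.records.stream().anyMatch(r ->
                    r.family.equals("orders")
                            && Double.parseDouble(r.columns.getOrDefault("total", "0")) > 40);
            System.out.println(row.rowId + " matches: " + (inTexas && bigOrder));
        }
    }

In Blur itself the matching happens inside Lucene on the shard servers rather than in client code like this, but the shape of the data is the same idea: the rowid is the join key and the family tells you which "data mart table" a record came from.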
