jiang hehui created HADOOP-13304:
------------------------------------

             Summary: Distributed database for storage, MapReduce-style model for compute
                 Key: HADOOP-13304
                 URL: https://issues.apache.org/jira/browse/HADOOP-13304
             Project: Hadoop Common
          Issue Type: New Feature
          Components: fs
    Affects Versions: 2.6.4
            Reporter: jiang hehui


In Hadoop, HDFS is responsible for storage and MapReduce is responsible for computation.
My idea is to store the data in a distributed database instead, and to compute over it in a MapReduce-like way.

!http://images2015.cnblogs.com/blog/439702/201606/439702-20160621124133334-32823985.png!

* insert:
using two-phase commit, route the insert to the node(s) chosen by the split policy and execute it there (see the routing sketch at the end of this description)

* delete:
using two-phase commit, route the delete to the node(s) chosen by the split policy and execute it there

* update:
using two-phase commit, according to the split policy: if the record's node does not change, just execute the update on that node; if the record's node changes, first delete the old value on the source node, then insert the new value on the destination node.
* select:
** a simple select (the data lives on a single node, or no fusion of data across nodes is needed) works just like on a standalone database server;
** a complex select (distinct, group by, order by, sub-queries, joins across multiple nodes) is what we call a job
{panel}
{color:red}A job is parsed into stages; stages have lineage, and all the stages of a job form a DAG (Directed Acyclic Graph). Every stage consists of a mapsql, a shuffle and a reducesql.
When a SQL request is received, an execution plan is generated from the metadata. The plan contains the DAG, i.e. the stages and the mapsql, shuffle and reducesql of each stage; the plan is then executed and the result is returned to the client. A sketch of such a plan follows below.

The analogy with Spark: an RDD is a table, a job is a job.
The analogy with MapReduce in Hadoop: mapsql is map, shuffle is shuffle, reducesql is reduce.
{color}
{panel}
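
To make the proposed plan structure concrete, here is a minimal sketch in Java. Every name in it (Stage, ExecutionPlan, the fields, the example SQL strings) is hypothetical and only illustrates the stage/DAG idea described above; it is not an existing Hadoop API.

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plan model: a job is a DAG of stages, each stage carrying
 *  a mapsql (pushed to every data node), a shuffle key and a reducesql. */
class Stage {
    final String id;
    final String mapSql;     // SQL executed on every data node
    final String shuffleKey; // column(s) the intermediate rows are repartitioned on
    final String reduceSql;  // SQL executed after the shuffle
    final List<Stage> parents = new ArrayList<>(); // lineage: stages this one depends on

    Stage(String id, String mapSql, String shuffleKey, String reduceSql) {
        this.id = id;
        this.mapSql = mapSql;
        this.shuffleKey = shuffleKey;
        this.reduceSql = reduceSql;
    }
}

class ExecutionPlan {
    private final List<Stage> stages = new ArrayList<>(); // kept in dependency order

    Stage addStage(Stage stage, Stage... parents) {
        for (Stage p : parents) {
            stage.parents.add(p);
        }
        stages.add(stage);
        return stage;
    }

    /** Walk the stages in dependency order and print what each one would do. */
    void run() {
        for (Stage s : stages) {
            System.out.println("stage " + s.id + ": map [" + s.mapSql
                + "] shuffle on [" + s.shuffleKey + "] reduce [" + s.reduceSql + "]");
        }
    }

    public static void main(String[] args) {
        // Example query: SELECT dept, COUNT(*) FROM emp GROUP BY dept ORDER BY 2 DESC
        ExecutionPlan plan = new ExecutionPlan();
        Stage groupBy = plan.addStage(new Stage("1",
            "SELECT dept, COUNT(*) AS c FROM emp GROUP BY dept",              // partial aggregate per node
            "dept",                                                           // repartition rows by dept
            "SELECT dept, SUM(c) AS c FROM <stage 1 shuffle input> GROUP BY dept")); // merge partial counts
        plan.addStage(new Stage("2",
            "SELECT dept, c FROM <stage 1 output>",
            "<single partition>",                                             // global ORDER BY needs one reducer
            "SELECT dept, c FROM <stage 2 shuffle input> ORDER BY c DESC"),
            groupBy);
        plan.run();
    }
}
{code}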



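For the insert/delete/update paths, here is a minimal routing sketch, assuming a hash-based split policy and a hypothetical NodeClient interface with prepare/commit/rollback; none of these types exist in Hadoop, they only illustrate the two-phase-commit idea.

{code:java}
import java.util.List;

/** Hypothetical sketch of two-phase-commit DML routing by split policy.
 *  NodeClient is an assumed interface to one data node, not a real Hadoop API. */
interface NodeClient {
    boolean prepare(String sql); // phase 1: the node executes the statement tentatively and votes
    void commit();               // phase 2: make the prepared change durable
    void rollback();             // phase 2 alternative: undo whatever was prepared
}

class DmlRouter {
    private final List<NodeClient> nodes;

    DmlRouter(List<NodeClient> nodes) {
        this.nodes = nodes;
    }

    /** Split policy (assumed here to be a simple hash): the record key picks the owning node. */
    NodeClient nodeFor(String recordKey) {
        return nodes.get(Math.floorMod(recordKey.hashCode(), nodes.size()));
    }

    /** insert touches only the owning node. */
    void insert(String recordKey, String insertSql) {
        runTwoPhase(List.of(nodeFor(recordKey)), List.of(insertSql));
    }

    /** delete likewise touches only the owning node. */
    void delete(String recordKey, String deleteSql) {
        runTwoPhase(List.of(nodeFor(recordKey)), List.of(deleteSql));
    }

    /** update may move a record between nodes when its key changes. */
    void update(String oldKey, String newKey,
                String updateSql, String deleteSql, String insertSql) {
        NodeClient src = nodeFor(oldKey);
        NodeClient dst = nodeFor(newKey);
        if (src == dst) {
            // the record stays on the same node: plain in-place update
            runTwoPhase(List.of(src), List.of(updateSql));
        } else {
            // the record moves: delete on the source node, insert on the destination node
            runTwoPhase(List.of(src, dst), List.of(deleteSql, insertSql));
        }
    }

    private void runTwoPhase(List<NodeClient> participants, List<String> sqls) {
        boolean allPrepared = true;
        for (int i = 0; i < participants.size(); i++) {
            allPrepared &= participants.get(i).prepare(sqls.get(i)); // phase 1: collect votes
        }
        for (NodeClient n : participants) {
            if (allPrepared) n.commit(); else n.rollback();          // phase 2: unanimous decision
        }
    }
}
{code}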