Repository: incubator-griffin-site
Updated Branches:
  refs/heads/asf-site b440ad97f -> 75036365e


Site updated: 2017-05-17 09:11:02


Project: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/commit/75036365
Tree: 
http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/tree/75036365
Diff: 
http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/diff/75036365

Branch: refs/heads/asf-site
Commit: 75036365eaf1039792473408574d18cefa477c65
Parents: b440ad9
Author: guoyp <[email protected]>
Authored: Wed May 17 09:11:02 2017 +0800
Committer: guoyp <[email protected]>
Committed: Wed May 17 09:11:02 2017 +0800

----------------------------------------------------------------------
 2017/03/03/plan/index.html      |   2 +-
 2017/03/04/community/index.html |   2 +-
 2017/03/30/home/index.html      |  13 ++++++++-----
 images/arch.png                 | Bin 0 -> 307285 bytes
 images/techstack.png            | Bin 0 -> 127993 bytes
 index.html                      |  13 +++++++------
 6 files changed, 17 insertions(+), 13 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/75036365/2017/03/03/plan/index.html
----------------------------------------------------------------------
diff --git a/2017/03/03/plan/index.html b/2017/03/03/plan/index.html
index b8d7cc1..82b1d8c 100644
--- a/2017/03/03/plan/index.html
+++ b/2017/03/03/plan/index.html
@@ -237,7 +237,7 @@ profiling target data asset, providing statistics by 
differen">
       
     </div>
     <footer class="article-footer">
-      <a data-url="http://yoursite.com/2017/03/03/plan/" 
data-id="cj1x9wwuy0002y0pot1in4xz2" class="article-share-link">Partager</a>
+      <a data-url="http://yoursite.com/2017/03/03/plan/" 
data-id="cj2sajr130002i2po3fai9oe8" class="article-share-link">Partager</a>
       
       
     </footer>

http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/75036365/2017/03/04/community/index.html
----------------------------------------------------------------------
diff --git a/2017/03/04/community/index.html b/2017/03/04/community/index.html
index c01f5c8..cbb5a05 100644
--- a/2017/03/04/community/index.html
+++ b/2017/03/04/community/index.html
@@ -123,7 +123,7 @@ Wikihttps://cwiki.apache.org/confluence/display/GRIFFIN/G">
       
     </div>
     <footer class="article-footer">
-      <a data-url="http://yoursite.com/2017/03/04/community/" 
data-id="cj1x9wwus0000y0pois748bng" class="article-share-link">Partager</a>
+      <a data-url="http://yoursite.com/2017/03/04/community/" 
data-id="cj2sajr0y0000i2powdn8gg1f" class="article-share-link">Partager</a>
       
       
     </footer>

http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/75036365/2017/03/30/home/index.html
----------------------------------------------------------------------
diff --git a/2017/03/30/home/index.html b/2017/03/30/home/index.html
index 998dd32..4326eb2 100644
--- a/2017/03/30/home/index.html
+++ b/2017/03/30/home/index.html
@@ -12,7 +12,9 @@
 <meta property="og:site_name" content="Apache Griffin">
 <meta property="og:description" content="AbstractApache Griffin is a Data 
Quality Service platform built on Apache Hadoop and Apache Spark. It provides a 
framework process for defining data quality model, executing data quality 
measurement,">
 <meta property="og:image" 
content="http://yoursite.com/images/Business_Process.png";>
-<meta property="og:updated_time" content="2017-04-21T03:08:14.000Z">
+<meta property="og:image" content="http://yoursite.com/images/arch.png";>
+<meta property="og:image" content="http://yoursite.com/images/techstack.png";>
+<meta property="og:updated_time" content="2017-05-17T01:03:37.000Z">
 <meta name="twitter:card" content="summary">
 <meta name="twitter:title" content="Apache Griffin">
 <meta name="twitter:description" content="AbstractApache Griffin is a Data 
Quality Service platform built on Apache Hadoop and Apache Spark. It provides a 
framework process for defining data quality model, executing data quality 
measurement,">
@@ -92,7 +94,7 @@
     <div class="article-entry" itemprop="articleBody">
       
         <h2 id="Abstract"><a href="#Abstract" class="headerlink" 
title="Abstract"></a>Abstract</h2><p>Apache Griffin is a Data Quality Service 
platform built on Apache Hadoop and Apache Spark. It provides a framework 
process for defining data quality model, executing data quality measurement, 
automating data profiling and validation, as well as a unified data quality 
visualization across multiple data systems.  It tries to address the data 
quality challenges in big data and streaming context.</p>
-<h2 id="Overview-of-Apache-Griffin"><a href="#Overview-of-Apache-Griffin" 
class="headerlink" title="Overview of Apache Griffin"></a>Overview of Apache 
Griffin</h2><p>At eBay, when people use big data (Hadoop or other streaming 
systems), measurement of data quality is a big challenge. Different teams have 
built customized tools to detect and analyze data quality issues within their 
own domains. As a platform organization, we think of taking a platform approach 
to commonly occurring patterns. As such, we are building a platform to provide 
shared Infrastructure and generic features to solve common data quality pain 
points. This would enable us to build trusted data assets.</p>
+<h2 id="Overview-of-Apache-Griffin"><a href="#Overview-of-Apache-Griffin" 
class="headerlink" title="Overview of Apache Griffin"></a>Overview of Apache 
Griffin</h2><p>When people use big data (Hadoop or other streaming systems), 
measurement of data quality is a big challenge. Different teams have built 
customized tools to detect and analyze data quality issues within their own 
domains. As a platform organization, we think of taking a platform approach to 
commonly occurring patterns. As such, we are building a platform to provide 
shared Infrastructure and generic features to solve common data quality pain 
points. This would enable us to build trusted data assets.</p>
 <p>Currently it is very difficult and costly to do data quality validation 
when we have large volumes of related data flowing across multi-platforms 
(streaming and batch). Take eBay’s Real-time Personalization Platform as a 
sample; Everyday we have to validate the data quality for ~600M records. Data 
quality often becomes one big challenge in this complex environment and massive 
scale.</p>
 <p>We detect the following at eBay:</p>
 <ol>
@@ -120,8 +122,9 @@
 <p>For near real time analysis, we consume data from messaging system, then 
our data quality model will compute our real time data quality metrics in our 
spark cluster. for data storage, we use time series database in our back end to 
fulfill front end request.</p>
 <p><strong>Apache Griffin Service</strong>:</p>
 <p>We have RESTful web services to accomplish all the functionalities of 
Apache Griffin, such as register data-set, create data quality model, publish 
metrics, retrieve metrics, add subscription, etc. So, the developers can 
develop their own user interface based on these web serivces.</p>
-<h2 id="Main-business-process"><a href="#Main-business-process" 
class="headerlink" title="Main business process"></a>Main business 
process</h2><p>Here’s the business process diagram</p>
-<p><img src="/images/Business_Process.png" alt=""></p>
+<h2 id="Main-business-process"><a href="#Main-business-process" 
class="headerlink" title="Main business process"></a>Main business 
process</h2><p><img src="/images/Business_Process.png" alt=""></p>
+<h2 id="Architecture-diagram"><a href="#Architecture-diagram" 
class="headerlink" title="Architecture diagram"></a>Architecture 
diagram</h2><p><img src="/images/arch.png" alt=""></p>
+<h2 id="Tech-stack"><a href="#Tech-stack" class="headerlink" title="Tech 
stack"></a>Tech stack</h2><p><img src="/images/techstack.png" alt=""></p>
 <h2 id="Rationale"><a href="#Rationale" class="headerlink" 
title="Rationale"></a>Rationale</h2><p>The challenge we face at eBay is that 
our data volume is becoming bigger and bigger, systems process become more 
complex, while we do not have a unified data quality solution to ensure the 
trusted data sets which provide confidences on data quality to our data 
consumers.  The key challenges on data quality includes:</p>
 <ol>
 <li>Existing commercial data quality solution cannot address data quality 
lineage among systems, cannot scale out to support fast growing data at 
eBay</li>
@@ -143,7 +146,7 @@
       
     </div>
     <footer class="article-footer">
-      <a data-url="http://yoursite.com/2017/03/30/home/" 
data-id="cj1x9wwuw0001y0pop82l43c9" class="article-share-link">Partager</a>
+      <a data-url="http://yoursite.com/2017/03/30/home/" 
data-id="cj2sajr110001i2poo3d8pmqd" class="article-share-link">Partager</a>
       
       
     </footer>

http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/75036365/images/arch.png
----------------------------------------------------------------------
diff --git a/images/arch.png b/images/arch.png
new file mode 100644
index 0000000..93bc755
Binary files /dev/null and b/images/arch.png differ

http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/75036365/images/techstack.png
----------------------------------------------------------------------
diff --git a/images/techstack.png b/images/techstack.png
new file mode 100644
index 0000000..ebc5540
Binary files /dev/null and b/images/techstack.png differ

http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/75036365/index.html
----------------------------------------------------------------------
diff --git a/index.html b/index.html
index 99e37e2..958796d 100644
--- a/index.html
+++ b/index.html
@@ -88,7 +88,7 @@
     <div class="article-entry" itemprop="articleBody">
       
         <h2 id="Abstract"><a href="#Abstract" class="headerlink" 
title="Abstract"></a>Abstract</h2><p>Apache Griffin is a Data Quality Service 
platform built on Apache Hadoop and Apache Spark. It provides a framework 
process for defining data quality model, executing data quality measurement, 
automating data profiling and validation, as well as a unified data quality 
visualization across multiple data systems.  It tries to address the data 
quality challenges in big data and streaming context.</p>
-<h2 id="Overview-of-Apache-Griffin"><a href="#Overview-of-Apache-Griffin" 
class="headerlink" title="Overview of Apache Griffin"></a>Overview of Apache 
Griffin</h2><p>At eBay, when people use big data (Hadoop or other streaming 
systems), measurement of data quality is a big challenge. Different teams have 
built customized tools to detect and analyze data quality issues within their 
own domains. As a platform organization, we think of taking a platform approach 
to commonly occurring patterns. As such, we are building a platform to provide 
shared Infrastructure and generic features to solve common data quality pain 
points. This would enable us to build trusted data assets.</p>
+<h2 id="Overview-of-Apache-Griffin"><a href="#Overview-of-Apache-Griffin" 
class="headerlink" title="Overview of Apache Griffin"></a>Overview of Apache 
Griffin</h2><p>When people use big data (Hadoop or other streaming systems), 
measurement of data quality is a big challenge. Different teams have built 
customized tools to detect and analyze data quality issues within their own 
domains. As a platform organization, we think of taking a platform approach to 
commonly occurring patterns. As such, we are building a platform to provide 
shared Infrastructure and generic features to solve common data quality pain 
points. This would enable us to build trusted data assets.</p>
 <p>Currently it is very difficult and costly to do data quality validation 
when we have large volumes of related data flowing across multi-platforms 
(streaming and batch). Take eBay’s Real-time Personalization Platform as a 
sample; Everyday we have to validate the data quality for ~600M records. Data 
quality often becomes one big challenge in this complex environment and massive 
scale.</p>
 <p>We detect the following at eBay:</p>
 <ol>
@@ -116,8 +116,9 @@
 <p>For near real time analysis, we consume data from messaging system, then 
our data quality model will compute our real time data quality metrics in our 
spark cluster. for data storage, we use time series database in our back end to 
fulfill front end request.</p>
 <p><strong>Apache Griffin Service</strong>:</p>
 <p>We have RESTful web services to accomplish all the functionalities of 
Apache Griffin, such as register data-set, create data quality model, publish 
metrics, retrieve metrics, add subscription, etc. So, the developers can 
develop their own user interface based on these web serivces.</p>
-<h2 id="Main-business-process"><a href="#Main-business-process" 
class="headerlink" title="Main business process"></a>Main business 
process</h2><p>Here’s the business process diagram</p>
-<p><img src="/images/Business_Process.png" alt=""></p>
+<h2 id="Main-business-process"><a href="#Main-business-process" 
class="headerlink" title="Main business process"></a>Main business 
process</h2><p><img src="/images/Business_Process.png" alt=""></p>
+<h2 id="Architecture-diagram"><a href="#Architecture-diagram" 
class="headerlink" title="Architecture diagram"></a>Architecture 
diagram</h2><p><img src="/images/arch.png" alt=""></p>
+<h2 id="Tech-stack"><a href="#Tech-stack" class="headerlink" title="Tech 
stack"></a>Tech stack</h2><p><img src="/images/techstack.png" alt=""></p>
 <h2 id="Rationale"><a href="#Rationale" class="headerlink" 
title="Rationale"></a>Rationale</h2><p>The challenge we face at eBay is that 
our data volume is becoming bigger and bigger, systems process become more 
complex, while we do not have a unified data quality solution to ensure the 
trusted data sets which provide confidences on data quality to our data 
consumers.  The key challenges on data quality includes:</p>
 <ol>
 <li>Existing commercial data quality solution cannot address data quality 
lineage among systems, cannot scale out to support fast growing data at 
eBay</li>
@@ -139,7 +140,7 @@
       
     </div>
     <footer class="article-footer">
-      <a data-url="http://yoursite.com/2017/03/30/home/" 
data-id="cj1x9wwuw0001y0pop82l43c9" class="article-share-link">Partager</a>
+      <a data-url="http://yoursite.com/2017/03/30/home/" 
data-id="cj2sajr110001i2poo3d8pmqd" class="article-share-link">Partager</a>
       
       
     </footer>
@@ -193,7 +194,7 @@
       
     </div>
     <footer class="article-footer">
-      <a data-url="http://yoursite.com/2017/03/04/community/" 
data-id="cj1x9wwus0000y0pois748bng" class="article-share-link">Partager</a>
+      <a data-url="http://yoursite.com/2017/03/04/community/" 
data-id="cj2sajr0y0000i2powdn8gg1f" class="article-share-link">Partager</a>
       
       
     </footer>
@@ -322,7 +323,7 @@
       
     </div>
     <footer class="article-footer">
-      <a data-url="http://yoursite.com/2017/03/03/plan/" 
data-id="cj1x9wwuy0002y0pot1in4xz2" class="article-share-link">Partager</a>
+      <a data-url="http://yoursite.com/2017/03/03/plan/" 
data-id="cj2sajr130002i2po3fai9oe8" class="article-share-link">Partager</a>
       
       
     </footer>
