This is an automated email from the ASF dual-hosted git repository.
vogievetsky pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/druid-website.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 5faeb4d6 Autobuild (#209)
5faeb4d6 is described below
commit 5faeb4d6e617843c775d264c3fa988bc47ef4c94
Author: Vadim Ogievetsky <[email protected]>
AuthorDate: Thu Feb 2 18:39:04 2023 -0800
Autobuild (#209)
---
community/index.html | 1 +
css/index.css | 288 ++++++++++++++++++++++++++++++++++-
img/graphical_ui_application_v2.png | Bin 0 -> 75591 bytes
img/ingestion_layer_stream_batch.png | Bin 0 -> 68344 bytes
img/scatter_gather_diagram.png | Bin 0 -> 204447 bytes
index.html | 153 +++++++++++++------
technology.html | 176 +++++----------------
7 files changed, 427 insertions(+), 191 deletions(-)
diff --git a/community/index.html b/community/index.html
index ac4a9fc2..63513a2b 100644
--- a/community/index.html
+++ b/community/index.html
@@ -147,6 +147,7 @@ discussion about project development.</li>
<ul>
<li><strong>User mailing list:</strong> <a
href="https://groups.google.com/forum/#!forum/druid-user">[email protected]</a>
for general
discussion, questions, and announcements.</li>
+<li><strong>LinkedIn:</strong> Connect with other Apache Druid professionals in
the <a href="https://www.linkedin.com/groups/8791983/">LinkedIn group</a>.</li>
<li><strong>Meetups:</strong> Check out <a
href="https://www.meetup.com/topics/apache-druid/">Apache Druid on
meetup.com</a> for links to regular
meetups in cities all over the world.</li>
<li><strong>StackOverflow:</strong> While the user mailing list is the primary
resource for asking questions, if you prefer
diff --git a/css/index.css b/css/index.css
index 6eb61a04..aa3733b9 100644
--- a/css/index.css
+++ b/css/index.css
@@ -34,7 +34,7 @@
font-weight: 600;
margin-top: 8px;
margin-bottom: 26px;
- max-width: 820px;
+ max-width: 910px;
margin-left: auto;
margin-right: auto; }
.druid-masthead b {
@@ -48,3 +48,289 @@
font-size: 1.4em; }
.druid-masthead .button {
min-width: 130px; } }
+
+@media (min-width: 992px) {
+ .card {
+ min-height: 294px; } }
+
+@media (min-width: 1200px) {
+ .card {
+ min-height: 230px; } }
+
+.key-druid-features {
+ padding-top: 25px; }
+
+.key-druid-features * p {
+ margin-top: 5px !important; }
+
+.key-druid-features * .card-margin {
+ margin-bottom: 1.875rem; }
+
+.key-druid-features * .card {
+ border: 0;
+ box-shadow: 0px 0px 10px 0px rgba(82, 63, 105, 0.1);
+ -webkit-box-shadow: 0px 0px 10px 0px rgba(82, 63, 105, 0.1);
+ -moz-box-shadow: 0px 0px 10px 0px rgba(82, 63, 105, 0.1);
+ -ms-box-shadow: 0px 0px 10px 0px rgba(82, 63, 105, 0.1);
+ position: relative;
+ display: flex;
+ flex-direction: column;
+ min-width: 0;
+ word-wrap: break-word;
+ background-color: #ffffff;
+ background-clip: border-box;
+ border: 1px solid #e6e4e9;
+ border-radius: 8px;
+ padding-left: 20px;
+ padding-bottom: 10px;
+ padding-right: 10px; }
+
+.key-druid-features * .card .card-header.no-border {
+ border: 0; }
+
+.key-druid-features * .card .card-header {
+ background: none;
+ font-weight: 500;
+ display: flex;
+ align-items: center;
+ min-height: 50px; }
+
+.key-druid-features * .card-header:first-child {
+ border-radius: calc(8px - 1px) calc(8px - 1px) 0 0; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper {
+ display: flex;
+ align-items: center; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-primary {
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ flex-direction: column;
+ background-color: #edf1fc;
+ width: 4rem;
+ height: 4rem;
+ border-radius: 50%; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-primary .widget-49-date-day {
+ color: #4e73e5;
+ font-weight: 500;
+ font-size: 1.5rem;
+ line-height: 1; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-primary .widget-49-date-month {
+ color: #4e73e5;
+ line-height: 1;
+ font-size: 1rem;
+ text-transform: uppercase; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-secondary {
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ flex-direction: column;
+ background-color: #fcfcfd;
+ width: 4rem;
+ height: 4rem;
+ border-radius: 50%; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-secondary .widget-49-date-day {
+ color: #dde1e9;
+ font-weight: 500;
+ font-size: 1.5rem;
+ line-height: 1; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-secondary .widget-49-date-month {
+ color: #dde1e9;
+ line-height: 1;
+ font-size: 1rem;
+ text-transform: uppercase; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-success {
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ flex-direction: column;
+ background-color: #e8faf8;
+ width: 4rem;
+ height: 4rem;
+ border-radius: 50%; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-success .widget-49-date-day {
+ color: #17d1bd;
+ font-weight: 500;
+ font-size: 1.5rem;
+ line-height: 1; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-success .widget-49-date-month {
+ color: #17d1bd;
+ line-height: 1;
+ font-size: 1rem;
+ text-transform: uppercase; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper .widget-49-date-info
{
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ flex-direction: column;
+ background-color: #ebf7ff;
+ width: 4rem;
+ height: 4rem;
+ border-radius: 50%; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper .widget-49-date-info
.widget-49-date-day {
+ color: #36afff;
+ font-weight: 500;
+ font-size: 1.5rem;
+ line-height: 1; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper .widget-49-date-info
.widget-49-date-month {
+ color: #36afff;
+ line-height: 1;
+ font-size: 1rem;
+ text-transform: uppercase; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-warning {
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ flex-direction: column;
+ background-color: floralwhite;
+ width: 4rem;
+ height: 4rem;
+ border-radius: 50%; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-warning .widget-49-date-day {
+ color: #FFC868;
+ font-weight: 500;
+ font-size: 1.5rem;
+ line-height: 1; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-warning .widget-49-date-month {
+ color: #FFC868;
+ line-height: 1;
+ font-size: 1rem;
+ text-transform: uppercase; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-danger {
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ flex-direction: column;
+ background-color: #feeeef;
+ width: 4rem;
+ height: 4rem;
+ border-radius: 50%; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-danger .widget-49-date-day {
+ color: #F95062;
+ font-weight: 500;
+ font-size: 1.5rem;
+ line-height: 1; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-danger .widget-49-date-month {
+ color: #F95062;
+ line-height: 1;
+ font-size: 1rem;
+ text-transform: uppercase; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-light {
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ flex-direction: column;
+ background-color: #fefeff;
+ width: 4rem;
+ height: 4rem;
+ border-radius: 50%; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-light .widget-49-date-day {
+ color: #f7f9fa;
+ font-weight: 500;
+ font-size: 1.5rem;
+ line-height: 1; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-date-light .widget-49-date-month {
+ color: #f7f9fa;
+ line-height: 1;
+ font-size: 1rem;
+ text-transform: uppercase; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper .widget-49-date-dark
{
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ flex-direction: column;
+ background-color: #ebedee;
+ width: 4rem;
+ height: 4rem;
+ border-radius: 50%; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper .widget-49-date-dark
.widget-49-date-day {
+ color: #394856;
+ font-weight: 500;
+ font-size: 1.5rem;
+ line-height: 1; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper .widget-49-date-dark
.widget-49-date-month {
+ color: #394856;
+ line-height: 1;
+ font-size: 1rem;
+ text-transform: uppercase; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper .widget-49-date-base
{
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ flex-direction: column;
+ background-color: #f0fafb;
+ width: 4rem;
+ height: 4rem;
+ border-radius: 50%; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper .widget-49-date-base
.widget-49-date-day {
+ color: #68CBD7;
+ font-weight: 500;
+ font-size: 1.5rem;
+ line-height: 1; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper .widget-49-date-base
.widget-49-date-month {
+ color: #68CBD7;
+ line-height: 1;
+ font-size: 1rem;
+ text-transform: uppercase; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-meeting-info {
+ display: flex;
+ flex-direction: column;
+ margin-left: 1rem; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-meeting-info .widget-49-pro-title {
+ color: #3c4142;
+ font-size: 14px; }
+
+.key-druid-features * .widget-49 .widget-49-title-wrapper
.widget-49-meeting-info .widget-49-meeting-time {
+ color: #B1BAC5;
+ font-size: 13px; }
+
+.key-druid-features * .widget-49 .widget-49-meeting-points {
+ font-weight: 400;
+ font-size: 13px;
+ margin-top: .5rem; }
+
+.key-druid-features * .widget-49 .widget-49-meeting-points
.widget-49-meeting-item {
+ display: list-item;
+ color: #727686; }
+
+.key-druid-features * .widget-49 .widget-49-meeting-points
.widget-49-meeting-item span {
+ margin-left: .5rem; }
+
+.key-druid-features * .widget-49 .widget-49-meeting-action {
+ text-align: right; }
+
+.key-druid-features * .widget-49 .widget-49-meeting-action a {
+ text-transform: uppercase; }
+
+sup {
+ vertical-align: super;
+ font-size: smaller; }
diff --git a/img/graphical_ui_application_v2.png
b/img/graphical_ui_application_v2.png
new file mode 100644
index 00000000..17079af8
Binary files /dev/null and b/img/graphical_ui_application_v2.png differ
diff --git a/img/ingestion_layer_stream_batch.png
b/img/ingestion_layer_stream_batch.png
new file mode 100644
index 00000000..da5e08c9
Binary files /dev/null and b/img/ingestion_layer_stream_batch.png differ
diff --git a/img/scatter_gather_diagram.png b/img/scatter_gather_diagram.png
new file mode 100644
index 00000000..4566dc55
Binary files /dev/null and b/img/scatter_gather_diagram.png differ
diff --git a/index.html b/index.html
index 39eafb0f..b3a509ff 100644
--- a/index.html
+++ b/index.html
@@ -121,7 +121,8 @@
<div class="container">
<div class="row">
<div class="text-center">
- <p class="lead">Apache Druid is a real-time database to power modern
analytics applications.</p>
+ <h1>Apache<sup>®</sup> Druid</h1>
+ <p class="lead">Druid is a high performance, real-time analytics database
that delivers sub-second queries on streaming and batch data at scale and under
load.</p>
<p>
<a class="button" href="/downloads.html"><span class="fa
fa-download"></span> Download</a>
<a class="button" href="/community/join-slack?v=1"><span class="fab
fa-slack"></span> Join Slack</a>
@@ -141,41 +142,95 @@
</h2>
<div class="features">
<div class="feature">
- <span class="fa fa-chart-line fa"></span>
- <h5>Build fast, modern data analytics applications</h5>
+ <span class="fa fa-bolt"></span>
+ <h5>Sub-second queries at any scale</h5>
<p>
- Druid is designed for <a href='/use-cases'>workflows</a> where
fast ad-hoc analytics, instant data visibility, or supporting high concurrency
is important. As such, Druid is often used to power UIs where an interactive,
consistent user experience is desired.
+ Execute OLAP queries in milliseconds on high-cardinality and
high-dimensional data sets with billions to trillions of rows without
pre-defining or caching queries in advance.
</p>
</div>
<div class="feature">
- <span class="fa fa-forward fa"></span>
- <h5>Easy integration with your existing data pipelines</h5>
+ <span class="fa fa-dollar-sign"></span>
+ <h5>High concurrency at the lowest cost </h5>
<p>
- Druid streams data from message buses such as <a
href='http://kafka.apache.org/'>Kafka</a>, and <a
href='https://aws.amazon.com/kinesis/'>Amazon Kinesis</a>, and batch load files
from data lakes such as <a
href='https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html'>HDFS</a>,
and <a href='https://aws.amazon.com/s3/'>Amazon S3</a>. Druid supports most
popular file formats for structured and semi-structured data.
+ Build real-time analytics applications that support 100s to
100,000s of queries per second at consistent performance with a highly efficient
architecture that uses less infrastructure than other databases.
</p>
</div>
<div class="feature">
- <span class="fa fa-lightbulb fa"></span>
- <h5>Fast, consistent queries at high concurrency</h5>
+ <span class="fa fa-chart-line"></span>
+ <h5>Real-time and historical insights</h5>
<p>
- Druid has been <a
href='https://imply.io/post/performance-benchmark-druid-presto-hive'>benchmarked</a>
to greatly outperform legacy solutions. Druid combines novel storage ideas,
indexing structures, and both exact and approximate queries to return most
results in under a second.
+ Unlock streaming data potential through Druid's native integration
with Apache Kafka and Amazon Kinesis as it supports query-on-arrival at
millions of events per second, low latency ingestion, and guaranteed
consistency.
</p>
</div>
- <div class="feature">
- <span class="fa fa-unlock fa"></span>
- <h5>Broad applicability</h5>
- <p>
- Druid <a href='/use-cases'>unlocks new types of queries and
workflows</a> for clickstream, APM, supply chain, network telemetry, digital
marketing, risk/fraud, and many other types of data. Druid is purpose built for
rapid, ad-hoc queries on both real-time and historical data.
- </p>
+ </div>
+
+ <h2>
+ Key Druid Features
+ </h2>
+
+ <div class="row key-druid-features">
+ <div class="col-md-4">
+ <div class="card card-margin">
+ <div class="card-header no-border">
+ <h5 class="card-title">Interactive Query Engine</h5>
+ </div>
+ <div class="card-body pt-0">
+ <p>Druid utilizes scatter/gather for high speed queries with
data preloaded into memory or local storage to avoid data movement and network
latency</p>
+ </div>
+ </div>
</div>
- <div class="feature">
- <span class="fa fa-cloud fa"></span>
- <h5>Deploy in public, private, and hybrid clouds</h5>
- <p>
- Druid can be deployed in any *NIX environment on commodity
hardware, both in the cloud and on premise. Deploying Druid is easy: scaling up
and down is as simple as adding and removing Druid services.
- </p>
+ <div class="col-md-4">
+ <div class="card card-margin">
+ <div class="card-header no-border">
+ <h5 class="card-title">Tiering & QoS</h5>
+ </div>
+ <div class="card-body pt-0">
+ <p>Configurable tiering with quality of service enables the
ideal price-performance for mixed workloads, guarantees priority, and avoids
resource contention</p>
+ </div>
+ </div>
+ </div>
+ <div class="col-md-4">
+ <div class="card card-margin">
+ <div class="card-header no-border">
+ <h5 class="card-title">Optimized Data Format</h5>
+ </div>
+ <div class="card-body pt-0">
+ <p>Ingested data is automatically columnarized, time
indexed, dictionary encoded, bitmap indexed, and type-aware compressed</p>
+ </div>
+ </div>
+ </div>
+ <div class="col-md-4">
+ <div class="card card-margin">
+ <div class="card-header no-border">
+ <h5 class="card-title">Elastic Architecture</h5>
+ </div>
+ <div class="card-body pt-0">
+ <p>Loosely coupled components for ingestion, queries and
orchestration combined with a deep storage layer enable easy & quick scale-up &
scale-out</p>
+ </div>
+ </div>
+ </div>
+ <div class="col-md-4">
+ <div class="card card-margin">
+ <div class="card-header no-border">
+ <h5 class="card-title">True Stream Ingestion</h5>
+ </div>
+ <div class="card-body pt-0">
+ <p>A connector-free integration with streaming platforms
enables query-on-arrival, high scalability, low latency, and guaranteed
consistency</p>
+ </div>
+ </div>
+ </div>
+ <div class="col-md-4">
+ <div class="card card-margin">
+ <div class="card-header no-border">
+ <h5 class="card-title">Non-stop Reliability</h5>
+ </div>
+ <div class="card-body pt-0">
+ <p>Automatic data services including continuous backup,
automated recovery, and multi-node replication ensure high availability and
durability</p>
+ </div>
+ </div>
</div>
</div>
+
<h2>
Learn more
@@ -185,7 +240,7 @@
<span class="fa fa-power-off fa"></span>
<h5>Powered By</h5>
<p>
- Druid is proven in production at the <a
href='/druid-powered'>world’s leading companies</a> at massive scale.
+ Druid is proven in production at the <a
href='/druid-powered'>world's leading companies</a> at massive scale.
</p>
</div>
<div class="feature">
@@ -209,6 +264,13 @@
Get help from a <a href='/community/'>wide network of community
members</a> about using Druid.
</p>
</div>
+ <div class="feature">
+ <span class="fa fa-podcast fa"></span>
+ <h5>Podcast</h5>
+ <p>
+ Hear from the Druid community on <a
href="https://podcasts.apple.com/us/podcast/tales-at-scale/id1655951714">Apple</a>,
<a href="https://open.spotify.com/show/6KAKYLJvCVegsFfKvbfDnt">Spotify</a>,
and <a
href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5saWJzeW4uY29tLzQ0ODE3OS9yc3M">Google</a>.
+ </p>
+ </div>
</div>
</div>
@@ -413,47 +475,42 @@
</h3>
<p>
- <a
href="https://blog.hellmar-becker.de/2023/01/22/apache-druid-data-lifecycle-management/">
- <span class="title">Apache Druid: Data Lifecycle Management</span><br>
- <span class="text-muted">Hellmar Becker - </span>
- <span class="text-muted">Imply</span><br>
- <span class="text-muted">Jan 22 2023</span>
+ <a
href="https://imply.io/blog/real-time-analytics-database-uses-partitioning-and-pruning-to-achieve-its-legendary-performance/">
+ <span class="title">Primary and secondary partitioning</span><br>
+ <span class="text-muted">Sergio Ferragut</span><br>
+ <span class="text-muted">Jan 27 2023</span>
</a>
</p>
<p>
- <a
href="https://blog.hellmar-becker.de/2022/03/20/druid-data-cookbook-quantiles-in-druid-with-datasketches/">
- <span class="title">Druid Data Cookbook: Quantiles in Druid with Data
Sketches</span><br>
- <span class="text-muted">Hellmar Becker - </span>
- <span class="text-muted">Imply</span><br>
- <span class="text-muted">Mar 20 2022</span>
+ <a
href="https://devops.com/stream-big-think-bigger-analyze-streaming-data-at-scale/">
+ <span class="title">Using Apache Druid for analyzing streaming
data</span><br>
+ <span class="text-muted">Julia Brouillette</span><br>
+ <span class="text-muted">Jan 27 2023</span>
</a>
</p>
<p>
- <a
href="https://blog.hellmar-becker.de/2022/02/09/druid-data-cookbook-ingestion-transforms/">
- <span class="title">Druid Data Cookbook: Ingestion Transforms</span><br>
- <span class="text-muted">Hellmar Becker - </span>
- <span class="text-muted">Imply</span><br>
- <span class="text-muted">Feb 9 2022</span>
+ <a href="https://www.youtube.com/watch?v=Bozxc3vP1PA">
+ <span class="title">Why Confluent analyzes Kafka streams with
Druid</span><br>
+ <span class="text-muted">Matt Armstrong</span><br>
+ <span class="text-muted">Dec 15 2022</span>
</a>
</p>
<p>
- <a href="https://imply.io/blog/multi-dimensional-range-partitioning/">
- <span class="title">Multi-dimensional range partitioning</span><br>
- <span class="text-muted">Kashif Faraz - </span>
- <span class="text-muted">Imply</span><br>
- <span class="text-muted">Feb 4 2022</span>
+ <a
href="https://imply.io/blog/native-support-for-semi-structured-data-in-apache-druid/">
+ <span class="title">Support for nested JSON columns in Druid</span><br>
+ <span class="text-muted">Karthik Kasibhatla</span><br>
+ <span class="text-muted">Dec 14 2022</span>
</a>
</p>
<p>
- <a
href="https://www.rilldata.com/blog/seeking-the-perfect-apache-druid-rollup">
- <span class="title">Seeking the Perfect Apache Druid Rollup</span><br>
- <span class="text-muted">Neil Buesing - </span>
- <span class="text-muted">Rill Data</span><br>
- <span class="text-muted">Dec 16 2021</span>
+ <a
href="https://imply.io/videos/apache-druids-fit-in-the-modern-data-stack/">
+ <span class="title">Apache Druid's fit in the modern data
stack</span><br>
+ <span class="text-muted">David Wang</span><br>
+ <span class="text-muted">Dec 2 2022</span>
</a>
</p>
diff --git a/technology.html b/technology.html
index 5a93aa17..6b6ef72d 100644
--- a/technology.html
+++ b/technology.html
@@ -125,183 +125,75 @@
<div class="container">
<div class="row">
<div class="col-md-10 col-md-offset-1">
- <p>Apache Druid is an open source distributed data store.
-Druid’s core design combines ideas from <a
href="https://en.wikipedia.org/wiki/Data_warehouse">data warehouses</a>, <a
href="https://en.wikipedia.org/wiki/Time_series_database">timeseries
databases</a>, and <a
href="https://en.wikipedia.org/wiki/Full-text_search">search systems</a> to
create a high performance real-time analytics database for a broad range of <a
href="/use-cases">use cases</a>. Druid merges key characteristics of each of
the 3 systems into its ingestion layer, storage fo [...]
+ <p>Apache Druid is used to power real-time analytics applications that
require fast queries at scale and under load on streaming and batch data. Druid
features a unique distributed architecture across its ingestion, storage, and
query layer to handle the scale needed for large aggregations with the
performance needed for applications.</p>
+
+<h2 id="architecture">Architecture</h2>
<div class="image-large">
- <img src="img/diagram-2.png" style="max-width: 360px">
+ <img src="img/diagram-7.png" style="max-width: 800px;">
</div>
-<p>Key features of Druid include:</p>
-
-<div class="features">
- <div class="feature">
- <span class="fa fa-columns fa"></span>
- <h5>Column-oriented storage</h5>
- <p>
- Druid stores and compresses each column individually, and only needs to
read the ones needed for a particular query, which supports fast scans,
rankings, and groupBys.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-search fa"></span>
- <h5>Native search indexes</h5>
- <p>
- Druid creates inverted indexes for string values for fast search and
filter.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-tint fa"></span>
- <h5>Streaming and batch ingest</h5>
- <p>
- Out-of-the-box connectors for Apache Kafka, HDFS, AWS S3, stream
processors, and more.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-stream fa"></span>
- <h5>Flexible schemas</h5>
- <p>
- Druid gracefully handles evolving schemas and <a
href="/docs/latest/ingestion/data-formats.html#flattenspec">nested data</a>.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-clock fa"></span>
- <h5>Time-optimized partitioning</h5>
- <p>
- Druid intelligently partitions data based on time and time-based queries
are significantly faster than traditional databases.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-align-left fa"></span>
- <h5>SQL support</h5>
- <p>
- In addition to its native <a href="/docs/latest/querying/querying">JSON
based language</a>, Druid speaks <a href="/docs/latest/querying/sql">SQL</a>
over either HTTP or JDBC.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-expand fa"></span>
- <h5>Horizontal scalability</h5>
- <p>
- Druid has been <a href="druid-powered">used in production</a> to ingest
millions of events/sec, retain years of data, and provide sub-second queries.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-balance-scale fa"></span>
- <h5>Easy operation</h5>
- <p>
- Scale up or down by just adding or removing servers, and Druid
automatically rebalances. Fault-tolerant architecture routes around server
failures.
- </p>
- </div>
-</div>
+<p>Druid has a services-based architecture that consists of independently
scalable services for ingestion, querying, and orchestration, each of which can
be fine-tuned to optimize cluster resources for mixed use cases and workloads.
For example, more resources can be directed to Druid’s query service and
fewer to ingestion as workloads change. Druid services can
fail without affecting the operations of other services.</p>
-<h2 id="integration">Integration</h2>
+<p>A Druid deployment is a scalable cluster of commodity hardware with node
types that serve specific functions. In a small configuration, all of these
nodes can run together on a single server (or even a laptop). For larger
deployments, one or more servers are dedicated to each node type and can scale
to thousands of nodes for higher throughput requirements.</p>
-<p>Druid is complementary to many open source data technologies in the <a
href="https://www.apache.org/">Apache Software Foundation</a> including <a
href="https://kafka.apache.org/">Apache Kafka</a>, <a
href="https://hadoop.apache.org/">Apache Hadoop</a>, <a
href="https://flink.apache.org/">Apache Flink</a>, and more.</p>
+<ul style="margin-left: 20px;">
+ <li>Master Nodes govern data availability and ingestion</li>
+ <li>Query Nodes accept queries, manage execution across the system, and
return the results</li>
+ <li>Data Nodes execute ingestion workloads and queries as well as store
queryable data</li>
+</ul>
-<p>Druid typically sits between a storage or processing layer and the end
user, and acts as a query layer to serve analytic workloads.</p>
+<p>In addition, Druid utilizes a deep storage layer - cloud object storage or
HDFS - that contains an additional copy of each data segment. It enables
background data movement between Druid processes and also provides a highly
durable data source to recover from system failures.</p>
-<div class="image-large">
- <img src="img/diagram-3.png" style="max-width: 580px;">
-</div>
+<p>For more information, please visit <a
href="/docs/latest/design/index.html">our docs page</a>.</p>
-<h2 id="ingestion">Ingestion</h2>
+<h2 id="ingestion-layer">Ingestion Layer</h2>
-<p>Druid supports both streaming and batch ingestion.
-Druid connects to a source of raw data, typically a message bus such as Apache
Kafka (for streaming data loads), or a distributed filesystem such as HDFS (for
batch data loads).</p>
+<p>In Druid, ingestion, sometimes called indexing, is the process of loading
data into tables. Druid reads data from source systems, whether files or
streams, and stores the data in segments.</p>
-<p>Druid converts raw data stored in a source to a more read-optimized format
(called a Druid “segment”) in a process calling “indexing”.</p>
+<p>When data is ingested into Druid, it is automatically indexed, partitioned,
and, optionally, partially pre-aggregated (known as <a
href="https://druid.apache.org/docs/latest/tutorials/tutorial-rollup.html">"rollup"</a>).
Compressed bitmap indexes enable fast filtering and searching across multiple
columns. Data is partitioned by time and, optionally, by other dimensions.</p>
<div class="image-large">
- <img src="img/diagram-4.png" style="max-width: 580px;">
+ <img alt="Stream Ingestion Layer" src="img/ingestion_layer_stream_batch.png"
style="max-width: 580px;">
</div>
-<p>For more information, please visit <a
href="/docs/latest/ingestion/index.html">our docs page</a>.</p>
-
-<h2 id="storage">Storage</h2>
+<h3>Stream Data</h3>
-<p>Like many analytic data stores, Druid stores data in columns.
-Depending on the type of column (string, number, etc), different compression
and encoding methods are applied.
-Druid also builds different types of indexes based on the column type.</p>
+<p>Druid was designed from the outset for rapid ingestion and immediate
querying of stream data upon delivery. No connectors are needed as Druid
includes inherent exactly-once ingestion for data streams using Apache Kafka®
and Amazon Kinesis APIs. Druid’s continuous backup into deep storage also
ensures a zero RPO for stream data.</p>
-<p>Similar to search systems, Druid builds inverted indexes for string columns
for fast search and filter.
-Similar to timeseries databases, Druid intelligently partitions data by time
to enable fast time-oriented queries.</p>
+<h3>Batch Data</h3>
-<p>Unlike many traditional systems, Druid can optionally pre-aggregate data as
it is ingested.
-This pre-aggregation step is known as <a
href="/docs/latest/tutorials/tutorial-rollup.html">rollup</a>, and can lead to
dramatic storage savings.</p>
+<p>Druid easily ingests data from object stores including HDFS, Amazon S3,
Azure Blob, and Google Cloud Storage as well as data files from databases and
other sources. The data files can be in a number of common formats, including
JSON, CSV, TSV, Parquet, ORC, Avro, and Protobuf. Druid supports both SQL batch
import and in-database transformations.</p>
-<div class="image-large">
- <img src="img/diagram-5.png" style="max-width: 800px;">
-</div>
+<p>For more information, please visit <a
href="/docs/latest/ingestion/index.html">our docs page</a>.</p>
-<p>For more information, please visit <a
href="/docs/latest/design/segments.html">our docs page</a>.</p>
+<h2 id="storage-format">Storage Format</h2>
-<h2 id="querying">Querying</h2>
+<p>Druid stores data in segments. Each segment is a single file, typically
comprising up to a few million rows of data. Each Druid table can have anywhere
from one segment to millions of segments distributed across the cluster.</p>
-<p>Druid supports querying data through <a
href="/docs/latest/querying/querying">JSON-over-HTTP</a> and <a
href="/docs/latest/querying/sql">SQL</a>.
-In addition to standard SQL operators, Druid supports unique operators that
leverage its suite of approximate algorithms to provide rapid counting,
ranking, and quantiles.</p>
+<p>Within the segments, data storage is column-oriented. Queries only load the
specific columns needed for each request. Each column’s storage is optimized by
data type, which further improves the performance of scans and aggregations.
String columns are stored using compressed dictionary encoding, while numeric
columns are stored using compressed raw values.</p>
<div class="image-large">
- <img src="img/diagram-6.png" style="max-width: 580px;">
+ <img alt="Graphical User Interface, Application"
src="img/graphical_ui_application_v2.png" style="max-width: 580px;">
</div>
-<p>For more information, please visit <a
href="/docs/latest/querying/querying.html">our docs page</a>.</p>
-
-<h2 id="architecture">Architecture</h2>
+<p>For more information, please visit <a
href="/docs/latest/design/segments.html">our docs page</a>.</p>
-<p>Druid has a microservice-based architecture can be thought of as a
disassembled database.
-Each core service in Druid (ingestion, querying, and coordination) can be
separately or jointly deployed on commodity hardware.</p>
+<h2 id="interactive-queries">Interactive Queries</h2>
-<p>Druid explicitly names every main service to allow the operator to fine
tune each service based on the use case and workload.
-For example, an operator can dedicate more resources to Druid’s ingestion
service while giving less resources to Druid’s query service if the workload
requires it.</p>
+<p>Druid's interactive query engine is utilized for performance-sensitive
queries. The query engine and storage format were designed together to provide
maximum query performance using the fewest resources possible (as well as the
best price for performance for mixed workloads). </p>
-<p>Druid services can independently fail without impacting the operations of
other services.</p>
+<p>With this engine, Druid only reads from segments that are pre-loaded into
memory or local storage in the data nodes. This ensures fast performance as
data is co-located with computing resources and does not have to move across a
network. Data is then queried using scatter/gather for optimal
parallelization.</p>
<div class="image-large">
- <img src="img/diagram-7.png" style="max-width: 800px;">
+ <img alt="Interactive Querying Scatter Gather Diagram"
src="img/scatter_gather_diagram.png" style="max-width: 580px;">
</div>
-<p>For more information, please visit <a
href="/docs/latest/design/index.html">our docs page</a>.</p>
-
-<h2 id="operations">Operations</h2>
+<p>First, the query engine prunes the list of segments, creating a list of
which segments are relevant to the query based on time intervals and other
filters. Next, queries are divided into discrete pieces and sent in parallel
to the data nodes that are managing each relevant segment or copy of that
segment (“scatter”). On the data nodes, the sub-queries are processed and sent
back to the query nodes to merge the final result set (“gather”). </p>
-<p>Druid is designed to power applications that need to be up 24 hours a day,
7 days a week.
-As such, Druid possesses several features to ensure uptime and no data
loss.</p>
+<p>Scatter/gather works from the smallest single server cluster (all of Druid
on one server) to clusters with thousands of servers, enabling sub-second
performance for most queries even with very large data sets of multiple
petabytes.</p>
-<div class="features">
- <div class="feature">
- <span class="fa fa-clone fa"></span>
- <h5>Data replication</h5>
- <p>
- All data in Druid is replicated a configurable number of times so single
server failures have no impact on queries.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-th-large fa"></span>
- <h5>Independent services</h5>
- <p>
- Druid explicitly names all of its main services and each service can be
fine tuned based on use case.
- Services can independently fail without impacting other services.
- For example, if the ingestion services fails, no new data is loaded in
the system, but existing data remains queryable.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-cloud-download-alt fa"></span>
- <h5>Automatic data backup</h5>
- <p>
- Druid automatically backs up all indexed data to a filesystem such as
HDFS.
- You can lose your entire Druid cluster and quickly restore it from this
backed up data.
- </p>
- </div>
- <div class="feature">
- <span class="fa fa-sync-alt fa"></span>
- <h5>Rolling updates</h5>
- <p>
- You can update a Druid cluster with no downtime and no impact to end
users through rolling updates.
- All Druid releases are backwards compatible with the previous version.
- </p>
- </div>
-</div>
-
-<p>For more information, please visit <a
href="/docs/latest/operations/basic-cluster-tuning.html">our docs page</a>.</p>
+<p>For more information, please visit <a
href="/docs/latest/querying/querying.html">our docs page</a>.</p>
</div>
</div>
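
Note for readers of this commit: the scatter/gather flow described in the new technology.html text (prune the segment list, send sub-queries to data nodes in parallel, merge the partial results) can be sketched in a few lines of Python. This is an illustrative sketch only, not Druid code. The Segment class, the function names, and the sample data are all hypothetical, and threads stand in for data nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a segment: a time range plus its rows."""
    start: int   # inclusive start of the segment's time interval
    end: int     # exclusive end of the segment's time interval
    rows: tuple  # (timestamp, value) pairs held by this segment


def prune(segments, q_start, q_end):
    """Keep only segments whose interval overlaps [q_start, q_end)."""
    return [s for s in segments if s.start < q_end and q_start < s.end]


def scan_segment(segment, q_start, q_end):
    """Sub-query run on a 'data node': partial sum and count for one segment."""
    values = [v for t, v in segment.rows if q_start <= t < q_end]
    return sum(values), len(values)


def scatter_gather(segments, q_start, q_end):
    """Prune, fan the sub-queries out in parallel, then merge the partials."""
    relevant = prune(segments, q_start, q_end)
    with ThreadPoolExecutor() as pool:  # threads play the role of data nodes
        partials = list(
            pool.map(lambda s: scan_segment(s, q_start, q_end), relevant)
        )
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return {"segments_scanned": len(relevant), "sum": total, "count": count}


# Assumed sample data: three segments covering consecutive time ranges.
segments = [
    Segment(0, 10, ((1, 5), (9, 7))),
    Segment(10, 20, ((12, 3), (18, 4))),
    Segment(20, 30, ((25, 100),)),
]
result = scatter_gather(segments, 5, 20)
print(result)
```

In this toy setup the third segment is pruned before any work is scheduled, which is the point of the pruning step: only segments that overlap the query's time interval are scanned.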