jihoonson commented on a change in pull request #8311: Docusaurus build framework + ingestion doc refresh.
URL: https://github.com/apache/incubator-druid/pull/8311#discussion_r315428687
 
 

 ##########
 File path: docs/design/index.md
 ##########
 @@ -0,0 +1,100 @@
+---
+id: index
+title: "Introduction to Apache Druid"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+## What is Druid?
+
+Apache Druid (incubating) is a real-time analytics database designed for fast slice-and-dice analytics
+("[OLAP](http://en.wikipedia.org/wiki/Online_analytical_processing)" queries) on large data sets. Druid is most often
+used as a database for powering use cases where real-time ingest, fast query performance, and high uptime are important.
+As such, Druid is commonly used for powering GUIs of analytical applications, or as a backend for highly-concurrent APIs
+that need fast aggregations. Druid works best with event-oriented data.
+
+Common application areas for Druid include:
+
+- Clickstream analytics (web and mobile analytics)
+- Network telemetry analytics (network performance monitoring)
+- Server metrics storage
+- Supply chain analytics (manufacturing metrics)
+- Application performance metrics
+- Digital marketing/advertising analytics
+- Business intelligence / OLAP
+
+Druid's core architecture combines ideas from data warehouses, timeseries databases, and logsearch systems. Some of
+Druid's key features are:
+
+1. **Columnar storage format.** Druid uses column-oriented storage, meaning it only needs to load the exact columns
+needed for a particular query.  This gives a huge speed boost to queries that only hit a few columns. In addition, each
+column is stored optimized for its particular data type, which supports fast scans and aggregations.
+2. **Scalable distributed system.** Druid is typically deployed in clusters of tens to hundreds of servers, and can
+offer ingest rates of millions of records/sec, retention of trillions of records, and query latencies of sub-second to a
+few seconds.
+3. **Massively parallel processing.** Druid can process a query in parallel across the entire cluster.
+4. **Realtime or batch ingestion.** Druid can ingest data either real-time (ingested data is immediately available for
+querying) or in batches.
+5. **Self-healing, self-balancing, easy to operate.** As an operator, to scale the cluster out or in, simply add or
+remove servers and the cluster will rebalance itself automatically, in the background, without any downtime. If any
+Druid servers fail, the system will automatically route around the damage until those servers can be replaced. Druid
+is designed to run 24/7 with no need for planned downtimes for any reason, including configuration changes and software
+updates.
+6. **Cloud-native, fault-tolerant architecture that won't lose data.** Once Druid has ingested your data, a copy is
+stored safely in [deep storage](#deep-storage) (typically cloud storage, HDFS, or a shared filesystem). Your data can be
 
 Review comment:
   Broken link: the `[deep storage](#deep-storage)` link on this line.
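
   In-page anchors like `#deep-storage` can be checked mechanically. The sketch below is only an illustration, not part of the Druid docs build: `slugify` and `broken_anchors` are hypothetical helper names, and the slug rule (lowercase, drop punctuation, spaces to hyphens) is an assumed approximation of GitHub-style heading anchors that may differ from what Docusaurus generates.

```python
import re


def slugify(heading: str) -> str:
    """Approximate a GitHub-style heading slug (assumed rule, not Docusaurus's exact one)."""
    text = heading.strip().lower()
    # Drop everything except word characters, hyphens, and spaces.
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")


def broken_anchors(markdown: str) -> list:
    """Return in-page anchors that are linked to but match no heading slug."""
    # ATX headings: one to six '#' characters followed by the heading text.
    headings = re.findall(r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown, flags=re.MULTILINE)
    slugs = {slugify(h) for h in headings}
    # Markdown links whose target is an in-page anchor, e.g. [text](#anchor).
    anchors = re.findall(r"\]\(#([^)\s]+)\)", markdown)
    return [a for a in anchors if a not in slugs]


# A two-line excerpt modelled on the hunk above: one heading, one anchor link.
sample = (
    "## What is Druid?\n"
    "\n"
    "stored safely in [deep storage](#deep-storage) (typically cloud storage)\n"
)
print(broken_anchors(sample))
```

   Running this over the whole of `docs/design/index.md` rather than the excerpt would show whether any heading in the file produces the `deep-storage` anchor.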

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
