surekhasaharan commented on a change in pull request #6122: New docs intro
URL: https://github.com/apache/incubator-druid/pull/6122#discussion_r209023553
 
 

 ##########
 File path: docs/content/ingestion/overview.md
 ##########
 @@ -0,0 +1,279 @@
+---
+layout: doc_page
+---
+
+# Ingestion
+
+## Overview
+
+### Datasources and segments
+
+Druid data is stored in "datasources", which are similar to tables in a traditional RDBMS. Each datasource is
+partitioned by time and, optionally, further partitioned by other attributes. Each time range is called a "chunk" (for
+example, a single day, if your datasource is partitioned by day). Within a chunk, data is partitioned into one or more
+"segments". Each segment is a single file, typically comprising up to a few million rows of data. Since segments are
+organized into time chunks, it's sometimes helpful to think of segments as living on a timeline like the following:
+
+<img src="../../img/druid-timeline.png" width="800" />
+
+A datasource may have anywhere from just a few segments, up to hundreds of thousands and even millions of segments. Each
+segments starts life off being created on a MiddleManger, and at that point, is mutable and uncommitted. The segment
 
 Review comment:
   Is this info repeated? Same comments about "segment" and "MiddleManager" as above.
