This is an automated email from the ASF dual-hosted git repository.

mcvsubbu pushed a commit to branch 0.2.0
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git
commit 43214a82dbcd6f53bb4d1e5143eaf5016523b990
Author: Subbu Subramaniam <[email protected]>
AuthorDate: Mon Nov 4 14:22:23 2019 -0800

    Manual merge of 49a65a3987a78fce2a0851a65eb65762b4f3003d for docs

    The original commit included a doc file and source files that were not
    related to the documentation change. Doing a manual merge of the
    documentation file alone
---
 docs/architecture.rst | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/docs/architecture.rst b/docs/architecture.rst
index 7c285ca..c7fc486 100644
--- a/docs/architecture.rst
+++ b/docs/architecture.rst
@@ -84,7 +84,7 @@ Pinot Tables
 ------------
 
 Pinot supports realtime, or offline, or hybrid tables. Data in Pinot tables is contained in the segments
-belonging to that table. A Pinot table is modeled as a Helix resource. Each segment of a table is modeled as a Helix Partition,
+belonging to that table. A Pinot table is modeled as a Helix resource. Each segment of a table is modeled as a Helix Partition.
 
 Table Schema defines column names and their metadata. Table configuration and schema is stored in zookeeper.
 
@@ -109,7 +109,7 @@ Ingesting Offline data
 Segments for offline tables are constructed outside of Pinot, typically in Hadoop via map-reduce jobs and
 ingested into Pinot via REST API provided by the Controller. Pinot provides libraries to create Pinot segments
 out of input files in AVRO, JSON or CSV formats in a hadoop job, and push
-the constructed segments to the controlers via REST APIs.
+the constructed segments to the controllers via REST APIs.
 
 When an Offline segment is ingested, the controller looks up the table's configuration and assigns the segment
 to the servers that host the table. It may assign multiple servers for each segment depending on the number of replicas
@@ -126,7 +126,7 @@ start include the new segments for queries.
 
 Brokers support different routing strategies depending on the type of table, the segment assignment strategy and
 the use case.
 
-Data in offline segments are immmutable (Rows cannot be added, deleted, or modified). However, segments may be replaced with modified data.
+Data in offline segments are immutable (Rows cannot be added, deleted, or modified). However, segments may be replaced with modified data.
 
 .. _ingesting-realtime-data:
 
@@ -143,17 +143,17 @@
 including all partitions (or, just from one partition).
 
 A pinot table can be configured to consume from streams in one of two modes:
 
 * ``LowLevel``: This is the preferred mode of consumption. Pinot creates independent partition-level consumers for
-  each partition. Depending on the the configured number of replicas, multiple consumers may be created for
+  each partition. Depending on the the configured number of replicas, multiple consumers may be created for
   each partition, taking care that no two replicas exist on the same server host. Therefore you need to provision
   *at least* as many hosts as the number of replcias configured.
 
 * ``HighLevel``: Pinot creates *one* stream-level consumer that consumes from all partitions. Each message consumed
   could be from any of the partitions of the stream. Depending on the configured number of replicas, multiple
-  stream-level consumers are created, taking care that no two replicas exist on the same server host. Therefore
+  stream-level consumers are created, taking care that no two replicas exist on the same server host. Therefore
   you need to provision exactly as many hosts as the number of replicas configured.
 
 Of course, the underlying stream should support either mode of consumption in order for a Pinot table to use that
-mode. Kafka has support for both of these modes. See :ref:`pluggable-streams` for more information on support of other
+mode. Kafka has support for both of these modes.
See :ref:`pluggable-streams` for more information on support of other
 data streams in Pinot.
 
 In either mode, Pinot servers store the ingested rows in volatile memory until either one of the following
 conditions are met:
 
@@ -179,7 +179,7 @@ easy and automated mechanisms for replacing pinot servers, or expanding capacity
 that ensure that the completed segment is equivalent across all replicas. In ``HighLevel`` mode, the servers
 persist the consumed rows into local store (and **not** the segment store). Since consumption of rows
-can be from any partition, it is not possible to guarantee equivalence of segments across replicas.
+can be from any partition, it is not possible to guarantee equivalence of segments across replicas.
 
 See `Consuming and Indexing rows in Realtime <https://cwiki.apache.org/confluence/display/PINOT/Consuming+and+Indexing+rows+in+Realtime>`_
 for details.
 
@@ -187,10 +187,12 @@ See `Consuming and Indexing rows in Realtime <https://cwiki.apache.org/confluenc
 
 Pinot Segments
 --------------
 
-A segment is laid out in a columnar format
-so that it can be directly mapped into memory for serving queries. Columns may be single or multi-valued. Column types may be
+A segment is laid out in a columnar format so that it can be directly mapped into memory for serving queries.
+
+Columns may be single or multi-valued. Column types may be
 STRING, INT, LONG, FLOAT, DOUBLE or BYTES. Columns may be declared to be metric or dimension (or specifically as a time dimension)
-in the schema.
+in the schema. Columns can have default null value. For example, the default null value of a integer column can be 0.
+Note: The default value of byte column has to be hex-encoded before adding to the schema.
 
 Pinot uses dictionary encoding to store values as a dictionary ID. Columns may be configured to be "no-dictionary" column in which
 case raw values are stored.
 Dictionary IDs are encoded using minimum number of bits for efficient storage (*e.g.* a column with cardinality
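The dictionary-encoding point in the diff above (dictionary IDs stored with the minimum number of bits for a column's cardinality) can be illustrated with a short sketch. This is a hypothetical helper for illustration only, not Pinot's actual implementation: a column with cardinality N needs ceil(log2(N)) bits per dictionary ID.

```python
# Hypothetical illustration of dictionary-ID bit-packing; not Pinot source code.
def bits_per_dictionary_id(cardinality: int) -> int:
    """Minimum bits needed to represent dictionary IDs 0..cardinality-1."""
    if cardinality <= 1:
        return 1  # even a constant column uses at least one bit per value
    # IDs range from 0 to cardinality-1, so the widest ID sets the bit width
    return (cardinality - 1).bit_length()

# A column with cardinality 3 has IDs 0, 1, 2 and needs only 2 bits per value,
# versus 32 bits for a raw INT column.
print(bits_per_dictionary_id(3))       # 2
print(bits_per_dictionary_id(100000))  # 17
```

This is why dictionary encoding pays off for low-cardinality columns, while "no-dictionary" (raw) storage can be preferable for high-cardinality ones.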
