[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-22 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r268309823
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+..
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..   http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+..
+
+Table Config
+============
+
+Table Config
+------------
+
+Introduction to table configs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
 
 Review comment:
   Up to you. I'd try to present a consistent format throughout your 
documentation though.
   
   For example, rather than giving the paragraph style, you could include some 
information in your description of the possible options for `tableType` below.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266700535
 
 

 ##
 File path: docs/in_production.rst
 ##
 @@ -64,4 +67,32 @@ Configuring realtime data ingestion
 Monitoring Pinot
 
 
+For Pinot to provide effective service, a core set of metrics should be 
monitored to ensure service stability, fault tolerance and acceptable response 
times. The service-level metrics recommended for monitoring are listed below.
+
+More info on metrics collection, and the complete set of available metrics, 
can be found in the `Metrics `_ section.
+
+Pinot Server
+
+* Missing Segments - Number of missing segments - `NUM_MISSING_SEGMENTS 
`_
 
 Review comment:
   Probably good to give units as well where applicable (for example for 
consumption status, boolean might be confusing)





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266704832
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+============
+
+Table Config
+------------
+
+Introduction to table configs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
+* Segment Push - Using ``segmentPushFrequency`` to indicate how frequently 
segments are uploaded.
+* Replication - Using ``replication`` for offline tables and 
``replicasPerPartition`` for realtime tables will indicate how many replicas of 
data will be present.
+* Schema - The name of the schema that's been uploaded to the controller
+* Time column - using ``timeColumnName`` and ``timeType``, this must match 
what's configured in the preceding schema
+* Segment assignment strategy - Described more on the page `Customizing Pinot 
`_
+
+
+.. code-block:: none
+
+"segmentsConfig": {
+  "retentionTimeUnit": "DAYS",
+  "retentionTimeValue": "5",
+  "segmentPushFrequency": "daily",
+  "segmentPushType": "APPEND",
+  "replication": "3",
+  "replicasPerPartition": "3",
+  "schemaName": "ugcGestureEvents",
+  "timeColumnName": "daysSinceEpoch",
+  "timeType": "DAYS",
+  "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
+},
+
+Table Index Config Section
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``tableIndexConfig`` section has information about how to configure:
+
+* Inverted Indexes - Using the ``invertedIndexColumns`` to specify a list of 
real column names as specified in the schema.
+* No Dictionary Columns - Using the ``noDictionaryColumns`` to specify a list 
of real column names as specified in the schema. Column names present will NOT 
have a dictionary created. More info on indexes can be found on the `Index 
Techniques `_ page.
+* Sorted Column - Using the ``sortedColumn`` to specify a list of real column 
names as specified in the schema.
+* Aggregate Metrics - Using ``aggregateMetrics`` set to ``"true"`` to enable 
the feature and ``"false"`` to disable. This feature is only available on 
REALTIME tables.
+* Data Partitioning Strategy using the ``segmentPartitionConfig`` to configure 

[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266701093
 
 

 ##
 File path: docs/in_production.rst
 ##
 @@ -64,4 +67,32 @@ Configuring realtime data ingestion
 Monitoring Pinot
 
 
+For Pinot to provide effective service, a core set of metrics should be 
monitored to ensure service stability, fault tolerance and acceptable response 
times. The service-level metrics recommended for monitoring are listed below.
+
+More info on metrics collection, and the complete set of available metrics, 
can be found in the `Metrics `_ section.
+
+Pinot Server
+
+* Missing Segments - Number of missing segments - `NUM_MISSING_SEGMENTS 
`_
+* Query latency - Latency from the time a server receives a request to when it 
sends a response - `TOTAL_QUERY_TIME 
`_
+* Query Execution Exceptions - The number of exceptions which might have 
occurred during query execution - `QUERY_EXECUTION_EXCEPTIONS 
`_
+* Realtime Consumption Status - It's important to ensure at least a single 
replica of each partition is consuming - `LLC_PARTITION_CONSUMING 
`_
+* Realtime Highest Offset Consumed - `HIGHEST_STREAM_OFFSET_CONSUMED 
`_
+
+Pinot Broker
+
+* Incoming QPS (per broker) - `QUERIES 
`_
+* Dropped Requests - `REQUEST_DROPPED_DUE_TO_SEND_ERROR 
`_,
 `REQUEST_DROPPED_DUE_TO_CONNECTION_ERROR 
`_,
 `REQUEST_DROPPED_DUE_TO_ACCESS_ERROR 
`_
+* Partial Responses - `BROKER_RESPONSES_WITH_PARTIAL_SERVERS_RESPONDED 
`_
+* Table QPS quota exceeded - `QUERY_QUOTA_EXCEEDED 
`_
+* Table QPS quota usage percent - `QUERY_QUOTA_CAPACITY_UTILIZATION_RATE 
`_
+
+Pinot Controller
+
+* Missing Segment Count -
 
 Review comment:
   Gaps in time for segment coverage, I think?
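   If that reading is right, the metric can be thought of as counting holes in the time ranges that a table's segments cover. A minimal Python sketch of the idea follows. The `(start, end)` tuples and the `find_coverage_gaps` helper are hypothetical and are not taken from Pinot's controller code.

```python
from typing import List, Tuple

# Hypothetical representation of a segment's time range:
# (start, end) with an inclusive start and an exclusive end.
Interval = Tuple[int, int]


def find_coverage_gaps(segments: List[Interval]) -> List[Interval]:
    """Return the ranges between the earliest start and the latest end
    that are not covered by any segment."""
    gaps: List[Interval] = []
    covered_until = None
    for start, end in sorted(segments):
        # A segment that starts after everything seen so far leaves a hole.
        if covered_until is not None and start > covered_until:
            gaps.append((covered_until, start))
        covered_until = end if covered_until is None else max(covered_until, end)
    return gaps
```

   With daily segments for days 0, 1 and 4, the sketch would report one gap spanning days 2 and 3. Whether the controller metric counts gaps or individual missing segments is something the description should state.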





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266703680
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+============
+
+Table Config
+------------
+
+Introduction to table configs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
+* Segment Push - Using ``segmentPushFrequency`` to indicate how frequently 
segments are uploaded.
+* Replication - Using ``replication`` for offline tables and 
``replicasPerPartition`` for realtime tables will indicate how many replicas of 
data will be present.
+* Schema - The name of the schema that's been uploaded to the controller
+* Time column - using ``timeColumnName`` and ``timeType``, this must match 
what's configured in the preceding schema
+* Segment assignment strategy - Described more on the page `Customizing Pinot 
`_
+
+
+.. code-block:: none
+
+"segmentsConfig": {
+  "retentionTimeUnit": "DAYS",
+  "retentionTimeValue": "5",
+  "segmentPushFrequency": "daily",
+  "segmentPushType": "APPEND",
+  "replication": "3",
+  "replicasPerPartition": "3",
+  "schemaName": "ugcGestureEvents",
+  "timeColumnName": "daysSinceEpoch",
+  "timeType": "DAYS",
+  "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
+},
+
+Table Index Config Section
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``tableIndexConfig`` section has information about how to configure:
+
+* Inverted Indexes - Using the ``invertedIndexColumns`` to specify a list of 
real column names as specified in the schema.
+* No Dictionary Columns - Using the ``noDictionaryColumns`` to specify a list 
of real column names as specified in the schema. Column names present will NOT 
have a dictionary created. More info on indexes can be found on the `Index 
Techniques `_ page.
+* Sorted Column - Using the ``sortedColumn`` to specify a list of real column 
names as specified in the schema.
 
 Review comment:
   Does this only affect realtime?



[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266701213
 
 

 ##
 File path: docs/in_production.rst
 ##
 @@ -64,4 +67,32 @@ Configuring realtime data ingestion
 Monitoring Pinot
 
 
+For Pinot to provide effective service, a core set of metrics should be 
monitored to ensure service stability, fault tolerance and acceptable response 
times. The service-level metrics recommended for monitoring are listed below.
+
+More info on metrics collection, and the complete set of available metrics, 
can be found in the `Metrics `_ section.
+
+Pinot Server
+
+* Missing Segments - Number of missing segments - `NUM_MISSING_SEGMENTS 
`_
+* Query latency - Latency from the time a server receives a request to when it 
sends a response - `TOTAL_QUERY_TIME 
`_
+* Query Execution Exceptions - The number of exceptions which might have 
occurred during query execution - `QUERY_EXECUTION_EXCEPTIONS 
`_
+* Realtime Consumption Status - It's important to ensure at least a single 
replica of each partition is consuming - `LLC_PARTITION_CONSUMING 
`_
+* Realtime Highest Offset Consumed - `HIGHEST_STREAM_OFFSET_CONSUMED 
`_
+
+Pinot Broker
+
+* Incoming QPS (per broker) - `QUERIES 
`_
+* Dropped Requests - `REQUEST_DROPPED_DUE_TO_SEND_ERROR 
`_,
 `REQUEST_DROPPED_DUE_TO_CONNECTION_ERROR 
`_,
 `REQUEST_DROPPED_DUE_TO_ACCESS_ERROR 
`_
+* Partial Responses - `BROKER_RESPONSES_WITH_PARTIAL_SERVERS_RESPONDED 
`_
+* Table QPS quota exceeded - `QUERY_QUOTA_EXCEEDED 
`_
+* Table QPS quota usage percent - `QUERY_QUOTA_CAPACITY_UTILIZATION_RATE 
`_
+
+Pinot Controller
+
+* Missing Segment Count -
+* Segments in Error State -
 
 Review comment:
   Hopefully we have some information somewhere about what ERROR state means? 
or how to find what caused the ERROR state (ie instance errors) -- it would be 
good to link to that here





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266703153
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+============
+
+Table Config
+------------
+
+Introduction to table configs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
 
 Review comment:
   * Allowed values or examples as applicable
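   For instance, the bullet could list the accepted units next to a worked value. The sketch below assumes, without confirmation from this hunk, that `retentionTimeUnit` takes the names of Java's `java.util.concurrent.TimeUnit` constants and that `retentionTimeValue` is an integer passed as a string, as in the sample config further down the file. `retention_seconds` is a hypothetical helper, not a Pinot API.

```python
# Assumed units: a subset of the java.util.concurrent.TimeUnit constant names.
SECONDS_PER_UNIT = {
    "SECONDS": 1,
    "MINUTES": 60,
    "HOURS": 60 * 60,
    "DAYS": 24 * 60 * 60,
}


def retention_seconds(segments_config: dict) -> int:
    """Convert retentionTimeUnit / retentionTimeValue into seconds."""
    unit = segments_config["retentionTimeUnit"].upper()
    if unit not in SECONDS_PER_UNIT:
        raise ValueError(f"unsupported retentionTimeUnit: {unit!r}")
    # The sample config quotes the value, so parse it from a string.
    value = int(segments_config["retentionTimeValue"])
    if value <= 0:
        raise ValueError("retentionTimeValue must be a positive integer")
    return value * SECONDS_PER_UNIT[unit]
```

   With `"retentionTimeUnit": "DAYS"` and `"retentionTimeValue": "5"`, segments would be kept for 432000 seconds (5 days) under these assumptions.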





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266703992
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+============
+
+Table Config
+------------
+
+Introduction to table configs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
+* Segment Push - Using ``segmentPushFrequency`` to indicate how frequently 
segments are uploaded.
+* Replication - Using ``replication`` for offline tables and 
``replicasPerPartition`` for realtime tables will indicate how many replicas of 
data will be present.
+* Schema - The name of the schema that's been uploaded to the controller
+* Time column - using ``timeColumnName`` and ``timeType``, this must match 
what's configured in the preceding schema
+* Segment assignment strategy - Described more on the page `Customizing Pinot 
`_
+
+
+.. code-block:: none
+
+"segmentsConfig": {
+  "retentionTimeUnit": "DAYS",
+  "retentionTimeValue": "5",
+  "segmentPushFrequency": "daily",
+  "segmentPushType": "APPEND",
+  "replication": "3",
+  "replicasPerPartition": "3",
+  "schemaName": "ugcGestureEvents",
+  "timeColumnName": "daysSinceEpoch",
+  "timeType": "DAYS",
+  "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
+},
+
+Table Index Config Section
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``tableIndexConfig`` section has information about how to configure:
+
+* Inverted Indexes - Using the ``invertedIndexColumns`` to specify a list of 
real column names as specified in the schema.
+* No Dictionary Columns - Using the ``noDictionaryColumns`` to specify a list 
of real column names as specified in the schema. Column names present will NOT 
have a dictionary created. More info on indexes can be found on the `Index 
Techniques `_ page.
+* Sorted Column - Using the ``sortedColumn`` to specify a list of real column 
names as specified in the schema.
+* Aggregate Metrics - Using ``aggregateMetrics`` set to ``"true"`` to enable 
the feature and ``"false"`` to disable. This feature is only available on 
REALTIME tables.
 
 Review comment:
   Maybe there should eventually be a page for this feature
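   As a starting point for such a page, here is a rough sketch of what I understand the feature to do. It assumes (the hunk does not say this) that rows sharing the same dimension values are rolled up during realtime ingestion by summing their metric columns. `aggregate_rows` and the column names are hypothetical and only illustrate the idea; they are not Pinot code.

```python
from typing import Dict, List, Sequence


def aggregate_rows(
    rows: List[Dict[str, object]],
    dimension_columns: Sequence[str],
    metric_columns: Sequence[str],
) -> List[Dict[str, object]]:
    """Merge rows with identical dimension values by summing their metrics."""
    totals: Dict[tuple, Dict[str, float]] = {}
    for row in rows:
        # Rows are grouped by the tuple of their dimension values.
        key = tuple(row[column] for column in dimension_columns)
        bucket = totals.setdefault(key, {metric: 0 for metric in metric_columns})
        for metric in metric_columns:
            bucket[metric] += row[metric]
    # Rebuild one output row per distinct dimension combination.
    return [
        {**dict(zip(dimension_columns, key)), **metrics}
        for key, metrics in totals.items()
    ]
```

   Under that assumption, two incoming rows for the same country with 1 and 2 clicks would be stored as a single row with 3 clicks.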


[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266703613
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+============
+
+Table Config
+------------
+
+Introduction to table configs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
+* Segment Push - Using ``segmentPushFrequency`` to indicate how frequently 
segments are uploaded.
+* Replication - Using ``replication`` for offline tables and 
``replicasPerPartition`` for realtime tables will indicate how many replicas of 
data will be present.
+* Schema - The name of the schema that's been uploaded to the controller
+* Time column - using ``timeColumnName`` and ``timeType``, this must match 
what's configured in the preceding schema
+* Segment assignment strategy - Described more on the page `Customizing Pinot 
`_
+
+
+.. code-block:: none
+
+"segmentsConfig": {
+  "retentionTimeUnit": "DAYS",
+  "retentionTimeValue": "5",
+  "segmentPushFrequency": "daily",
+  "segmentPushType": "APPEND",
+  "replication": "3",
+  "replicasPerPartition": "3",
+  "schemaName": "ugcGestureEvents",
+  "timeColumnName": "daysSinceEpoch",
+  "timeType": "DAYS",
+  "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
+},
+
+Table Index Config Section
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``tableIndexConfig`` section has information about how to configure:
+
+* Inverted Indexes - Using the ``invertedIndexColumns`` to specify a list of 
real column names as specified in the schema.
+* No Dictionary Columns - Using the ``noDictionaryColumns`` to specify a list 
of real column names as specified in the schema. Column names present will NOT 
have a dictionary created. More info on indexes can be found on the `Index 
Techniques `_ page.
+* Sorted Column - Using the ``sortedColumn`` to specify a list of real column 
names as specified in the schema.
+* Aggregate Metrics - Using ``aggregateMetrics`` set to ``"true"`` to enable 
the feature and ``"false"`` to disable. This feature is only available on 
REALTIME tables.
 
 Review comment:
   What does this do? Just a brief overview, unless there's a page 

[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266703950
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+===
+
+Table Config
+-
+
+Introduction to table configs
+~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
+* Segment Push - Using ``segmentPushFrequency`` to indicate how frequently 
segments are uploaded.
+* Replication - Using ``replication`` for offline tables and 
``replicasPerPartition`` for realtime tables will indicate how many replicas of 
data will be present.
+* Schema - The name of the schema that's been uploaded to the controller
+* Time column - Using ``timeColumnName`` and ``timeType``. These must match 
what's configured in the preceding schema.
+* Segment assignment strategy - Described more on the page `Customizing Pinot 
`_
+
+
+.. code-block:: none
+
+"segmentsConfig": {
+  "retentionTimeUnit": "DAYS",
+  "retentionTimeValue": "5",
+  "segmentPushFrequency": "daily",
+  "segmentPushType": "APPEND",
+  "replication": "3",
+  "replicasPerPartition": "3",
+  "schemaName": "ugcGestureEvents",
+  "timeColumnName": "daysSinceEpoch",
+  "timeType": "DAYS",
+  "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
+},
+
+Table Index Config Section
+~~
+
+The ``tableIndexConfig`` section has information about how to configure:
+
+* Inverted Indexes - Using the ``invertedIndexColumns`` to specify a list of 
real column names as specified in the schema.
+* No Dictionary Columns - Using the ``noDictionaryColumns`` to specify a list 
of real column names as specified in the schema. Column names present will NOT 
have a dictionary created. More info on indexes can be found on the `Index 
Techniques `_ page.
+* Sorted Column - Using the ``sortedColumn`` to specify a list of real column 
names as specified in the schema.
+* Aggregate Metrics - Using ``aggregateMetrics`` set to ``"true"`` to enable 
the feature and ``"false"`` to disable. This feature is only available on 
REALTIME tables.
 
 Review comment:
   You also need all metrics(?) columns in no dictionary columns 
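
As a rough illustration of the constraint mentioned above, here is a minimal sketch of a check that flags metric columns missing from `noDictionaryColumns` when `aggregateMetrics` is enabled. The rule itself is only my tentative understanding (hence the question mark), and the function name and the column names `clicks` and `impressions` are hypothetical, not taken from Pinot.

```python
def metrics_missing_no_dictionary(table_config, metric_columns):
    """Return metric columns that are not listed in noDictionaryColumns.

    Returns an empty list when aggregateMetrics is not enabled.
    """
    index_config = table_config.get("tableIndexConfig", {})
    # The sample config in the doc describes string values such as "true".
    enabled = str(index_config.get("aggregateMetrics", "false")).lower() == "true"
    if not enabled:
        return []
    no_dict = set(index_config.get("noDictionaryColumns", []))
    # Preserve the order of the metric columns as given by the caller.
    return [col for col in metric_columns if col not in no_dict]


# Hypothetical table config and metric column names, for illustration only.
table_config = {
    "tableName": "myPinotTable",
    "tableType": "REALTIME",
    "tableIndexConfig": {
        "aggregateMetrics": "true",
        "noDictionaryColumns": ["clicks"],
    },
}
print(metrics_missing_no_dictionary(table_config, ["clicks", "impressions"]))
```

If the requirement does hold, a sentence in the Aggregate Metrics bullet stating it would save readers from finding out the hard way.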

[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266705683
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+===
+
+Table Config
+-
+
+Introduction to table configs
+~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
 
 Review comment:
   Consider rewording.
   Some of this seems unnecessary? There should probably be a section long 
before this discussing the data model of offline and realtime tables.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266703532
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+===
+
+Table Config
+-
+
+Introduction to table configs
+~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
+* Segment Push - Using ``segmentPushFrequency`` to indicate how frequently 
segments are uploaded.
+* Replication - Using ``replication`` for offline tables and 
``replicasPerPartition`` for realtime tables will indicate how many replicas of 
data will be present.
+* Schema - The name of the schema that's been uploaded to the controller
+* Time column - Using ``timeColumnName`` and ``timeType``. These must match 
what's configured in the preceding schema.
+* Segment assignment strategy - Described more on the page `Customizing Pinot 
`_
+
+
+.. code-block:: none
+
+"segmentsConfig": {
+  "retentionTimeUnit": "DAYS",
+  "retentionTimeValue": "5",
+  "segmentPushFrequency": "daily",
+  "segmentPushType": "APPEND",
+  "replication": "3",
+  "replicasPerPartition": "3",
+  "schemaName": "ugcGestureEvents",
+  "timeColumnName": "daysSinceEpoch",
+  "timeType": "DAYS",
+  "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
+},
+
+Table Index Config Section
+~~
+
+The ``tableIndexConfig`` section has information about how to configure:
+
+* Inverted Indexes - Using the ``invertedIndexColumns`` to specify a list of 
real column names as specified in the schema.
+* No Dictionary Columns - Using the ``noDictionaryColumns`` to specify a list 
of real column names as specified in the schema. Column names present will NOT 
have a dictionary created. More info on indexes can be found on the `Index 
Techniques `_ page.
+* Sorted Column - Using the ``sortedColumn`` to specify a list of real column 
names as specified in the schema.
+* Aggregate Metrics - Using ``aggregateMetrics`` set to ``"true"`` to enable 
the feature and ``"false"`` to disable. This feature is only available on 
REALTIME tables.
+* Data Partitioning Strategy using the ``segmentPartitionConfig`` to configure 

[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266704733
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+===
+
+Table Config
+-
+
+Introduction to table configs
+~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
 
 Review comment:
   Consider including recommended values where it makes sense? Could go either 
way on this though...





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266704620
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+===
+
+Table Config
+-
+
+Introduction to table configs
+~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
+* Segment Push - Using ``segmentPushFrequency`` to indicate how frequently 
segments are uploaded.
+* Replication - Using ``replication`` for offline tables and 
``replicasPerPartition`` for realtime tables will indicate how many replicas of 
data will be present.
+* Schema - The name of the schema that's been uploaded to the controller
+* Time column - Using ``timeColumnName`` and ``timeType``. These must match 
what's configured in the preceding schema.
+* Segment assignment strategy - Described more on the page `Customizing Pinot 
`_
+
+
+.. code-block:: none
+
+"segmentsConfig": {
+  "retentionTimeUnit": "DAYS",
+  "retentionTimeValue": "5",
+  "segmentPushFrequency": "daily",
+  "segmentPushType": "APPEND",
+  "replication": "3",
+  "replicasPerPartition": "3",
+  "schemaName": "ugcGestureEvents",
+  "timeColumnName": "daysSinceEpoch",
+  "timeType": "DAYS",
+  "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
+},
+
+Table Index Config Section
+~~
+
+The ``tableIndexConfig`` section has information about how to configure:
+
+* Inverted Indexes - Using the ``invertedIndexColumns`` to specify a list of 
real column names as specified in the schema.
+* No Dictionary Columns - Using the ``noDictionaryColumns`` to specify a list 
of real column names as specified in the schema. Column names present will NOT 
have a dictionary created. More info on indexes can be found on the `Index 
Techniques `_ page.
+* Sorted Column - Using the ``sortedColumn`` to specify a list of real column 
names as specified in the schema.
+* Aggregate Metrics - Using ``aggregateMetrics`` set to ``"true"`` to enable 
the feature and ``"false"`` to disable. This feature is only available on 
REALTIME tables.
+* Data Partitioning Strategy using the ``segmentPartitionConfig`` to configure 

[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266704156
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+===
+
+Table Config
+-
+
+Introduction to table configs
+~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+In this section a sample table configuration will be shown and all sections 
will be explained and if applicable have appropriate sections linked to for 
further explanation of those corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alpha-numeric characters, hyphens ('-'), 
or underscores ('_'). Though using a double-underscore ('__') is not allowed 
and reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~
+
+The ``segmentsConfig`` section has information about configuring
+
+* Segment Retention - with the  ``retentionTimeUnit`` and 
``retentionTimeValue`` options.
+* Segment Push - Using ``segmentPushFrequency`` to indicate how frequently 
segments are uploaded.
+* Replication - Using ``replication`` for offline tables and 
``replicasPerPartition`` for realtime tables will indicate how many replicas of 
data will be present.
+* Schema - The name of the schema that's been uploaded to the controller
+* Time column - Using ``timeColumnName`` and ``timeType``. These must match 
what's configured in the preceding schema.
+* Segment assignment strategy - Described more on the page `Customizing Pinot 
`_
+
+
+.. code-block:: none
+
+"segmentsConfig": {
+  "retentionTimeUnit": "DAYS",
+  "retentionTimeValue": "5",
+  "segmentPushFrequency": "daily",
+  "segmentPushType": "APPEND",
+  "replication": "3",
+  "replicasPerPartition": "3",
+  "schemaName": "ugcGestureEvents",
+  "timeColumnName": "daysSinceEpoch",
+  "timeType": "DAYS",
+  "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
+},
+
+Table Index Config Section
+~~
+
+The ``tableIndexConfig`` section has information about how to configure:
+
+* Inverted Indexes - Using the ``invertedIndexColumns`` to specify a list of 
real column names as specified in the schema.
+* No Dictionary Columns - Using the ``noDictionaryColumns`` to specify a list 
of real column names as specified in the schema. Column names present will NOT 
have a dictionary created. More info on indexes can be found on the `Index 
Techniques `_ page.
+* Sorted Column - Using the ``sortedColumn`` to specify a list of real column 
names as specified in the schema.
+* Aggregate Metrics - Using ``aggregateMetrics`` set to ``"true"`` to enable 
the feature and ``"false"`` to disable. This feature is only available on 
REALTIME tables.
+* Data Partitioning Strategy using the ``segmentPartitionConfig`` to configure 

[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266701869
 
 

 ##
 File path: docs/in_production.rst
 ##
 @@ -40,11 +40,14 @@ Recommended environment
 Deploying Pinot
 ---
 
-Direct deployment of Pinot
-~~
+In general, when deploying Pinot services, it is best to adhere to a specific 
ordering in which the various components should be deployed. This deployment 
order is recommended in case there are protocol or other significant 
differences, so that the deployments go out in a predictable order in which 
failures due to these changes can be avoided.
 
 Review comment:
   Consider wrapping long lines at 79 characters.
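
One way to do the re-wrapping, as a minimal sketch using Python's standard `textwrap` module. The `paragraph` variable is just a stand-in holding part of the text quoted above; this is not a tool the project itself uses.

```python
import textwrap

# Stand-in for a long paragraph taken from the .rst source.
paragraph = (
    "In general, when deploying Pinot services, it is best to adhere to a "
    "specific ordering in which the various components should be deployed."
)

# Re-flow the paragraph so that no output line is longer than 79 characters.
wrapped = textwrap.fill(paragraph, width=79)
print(wrapped)
```

Most editors can also hard-wrap a paragraph at a fixed column, which is probably easier for a one-off pass over the file.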





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266699836
 
 

 ##
 File path: docs/in_production.rst
 ##
 @@ -64,4 +67,32 @@ Configuring realtime data ingestion
 Monitoring Pinot
 
 
+In order for Pinot to provide effective service there is a core set of metrics 
which should be monitored to ensure service stability, fault tolerance and 
acceptable response times. In the section following, there are service level 
metrics which are recommended to be monitored.
+
+More info on metrics collection, and a complete set of available metrics, 
is available in the `Metrics `_ section.
+
+Pinot Server
+
+* Missing Segments - Number of missing segments - `NUM_MISSING_SEGMENTS 
`_
 
 Review comment:
   This explanation is confusing.
   
   The metric is the number of segments that the broker queried for (and 
expected to be on the server) but that the server did not have. This can be 
caused by retention or by a stale routing table.





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266702135
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+
+Table Config
+===
+
+Table Config
+-
+
+Introduction to table configs
 
 Review comment:
   ```suggestion
   Introduction
   ```





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266699635
 
 

 ##
 File path: docs/in_production.rst
 ##
 @@ -64,4 +67,32 @@ Configuring realtime data ingestion
 Monitoring Pinot
 
 
+In order for Pinot to provide effective service there is a core set of metrics 
which should be monitored to ensure service stability, fault tolerance and 
acceptable response times. In the section following, there are service level 
metrics which are recommended to be monitored.
+
+More info on metrics collection, and a complete set of available metrics, 
is available in the `Metrics `_ section.
+
+Pinot Server
 
 Review comment:
   Probably want to use subheadings





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266702485
 
 

 ##
 File path: docs/tableconfig_schema.rst
 ##
 @@ -0,0 +1,172 @@
+..
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..   http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+..
+
+Table Config
+============
+
+Table Config
+------------
+
+Introduction to table configs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Using tables is how Pinot serves and organizes data. There are many settings 
in the table config which will influence how Pinot operates. The first and most 
significant distinction is using an offline versus a realtime table.
+
+An offline table in Pinot is used to host data which might be periodically 
uploaded - daily, weekly, etc. A realtime table, however, is used to consume 
data from incoming data streams and serve this data in a near-realtime manner. 
This might also be referred to as nearline or just plain 'realtime'.
+
+This section shows a sample table configuration and explains each of its 
sub-sections, linking where applicable to further explanation of the 
corresponding Pinot features.
+
+Sample table config and descriptions
+
+
+A sample table config is shown below with its sub-sections collapsed. The 
sub-sections are described individually in the following sections.
+
+The ``tableName`` should only contain alphanumeric characters, hyphens ('-'), 
or underscores ('_'). A double underscore ('__') is not allowed, because it is 
reserved for other features within Pinot.
+
+The ``tableType`` will indicate the type of the table, ``OFFLINE`` or 
``REALTIME``. There are some settings specific to each type. This 
differentiation will be called out below as options are explained.
+
+.. code-block:: none
+
+{
+  "tableName": "myPinotTable",
+  "tableType": "REALTIME",
+  "segmentsConfig": {},
+  "tableIndexConfig": {},
+  "tenants": {},
+  "routing": {},
+  "task": {},
+  "metadata": {}
+}
+
+Segments Config Section
+~~~
+
+The ``segmentsConfig`` section has information about configuring
 
 Review comment:
   configuring...?
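
As a side illustration of the ``tableName`` rule in the quoted diff (alphanumeric characters, hyphens, and underscores only, with no double underscore), here is a minimal Python sketch. The helper name `is_valid_table_name` is hypothetical and not part of Pinot, and Pinot's own validation may well differ; the sketch only encodes the rule as the documentation text states it.

```python
import re

# Assumption: the rule is exactly what the quoted doc text says, i.e.
# ASCII letters, digits, '-' and '_' only, with '__' disallowed.
# This is an illustrative helper, not Pinot's actual validation code.
_ALLOWED_CHARS = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_table_name(name: str) -> bool:
    """Return True if `name` follows the documented tableName rule."""
    if not _ALLOWED_CHARS.fullmatch(name):
        # Empty names or names with any other character are rejected.
        return False
    # A double underscore is described as reserved for other features.
    return "__" not in name
```

Under these assumptions, `myPinotTable` from the sample config passes, while a name such as `my__table` is rejected.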





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266701369
 
 

 ##
 File path: docs/in_production.rst
 ##
 @@ -64,4 +67,32 @@ Configuring realtime data ingestion
 Monitoring Pinot
 
 
+In order for Pinot to provide effective service there is a core set of metrics 
which should be monitored to ensure service stability, fault tolerance and 
acceptable response times. In the section following, there are service level 
metrics which are recommended to be monitored.
+
+More information on metrics collection, and the complete set of available 
metrics, is available in the `Metrics `_ section.
+
+Pinot Server
+
+* Missing Segments - Number of missing segments - `NUM_MISSING_SEGMENTS 
`_
 
 Review comment:
   Also consider ordering metrics by importance.





[GitHub] [incubator-pinot] dinoocch commented on a change in pull request #3975: ReadTheDocs documentation for Table Configs, Monitoring, and Deployment

2019-03-18 Thread GitBox
dinoocch commented on a change in pull request #3975: ReadTheDocs documentation 
for Table Configs, Monitoring, and Deployment
URL: https://github.com/apache/incubator-pinot/pull/3975#discussion_r266700385
 
 

 ##
 File path: docs/in_production.rst
 ##
 @@ -64,4 +67,32 @@ Configuring realtime data ingestion
 Monitoring Pinot
 
 
+In order for Pinot to provide effective service there is a core set of metrics 
which should be monitored to ensure service stability, fault tolerance and 
acceptable response times. In the section following, there are service level 
metrics which are recommended to be monitored.
+
+More information on metrics collection, and the complete set of available 
metrics, is available in the `Metrics `_ section.
+
+Pinot Server
+
+* Missing Segments - Number of missing segments - `NUM_MISSING_SEGMENTS 
`_
 
 Review comment:
   Consider format
   ```
   * : 
   
   * 
   * 
   ```

