<h3><u>#general</u></h3><br><strong>@yash.agarwal: </strong>Is it possible to
use multiple buckets for S3PinotFs? We have limitations on the amount of data we can store in a single bucket.<br><strong>@mayanks: </strong>@kharekartik
^^<br><strong>@kharekartik: </strong>Hi @yash.agarwal, currently it is not
possible. Let me take a look into what can be done<br><strong>@g.kishore:
</strong>@yash.agarwal what kind of limitation do you
have<br><strong>@yash.agarwal: </strong>@g.kishore We have our buckets limited
to 1TB and 2 million objects, and we are looking to deploy a cluster well over
50TB.<br><strong>@g.kishore: </strong>got it, let me see how we can support multiple buckets.<br><strong>@yash.agarwal: </strong>Sure. Do let me know if I
can do anything to help :slightly_smiling_face:.<br><strong>@g.kishore:
</strong>would love to get your help, created
<#C016ZKW1EPK|s3-multiple-buckets><br><h3><u>#troubleshooting</u></h3><br><strong>@somanshu.jindal:
</strong>Hi, if I want to use a ZooKeeper cluster for a production setup, can I specify all the ZooKeeper hosts when starting the various Pinot components like controller, broker, etc.?
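A minimal sketch (not from the thread) of pointing each Pinot component at a multi-host ZooKeeper quorum via `pinot-admin.sh`; the host names and cluster name are placeholders:
```
# Pass the full ZooKeeper quorum as a comma-separated -zkAddress to every component
bin/pinot-admin.sh StartController -zkAddress zk1:2181,zk2:2181,zk3:2181 -clusterName PinotCluster
bin/pinot-admin.sh StartBroker -zkAddress zk1:2181,zk2:2181,zk3:2181 -clusterName PinotCluster
bin/pinot-admin.sh StartServer -zkAddress zk1:2181,zk2:2181,zk3:2181 -clusterName PinotCluster
```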
<br><strong>@yash.agarwal: </strong>@yash.agarwal has joined the channel<br><strong>@somanshu.jindal: </strong>I need help with hardware requirements for the various components like cores, memory, etc. Also, which components are memory intensive, IO intensive, CPU intensive, etc.? Currently I am thinking of:
• Controller - 2
• Broker - 2
• Servers - 3 (for realtime ingestion)
• Zookeeper (should i go with standalone or cluster?)
As far as I know, segments are stored on the servers and the controller (segment store), right?<br><strong>@yash.agarwal: </strong>Is it possible to use
multiple buckets for S3PinotFs? We have limitations on the amount of data we
can store in a single bucket.<br><strong>@g.kishore: </strong>@somanshu.jindal
For prod, here is a good setup
```controller
- min 2 (for fault tolerance), ideally 3
- 4 core, 4 GB (disk space should be sufficient for logs and temp segments) - 100 GB
Broker
- min 2, add more nodes later as needed to scale
- 4 core, 4 GB (disk space should be sufficient for logs) - 10 GB min
Zookeeper (cluster mode)
- min 3 (this is where the entire cluster state is stored)
- 4 core, 4 GB, disk space sufficient to store logs, transaction logs and snapshots. If you can afford it, go with SSD; if not, plain disk will be fine. 100 GB
Pinot server
- min 2 (this is where the segments will be stored), you can add more servers anytime without downtime
- 8 core, 16 GB, SSD boxes (pick any size that works for your use case: 500 GB to 2 TB or even more)
- If you are running on cloud, you can use mounted SSD instead of local SSD```<br><strong>@pyne.suvodeep: </strong>@pyne.suvodeep has joined the
channel<br><strong>@pradeepgv42: </strong>QQ, wondering how difficult would it
be to include timestampNanos as part of the time column in pinot?
(is it just a matter of pinot parsing and understanding that the timestamp is in nanos, or are there more assumptions around it?)
I believe currently only up to `millis` is supported. Context: we have system-level events (think a stream of syscalls) and want to be able to store the nanos timestamp to fix the order among them; it’s also used by other systems in our infrastructure.
Currently I am storing the nanos value as a separate column and created a `millis` column to serve as the time column, wondering if I can avoid storing the additional duplicate info if the feature is simple enough to add?
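A hedged sketch of the derived-column workaround described above, assuming a Pinot version that supports ingestion transform configs; the column names `event_ts_nanos` and `event_ts_millis` are illustrative:
```
"ingestionConfig": {
  "transformConfigs": [{
    "columnName": "event_ts_millis",
    "transformFunction": "Groovy({event_ts_nanos.intdiv(1000000)}, event_ts_nanos)"
  }]
}
```
Here `event_ts_millis` would serve as the time column, while the raw nanos column is kept for exact ordering.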
<br><strong>@g.kishore: </strong>IMO, nanos cannot be used as a timestamp<br><strong>@g.kishore:
</strong>irrespective of Pinot supporting that datatype<br><strong>@g.kishore:
</strong>nanos is mainly used to measure relative
times<br><strong>@elon.azoulay: </strong>FYI, we have a table which already
exists and I wanted to add a sorted column index, but I'm getting "bad request 400".
Nothing in the controller logs. Can you see what's wrong with the
following?<br><strong>@elon.azoulay: </strong>```curl -f -k -X POST --header
'Content-Type: application/json' -d '@realtime.json'
${CONTROLLER}/tables```<br><strong>@elon.azoulay: </strong>```{
"tableName": "oas_integration_operation_event",
"tableType": "REALTIME",
"segmentsConfig": {
"timeColumnName": "operation_ts",
"timeType": "SECONDS",
"retentionTimeUnit": "DAYS",
"retentionTimeValue": "7",
"segmentPushType": "APPEND",
"segmentPushFrequency": "daily",
"segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
"schemaName": "oas_integration_operation_event",
"replicasPerPartition": "3",
"timeType": "SECONDS"
},
"tenants": {
"broker": "DefaultTenant",
"server": "DefaultTenant"
},
"tableIndexConfig": {
"loadMode": "MMAP",
"invertedIndexColumns": [ "service_slug", "operation_type",
"operation_result", "store_id"],
"sortedColumn": ["operation_ts"],
"noDictionaryColumns": [],
"aggregateMetrics": "false",
"streamConfigs": {
"streamType": "kafka",
"stream.kafka.consumer.type": "LowLevel",
"stream.kafka.topic.name":
"oas-integration-operation-completion-avro",
"stream.kafka.decoder.class.name":
"org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder",
"stream.kafka.consumer.factory.class.name":
"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.decoder.prop.schema.registry.rest.url":
"<https://u17000708.ct.sendgrid.net/ls/click?upn=iSrCRfgZvz-2BV64a3Rv7HYVQ6HO-2FNd3WXo8sCVuFwfT0-3DT4te_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTxq8wcUqUF3xaBwWeV07JNVyRGy4jlW5PJgT6jQqbHf3TPoY-2FqgmxDrNxIDcaah2om0KvbgMcFLGXrE8ZfpBNvOa9cIJododz1I6dFs45CFYTkxvtRRBjmslWphjLH4q6H1lFMXjU7Oa0hAjVJFMuO-2BC0ULgQjrczkzjbMYZ8ac8tFMZprfJvJ5lZlXAH5d4-2FE-3D>",
"stream.kafka.zk.broker.url": "XXXX/",
"stream.kafka.broker.list": "XXXX:9092",
"realtime.segment.flush.threshold.time": "6h",
"realtime.segment.flush.threshold.size": "0",
"realtime.segment.flush.desired.size": "200M",
"stream.kafka.consumer.prop.auto.isolation.level": "read_committed",
"stream.kafka.consumer.prop.auto.offset.reset": "smallest",
"stream.kafka.consumer.prop.group.id":
"oas_integration_operation_event-load-pinot-llprb",
"stream.kafka.consumer.prop.client.id": "XXXX"
},
"starTreeIndexConfigs": [{ "dimensionsSplitOrder": [ "service_slug",
"store_id", "operation_type", "operation_result" ], "functionColumnPairs": [
"PERCENTILEEST__operation_latency_ms", "AVG__operation_latency_ms",
"DISTINCTCOUNT__store_id", "COUNT__store_id", "COUNT__operation_type" ] }, {
"dimensionsSplitOrder": [ "service_slug", "store_id" ], "functionColumnPairs":
[ "COUNT__store_id", "COUNT__operation_type" ] }]
},
"metadata": {
"customConfigs": {}
}
}```<br><strong>@mayanks: </strong>IIRC, uploading segments to realtime tables
was not possible (a while back, but not sure if it continues to be the
case).<br><strong>@elon.azoulay: </strong>This is just updating the spec for
the table<br><strong>@mayanks: </strong>can you try
swagger?<br><strong>@elon.azoulay: </strong>Sure<br><strong>@elon.azoulay:
</strong>Oh, thanks! Looks like I can't change the time type for the time
column, i.e. segmentsConfig.timeType<br><strong>@mayanks: </strong>Makes sense, that could be backward incompatible.
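As an aside on the 400 above: the controller's `POST /tables` endpoint creates a table, while updates to an existing table's config generally go through a PUT on `/tables/{tableName}`. A minimal sketch reusing the same payload file:
```
# Update (rather than create) the existing table's config
curl -f -k -X PUT --header 'Content-Type: application/json' -d '@realtime.json' ${CONTROLLER}/tables/oas_integration_operation_event
```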
<br><h3><u>#presto-pinot-streaming</u></h3><br><strong>@elon.azoulay: </strong>Here's a link to the design doc:
<https://u17000708.ct.sendgrid.net/ls/click?upn=1BiFF0-2FtVRazUn1cLzaiMc9VK8AZw4xfCWnhVjqO8F2yNvEnb3JHma9TbSCfyfAx-2FVOn7Bt885qSK47uf3MFF-2FhL8qplE-2FLYisjbzJXY-2FUB7YCnAiPcrkdz5y054MsHzlsZTBtUMD-2BcUlK45ORI42w-3D-3DdUo8_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTxq8wcUqUF3xaBwWeV07JNVh4mtLbu51UvID-2BpIVeVfHHAkz-2BQGywKBCG-2BuczerYFmfsSw-2BaUhWdf5KrlyQBpdgjghNzrbFX8rvY73d4ST7SlokoYDYRdCoOTGb1ArYbbIkXTayr2aC97n0VXZH4chsCkI8vMD05ZPq-2FvzlmlID-2FWYWayA-2FwE2RKIfyz6P47zs-3D><br><strong>@g.kishore:
</strong>@jackie.jxt can you please take a look at
this?<br><strong>@jackie.jxt: </strong>Sure<br><strong>@g.kishore:
</strong>@elon.azoulay need access<br><strong>@elon.azoulay: </strong>Try this
one:
<https://u17000708.ct.sendgrid.net/ls/click?upn=1BiFF0-2FtVRazUn1cLzaiMc9VK8AZw4xfCWnhVjqO8F2yNvEnb3JHma9TbSCfyfAx-2FVOn7Bt885qSK47uf3MFF-2FhL8qplE-2FLYisjbzJXY-2FUB7YCnAiPcrkdz5y054MsHzlsZTBtUMD-2BcUlK45ORI42w-3D-3D32ZZ_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTxq8wcUqUF3xaBwWeV07JNVquxchxi3QlvwYIA1-2FNYdsWIcFvbIHp6nKWfN04ATBV0yJvPGfj63ENLE4TNmKIg-2BcbJT6F3swY6J8adylMAjX7HFQOXlImxxHKo7cX7oqBOq-2BDPxsm1a5e4fBK7n4PpmlT6r4qZMmM16VR4YCnDU4w0ygo9mC2b-2BJwiMNVWoK98-3D><br><strong>@g.kishore:
</strong>can you write a few sentences on why we need this and what's the current design?<br><strong>@g.kishore:
</strong><https://u17000708.ct.sendgrid.net/ls/click?upn=1BiFF0-2FtVRazUn1cLzaiMWR9hf84-2BEJYpip6YlEfWjHMb3DE3DtTnj4lc7ywiNxn8nE0KD6t23Jqnbnkq1-2Fazw-3D-3D_TmA_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTxq8wcUqUF3xaBwWeV07JNVzSBoU3zH4HjRuVheDvC3EgsKYdEk1Y6sJnY9wsmnoKBRjducBzXmsKfeziONk-2BOyIWDjmSFdd1orV6HvzPyxRynSRgZCN5CvD8J3b1YDJphT3Nc3t10nYBybTrYtMgwY6TWsi-2B0Dtu-2Fmo7DmxIVTnkAUvz5OUTUwdy7ZFd9iAvI-3D><br><strong>@g.kishore:
</strong>use this diagram<br><strong>@g.kishore:
</strong><br><strong>@g.kishore: </strong>today we are in unary
streaming<br><strong>@g.kishore: </strong>and we want to move to server
streaming<br><strong>@g.kishore: </strong>advantages
• less memory pressure on pinot server<br><strong>@g.kishore: </strong>• presto
workers can start working as soon as chunks arrive<br><strong>@elon.azoulay:
</strong>Sure<br><h3><u>#s3-multiple-buckets</u></h3><br><strong>@g.kishore:
</strong>@g.kishore has joined the channel<br><strong>@yash.agarwal:
</strong>@yash.agarwal has joined the channel<br><strong>@kharekartik:
</strong>@kharekartik has joined the channel<br><strong>@singalravi:
</strong>@singalravi has joined the channel<br><strong>@kharekartik:
</strong>@g.kishore Is there support for multiple directories for FS? If yes, we can extend that to multiple buckets.<br><strong>@kharekartik:
</strong>@yash.agarwal How do you want to split data across
buckets?<br><strong>@g.kishore: </strong>@kharekartik No, I was thinking if
users can provide a list of subFolders/s3buckets, we can pick one randomly or
hash it based on segment name<br><strong>@kharekartik: </strong>Randomly at the
time of creating the segments?<br><strong>@kharekartik: </strong>Wouldn't that
disrupt the query execution?<br><strong>@g.kishore: </strong>no, we just store
the uri along with segment metadata in ZK<br><strong>@g.kishore: </strong>it
can point to anything<br><strong>@g.kishore: </strong>actually, this is a
problem only with real-time where we create the URI<br><strong>@g.kishore:
</strong>with batch ingestion, user can provide any
URI<br><strong>@yash.agarwal: </strong>We don’t have any specific requirement around how to split data across buckets.<br><strong>@pradeepgv42:
</strong>@pradeepgv42 has joined the channel<br><strong>@kharekartik:
</strong>Ok. Then I believe the change needs to be done in the handling of the ingestion config, and then picking a random directory while creating segments. The S3 filesystem implementation won't need any change unless the buckets are located in different regions.<br><strong>@yash.agarwal: </strong>all the buckets are co-located.<br><strong>@g.kishore: </strong>Yash, is this realtime or
offline<br><strong>@yash.agarwal: </strong>Right now it is only
offline.<br><strong>@g.kishore: </strong>then you don't need anything for now<br><strong>@g.kishore: </strong>I am guessing you will use the ingestion-job to generate the segments
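A rough illustration of that batch path, under the assumption that each offline ingestion job spec sets its own `outputDirURI` (the spec file names and bucket split are hypothetical):
```
# Hypothetical: one job spec per target bucket, each pointing outputDirURI at a different s3:// location
bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /specs/events-bucket1.yaml
bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /specs/events-bucket2.yaml
```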
<br><strong>@vallamsetty: </strong>@vallamsetty has joined the channel<br><strong>@yash.agarwal:
</strong>Yeah I realised that too. I am very new to this so sorry for any
troubles :slightly_smiling_face:<br><strong>@g.kishore: </strong>no worries, this is a good feature to have. If you don't mind, can you create an issue?<br>