#general


@xiangfu0: Just fyi, I will prepare the 0.9.0 release candidate based on commit .
@nicholas.nezis: @nicholas.nezis has joined the channel
@karinwolok1: Live in 5 min if anyone wants to join (Intro level on Apache Pinot)  
@bowenzhu: @bowenzhu has joined the channel
@brandon: @brandon has joined the channel

#random


@nicholas.nezis: @nicholas.nezis has joined the channel
@bowenzhu: @bowenzhu has joined the channel
@brandon: @brandon has joined the channel

#troubleshooting


@alihaydar.atil: Hey everyone, I have managed to create a hybrid table. I have a few questions on the subject. •Since segments are transferred to the offline table periodically, is it a correct assumption that I don't need those transferred realtime segments to be hosted on servers? •If so, is it recommended to clean up those transferred segments, and what is the correct way to clean them up? What comes to mind is setting the retentionTimeUnit and retentionTimeValue properties in the realtime table configuration. Does Pinot have a built-in cleanup mechanism for hybrid tables? Thanks in advance
  @npawar: Setting retention is the right way
  @npawar: Pinot has a periodic task which will clean up segments from the table that are older than the retention time
  @alihaydar.atil: @npawar thank you :pray::skin-tone-2:
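  For reference, setting retention as suggested would go in the segmentsConfig section of the realtime table config. A minimal sketch (the table name and retention values here are illustrative, not recommendations):

```json
{
  "tableName": "myTable_REALTIME",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "retentionTimeUnit": "DAYS",
    "retentionTimeValue": "5"
  }
}
```

  With this in place, Pinot's periodic retention task deletes realtime segments older than 5 days, which covers the segments already moved to the offline table.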
@nicholas.nezis: @nicholas.nezis has joined the channel
@tony: We have a Pinot / Kubernetes deployment with 6 controller pods. We are seeing high CPU on one controller, very low on the others. Restarting pods does not change this behavior. Our Pinot is now primarily ingesting one fairly high volume Kafka stream with 128 partitions. Is this expected?
  @xiangfu0: It's expected, I think. The Pinot controller doesn't do the actual ingestion work but handles the management of segment assignment; you should expect the server side to ingest data. Also, a lot of the heavy lifting is done by the lead controller, which is elected automatically. Typically we don't run more than 3 controllers.
  @tony: Thanks. I did find that this is the leader. I will switch to 3 controllers but make them larger.
  @xiangfu0: :thumbsup:
  @ssubrama: How many tables do you have? If you have many tables, the load will be divided roughly equally amongst the controllers for each table. If you have only one table, then one of the controllers will be doing all the work, of course.
  @ssubrama: Also, think of the work in the controller as "meta" work. So, it is roughly proportional to the frequency at which new segments are added/deleted. And then there are periodic jobs that are roughly proportional to the _number_ of segments you have in tables (data size does not matter). It is useful to check whether your realtime tables are creating segments too frequently.
  @ssubrama: @tony ^^
  @tony: Currently we have essentially one table (2 tables, but one is 98% of the volume), but that will change over time. So as we add more tables, we should see load distributed more evenly over the controllers.
@bowenzhu: @bowenzhu has joined the channel
@brandon: @brandon has joined the channel

#pinot-dev


@dunithd: @dunithd has joined the channel
@xiangfu0: Just fyi, I will prepare the 0.9.0 release candidate based on commit .

#announcements


@dunithd: @dunithd has joined the channel

#presto-pinot-connector


@nakkul: @nakkul has joined the channel

#pinot-perf-tuning


@tony: Question about server disk size - do server nodes need enough disk space to store all segments? Or will segments get dropped from local disk and re-read from deep storage as needed if the disk gets full?
@g.kishore: it needs enough disk space to store all the segments assigned to it
@tony: Thanks. So deep storage is just a backup. Is this a use case that's meant to be addressed? We have an AWS/EKS deployment and our cost is driven by server storage (EBS); it would be ideal to have older data in S3
@ssubrama: @tony perhaps you are looking for a solution being worked on in this issue:
@ssubrama: Tiered storage just moves some segments to a different set of servers, but those servers now need to have enough storage to host these.
@ssubrama: Even in the issue that I mention, it is expected that the storage use temporarily bumps up on the servers, and then reclaimed when the segments "age". Pinot does not handle the case of serving data from segments that cannot be stored on servers.
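As context on the tiered storage option mentioned above: segment movement between server groups is driven by a tierConfigs section in the table config. A hedged sketch (the tier name, age threshold, and server tag below are illustrative; the servers tagged for the cold tier still need enough local disk for the segments they host, as noted above):

```json
"tierConfigs": [
  {
    "name": "coldTier",
    "segmentSelectorType": "time",
    "segmentAge": "30d",
    "storageType": "pinot_server",
    "serverTag": "coldServers_OFFLINE"
  }
]
```

Segments older than 30 days would be relocated to servers tagged coldServers_OFFLINE, which could run on cheaper storage, but they are still fully stored on those servers rather than served directly from S3.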

#getting-started


@dunithd: @dunithd has joined the channel
@aaron.weiss: @aaron.weiss has joined the channel

#releases


@dunithd: @dunithd has joined the channel