#general
@humengyuk18: What kind of index should I use if I have datetime string column to enable faster ranged query? Will a ranged index help?
@fx19880617: range index only works for numbers.
@fx19880617: for a date string, you can try adding a separate column that converts the date string to a seconds or days value
@fx19880617: also, if you have the data partitioned by date across segments, then the range query should mostly hit only the relevant segments
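A minimal sketch of that derived-column idea in the table config's ingestion transforms (assuming the raw string column is `ds` with a `yyyy-MM-dd HH:mm:ss` pattern; the derived columns `ts_seconds` / `ts_days` are hypothetical names, and exact transform-function syntax may vary by Pinot version):
```json
{
  "ingestionConfig": {
    "transformConfigs": [
      { "columnName": "ts_seconds", "transformFunction": "fromDateTime(ds, 'yyyy-MM-dd HH:mm:ss') / 1000" },
      { "columnName": "ts_days", "transformFunction": "fromDateTime(ds, 'yyyy-MM-dd HH:mm:ss') / 86400000" }
    ]
  }
}
```
With a numeric column like this, range filtering and partition/sort-based segment pruning become possible.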
@humengyuk18: I see. But the problem for our use case is that the Superset time column can only be datetime; that’s why we converted epoch_ms to datetime during ingestion.
@fx19880617: oh?
@fx19880617: I thought superset can have epoch_ms and epoch_s?
@fx19880617: something like:
@humengyuk18: For a temporal column, it has to be datetime or datetime-like.
@humengyuk18: I guess a long type can’t work as temporal?
@fx19880617: you can check the `is temporal` checkbox
@fx19880617: this is one example
@fx19880617: this is the query superset generated:
@humengyuk18: It works. The description in Superset is a little misleading; I thought it had to be a datetime type.
@humengyuk18: Why is using an epoch_ms time column much slower than a datetime string column?
@humengyuk18: The same Superset query took 3675 ms using the epoch_ms column, but only 575 ms using a datetime string column.
@fx19880617: hmm, do you have the query generated for the graph?
@fx19880617: it’s possible that the group-by transformation makes the computation complicated
@humengyuk18: The only difference for the query is the time filter.
@humengyuk18: ```SELECT count(DISTINCT device_sn) AS count_1 FROM pad_interaction WHERE "time" >= 1620705960000 AND "time" < 1620706560000 AND company_name = 'xxx' AND action_value = 'mqtt_ping' AND action_message = 'success' AND ((regexp_like(device_sn, '^WAL.*'))) ORDER BY count(DISTINCT device_sn) DESC;```
@fx19880617: oh? can you paste the queries?
@humengyuk18: ```SELECT count(DISTINCT device_sn) AS count_1 FROM pad_interaction WHERE ds >= '2021-05-11 12:03:00' AND ds < '2021-05-11 12:13:00' AND company_name = 'xxx' AND action_value = 'mqtt_ping' AND action_message = 'success' AND ((regexp_like(device_sn, '^WAL.*'))) ORDER BY count(DISTINCT device_sn) DESC```
@fx19880617: hmm
@fx19880617: this query seems to only hit the most recent data
@fx19880617: it’s weird that there is such a big difference
@humengyuk18: The numEntriesScannedInFilter difference is huge: the datetime string column has only 9370339 entries scanned, while epoch_ms has 262577613 entries scanned.
@fx19880617: oh, because it’s ms not seconds
@fx19880617: then adding a column with the epoch value in seconds should help
@fx19880617: also,
@fx19880617: do you have any sorted index?
@humengyuk18: No, no index is configured.
@fx19880617: I see, then adding an inverted index on the columns used in filtering may help as well
@humengyuk18: Thanks, I will add some inverted indexes.
@g.kishore: Why not range?
@humengyuk18: Can a range index be used for string columns? I didn’t see any mention of that in the documentation.
@fx19880617: I feel a sorted index on time and inverted indexes on company_name, action_value, action_message may help
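A hedged sketch of those index settings in the table config (column names taken from the queries above; note that a sorted column takes effect at segment generation/ingestion time, so existing segments may need to be regenerated or reloaded):
```json
{
  "tableIndexConfig": {
    "sortedColumn": ["time"],
    "invertedIndexColumns": ["company_name", "action_value", "action_message"]
  }
}
```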
@humengyuk18: Can an inverted index be applied to multi-value columns?
@fx19880617: inverted index works for multi-value columns
@david.gyoker: @david.gyoker has joined the channel
@pedro.cls93: Hello, does Pinot support defining a computed field (metric) based on a field that does not appear in the schema but exists in the ingestion message? This is a realtime table, if that makes a difference.
@npawar: Yes this is supported
@pedro.cls93: So the transformationFunction can receive as a parameter a field that does not exist in the schema? Is there documentation on this use-case?
@npawar: Yes. If you have a function like bar="reverse(foo)", where bar is in the Pinot schema, but foo is not in the Pinot schema, it will work if foo is present in the source record. Pinot will look for all arguments to the function in the row
@npawar: Dunno if this is specifically documented, there might be a mention on the page with the transform functions documentation
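For illustration, a minimal sketch of that reverse(foo) example in the table's ingestion config (bar exists in the Pinot schema, foo only in the source record; some versions also accept a transformFunction on the schema field spec):
```json
{
  "ingestionConfig": {
    "transformConfigs": [
      { "columnName": "bar", "transformFunction": "reverse(foo)" }
    ]
  }
}
```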
@pedro.cls93: What are the consequences of defining a dimension as a String of length Integer.MaxSize and creating a json index based on it? What hardware considerations need to be made if any? What is the performance impact?
@bigdatalabsg: @bigdatalabsg has joined the channel
@kevin.smith: @kevin.smith has joined the channel
@aiyer: Hi Team -- a basic question: if we use S3 for storing the segments, how does Pinot take care of query latency, since there will be a network call involved?
@pedro.cls93: Someone with more experience can chime in but Pinot controller & server instances will have a local copy of the segments to handle queries.
@pedro.cls93: See this thread
@g.kishore: Only Pinot server will have the local copy, not controller
@aiyer: ok.. so does it mean the cluster must have enough memory combined to host all the data that is getting ingested?
@aiyer: and that would keep growing in production, so how do we generally scale that? Should we be looking at scaling up the number of servers as traffic and data grow?
@g.kishore: No, all the data is stored on disk and mmaped. So you need enough disk space
@aiyer: ok.. what if we use S3?
@g.kishore: S3 is the backup store
@aiyer: oh ok.. so we will have data on the servers' disks plus the backup store.
@mayanks: Correct
@pedro.cls93: @g.kishore So configuring S3 or some other object store as the deep store for segments means that disk space on the servers is for caching purposes only?
@pedro.cls93: Are there any metrics/ways to see if a given server is under-provisioned for caching segments?
@mayanks: Yes, servers use local disk for better performance, as of now
@mayanks: There’s an endpoint in server to get table size.
@pedro.cls93: Hello everyone, what does Pinot store in ZooKeeper metadata? I currently have 2GB out of 2.5GB of disk used up (78.5%) in my ZooKeeper instance. Should this be a cause for concern?
@mayanks: Cluster state, segment metadata, etc. Do you have too many segments?
@mayanks: Whatever is visible via ZK in the UI is stored inside ZK
@pedro.cls93: What is too many segments? I have 1 table only with 3842 segments.
@mayanks: Yeah 3.8k segments might lead to 2GB, each segment has metadata stored in ZK
@mayanks: Too many is relative of course, here I meant enough to justify 2GB disk storage
@pedro.cls93: Understood. Can ZooKeeper information be offloaded to HDFS like the segments?
@mayanks: Not sure if ZK supports that. If it did, you would incur latency for ZK reads/writes
@karinwolok1: :wave: Let's give a big warm welcome to all the new Pinot community members! :wine_glass: :point_right: We'd love to learn a little bit about who you are and what brought you here! Please take a moment and introduce yourself to the community! :heart: @vishal.sharma09890 @bin.huangxb @libhojani @darkyin87 @tony @gaurav.eca @guido.schmutz @k.santhiswarup @richhickey @arthurnamiasdecrasto @onderaltinkurt @rohithuppala @shaileshjha061 @sumit.m @david.gyoker @bigdatalabsg @kevin.smith
@vishal.sharma09890: Hi, thanks for having us here. I am a software developer from India; I just started in the tech industry last year, graduating in 2020. My field of interest is distributed systems and I work mostly in Go.
@k.santhiswarup: Hello guys! I am checking out the Pinot repo and I noticed the code for ThirdEye is not in incubator-pinot anymore, and all the references to ThirdEye are broken in the docs :disappointed:. Is this intentional?
@g.kishore: Yes, it was intentional. It's moved to
@g.kishore: This was needed as part of Pinot graduation process in Apache
@k.santhiswarup: Thanks @g.kishore for clarification.
@kmvb.tau: Does Pinot support window functions like Presto?
@mayanks: Not yet, could you file an issue describing the use case?
@kmvb.tau: ok sure.. will do
@g.kishore: No, we don't support window functions. You can achieve that using the presto-pinot connector
@kmvb.tau: I understand using the Presto connector for joining tables. If Pinot supported all single-table functions (including window functions), it would be helpful. Our use case: to find the average response time for support tickets, we use lag window functions over audit data. If we use the Presto connector, a large amount of data will be transferred to the Presto nodes, which will increase query time as well.
@g.kishore: if you partition the data on ticketid and use partition-based assignment, you can achieve this in Pinot via a UDF. We did this for distinct count and it showed an amazing improvement in performance @jackie.jxt ^^
@kmvb.tau: Any documentation reference for data partitioning techniques? Partitioning data on ticketid will definitely help query performance.
@jackie.jxt: Here is the doc for partition-based segment assignment:
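For reference, a minimal sketch of partition-based assignment for the ticketid case (column name from the discussion above; the partition function and count are illustrative, see the linked doc for the authoritative config):
```json
{
  "tableIndexConfig": {
    "segmentPartitionConfig": {
      "columnPartitionMap": {
        "ticketid": { "functionName": "Murmur", "numPartitions": 8 }
      }
    }
  },
  "routing": {
    "segmentPrunerTypes": ["partition"]
  }
}
```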
@mkrishna.parimi: @mkrishna.parimi has joined the channel
@keweishang: @keweishang has joined the channel
#random
@david.gyoker: @david.gyoker has joined the channel
@bigdatalabsg: @bigdatalabsg has joined the channel
@kevin.smith: @kevin.smith has joined the channel
@mkrishna.parimi: @mkrishna.parimi has joined the channel
@keweishang: @keweishang has joined the channel
#feat-presto-connector
@kmvb.tau: @kmvb.tau has joined the channel
@keweishang: @keweishang has joined the channel
@keweishang: @keweishang has left the channel
#troubleshooting
@david.gyoker: @david.gyoker has joined the channel
@bigdatalabsg: @bigdatalabsg has joined the channel
@kevin.smith: @kevin.smith has joined the channel
@jmeyer: Sorry for bumping the question, maybe it got buried by the following ones. > Hello > What is the recommended approach to getting the "last non-null value"? > Use a UDF?
@g.kishore: Last non null value for a key?
@g.kishore: Or across lot of keys
@jmeyer: Across a lot of keys. Like in a "periodic" time series (e.g. 1 data point per day), where some values (e.g. some days) are null. How can we efficiently fetch the last *non-null* value?
@jmeyer: *Date - Value*
05/01/2021 - null
04/01/2021 - null
03/01/2021 - 3.0
02/01/2021 - 1.0
*Output: 3.0*
@g.kishore: Ah, got it
@jmeyer: Maybe it'd be simpler not to store null values and just get the `last` value? Not sure that'd be more efficient, though
@g.kishore: `SELECT date, val FROM T WHERE val > 0 ORDER BY date DESC LIMIT 1` should work
@g.kishore: Writing a udf might be more efficient but see if this works for now
@jmeyer: Makes sense indeed! Do you expect this query to be highly performant? Should a specific index be added?
@jmeyer: Actually, what about negative values? We'd be interested in those too ^^ I know there's some support for null values, as in IS NOT NULL, but IIRC there's a performance penalty associated with this feature
@g.kishore: No, the not-null index is very fast
@g.kishore: So you can use that
@jmeyer: So, enable this feature (`nullHandlingEnabled`) and query like `SELECT value FROM table WHERE value IS NOT NULL ORDER BY date DESC LIMIT 1`
@jmeyer: Sounds right ?
@g.kishore: yes
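For reference, a minimal sketch of where that flag lives (under `tableIndexConfig` in the table config, per the null value support docs; the rest of the table config is omitted):
```json
{
  "tableIndexConfig": {
    "nullHandlingEnabled": true
  }
}
```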
@jmeyer: Great, thank you @g.kishore :slightly_smiling_face: Have a good day !
@keweishang: @keweishang has joined the channel
#getting-started
@keweishang: @keweishang has joined the channel
@keweishang: @keweishang has left the channel
#minion-improvements
@laxman: I tried the union approach. Pinot unit tests are failing, so I didn't feel confident pushing it to the test environment
@laxman: The TIMES function returns a double, but `clicks` is of integer type. Wondering how it was working; I feel this should have failed earlier too.
@laxman: Here is the patch I tried
@laxman: ```diff
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/segment/processing/utils/SegmentProcessorUtils.java b/pinot-core/src/main/java/org/apache/pinot/core/segment/processing/utils/SegmentProcessorUtils.java
index 920d75f7d..097873325 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/segment/processing/utils/SegmentProcessorUtils.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/segment/processing/utils/SegmentProcessorUtils.java
@@ -66,22 +66,22 @@ public final class SegmentProcessorUtils {
     if (fieldSpec.isSingleValueField()) {
       switch (dataType) {
         case INT:
-          fieldAssembler = fieldAssembler.name(name).type().intType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().intType();
           break;
         case LONG:
-          fieldAssembler = fieldAssembler.name(name).type().longType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().longType();
           break;
         case FLOAT:
-          fieldAssembler = fieldAssembler.name(name).type().floatType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().floatType();
           break;
         case DOUBLE:
-          fieldAssembler = fieldAssembler.name(name).type().doubleType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().doubleType();
           break;
         case STRING:
-          fieldAssembler = fieldAssembler.name(name).type().stringType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().stringType();
           break;
         case BYTES:
-          fieldAssembler = fieldAssembler.name(name).type().bytesType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().bytesType();
           break;
         default:
           throw new RuntimeException("Unsupported data type: " + dataType);
@@ -89,19 +89,19 @@
     } else {
       switch (dataType) {
         case INT:
-          fieldAssembler = fieldAssembler.name(name).type().array().items().intType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().array().items().intType();
           break;
         case LONG:
-          fieldAssembler = fieldAssembler.name(name).type().array().items().longType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().array().items().longType();
           break;
         case FLOAT:
-          fieldAssembler = fieldAssembler.name(name).type().array().items().floatType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().array().items().floatType();
           break;
         case DOUBLE:
-          fieldAssembler = fieldAssembler.name(name).type().array().items().doubleType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().array().items().doubleType();
           break;
         case STRING:
-          fieldAssembler = fieldAssembler.name(name).type().array().items().stringType().noDefault();
+          fieldAssembler = fieldAssembler.name(name).type().optional().array().items().stringType();
           break;
         default:
           throw new RuntimeException("Unsupported data type: " + dataType);```
@laxman: cc: @fx19880617 @jackie.jxt @npawar Please take a look if you get a few minutes.
@laxman: Also, please let me know if there are any issues I can work on fixing in this conversion flow.
#product-launch
@g.kishore: @g.kishore has joined the channel
@g.kishore: @g.kishore has left the channel