egalpin opened a new issue #7978:
URL: https://github.com/apache/pinot/issues/7978
As of today, the JSON index is an inverted index, and users cannot configure
it further through schema settings. It would be very powerful to be able to
specify data types and index types for individual fields of JSON columns.
Using the example from the JSON docs [1]:
```
{
  "name": "adam",
  "age": 30,
  "country": "us",
  "addresses": [
    {
      "number": 112,
      "street": "main st",
      "country": "us"
    },
    {
      "number": 2,
      "street": "second st",
      "country": "us"
    },
    {
      "number": 3,
      "street": "third st",
      "country": "ca"
    }
  ]
}
```
For example, declaring that `number` is an `int` and building a range index on
it would allow range queries on that field. The street address number is a
contrived example that's not really practical, but it gets the point across.
A range index on such a field can already be built today by using an ingestion
transform to extract it into its own column. But a range index on JSON data
becomes much more powerful when combined with the fact that JSON_MATCH
maintains JSON context.
For example:
```sql
SELECT ...
FROM mytable
WHERE JSON_MATCH(person, '"$.addresses[*].number"<=2 AND "$.addresses[*].country"=''ca''')
```
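This would not match the `adam` record above, because no single element of
`addresses` satisfies both predicates: the only address with `number` <= 2 is
in `us`, and the only `ca` address has `number` 3. An ingestion transform
cannot reproduce this behavior, because once the fields are extracted into
separate columns, the link between values from the same array element is lost.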
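The difference can be illustrated with a small sketch in plain Python (this is
not Pinot code). It assumes the ingestion transform would flatten `number` and
`country` into two independent multi-value columns; the function names
`matches_per_element` and `matches_flattened` are hypothetical and exist only
for this illustration.

```python
# The `adam` record from the example above.
person = {
    "name": "adam",
    "age": 30,
    "country": "us",
    "addresses": [
        {"number": 112, "street": "main st", "country": "us"},
        {"number": 2, "street": "second st", "country": "us"},
        {"number": 3, "street": "third st", "country": "ca"},
    ],
}


def matches_per_element(record):
    """Both predicates must hold on the SAME element of `addresses`
    (the context-preserving behavior described for JSON_MATCH)."""
    return any(
        addr["number"] <= 2 and addr["country"] == "ca"
        for addr in record["addresses"]
    )


def matches_flattened(record):
    """Each predicate is checked against its own flattened column
    (assumed shape of the data after an ingestion transform)."""
    numbers = [addr["number"] for addr in record["addresses"]]
    countries = [addr["country"] for addr in record["addresses"]]
    return any(n <= 2 for n in numbers) and any(c == "ca" for c in countries)


print(matches_per_element(person))  # False: no single address satisfies both
print(matches_flattened(person))    # True: the two matches come from different addresses
```

Under these assumptions the flattened columns produce a false positive for
`adam`, which is the behavior the proposed per-field indexes would avoid.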
Note that the range index is just one example; ideally all index types would
be supported for this kind of semi-structured JSON data with a predictable
schema.
[1] https://docs.pinot.apache.org/basics/indexing/json-index#chained-key-lookup