egalpin opened a new issue #7978:
URL: https://github.com/apache/pinot/issues/7978


   As of today, the JSON index is an inverted index and cannot be configured 
further by the user via schema settings. It would be very powerful to be able 
to specify data types and index types for individual fields within JSON 
columns. Using the example from the JSON index docs [1]:
   
   ```
   {
     "name": "adam",
     "age": 30,
     "country": "us",
     "addresses":
     [
       {
         "number" : 112,
         "street" : "main st",
         "country" : "us"
       },
       {
         "number" : 2,
         "street" : "second st",
         "country" : "us"
       },
       {
         "number" : 3,
         "street" : "third st",
         "country" : "ca"
       }
     ]
   }
   ```
   
   For example, being able to declare `number` as an `int` and build a range 
index on it would allow range queries. The street address number is a 
contrived example that isn't really practical, but it gets the point across.
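
As a rough illustration of why a typed value matters here, below is a minimal Python sketch of a toy range index over the `number` values from the record above. It is not how Pinot implements its range index, and the names `build_range_index` and `range_lookup` are hypothetical. The only point is that once the values are stored as integers in sorted order, a predicate such as `number <= 2` becomes a binary search, whereas the same values compared as strings would order `"112"` before `"2"`.

```python
from bisect import bisect_right

def build_range_index(values_by_id):
    """Toy range index (hypothetical): (value, id) pairs sorted by value."""
    return sorted((value, elem_id) for elem_id, value in values_by_id.items())

def range_lookup(index, upper):
    """Return the ids whose value is <= upper, using binary search."""
    # (upper, inf) sorts after every pair whose value equals upper,
    # so bisect_right lands just past the last qualifying entry.
    pos = bisect_right(index, (upper, float("inf")))
    return [elem_id for _, elem_id in index[:pos]]

# Address element position -> `number`, taken from the example record.
numbers = {0: 112, 1: 2, 2: 3}

index = build_range_index(numbers)
matches = range_lookup(index, 2)  # positions of addresses with number <= 2
```

With the example data, only the second address (position 1) satisfies `number <= 2`.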
   
   A range index can already be added today by using an ingestion transform 
to extract the field into its own column. But a range index on JSON data 
becomes far more powerful when combined with the fact that JSON_MATCH 
preserves the context of the JSON document.
   
   Ex.
   ```sql
   SELECT ... 
   FROM mytable 
   WHERE JSON_MATCH(person, '"$.addresses[*].number"<=2 AND "$.addresses[*].country"=''ca''')
   ```
   
   This would not match the `adam` record above, because within the JSON 
object itself there is no single address for which both predicates are true: 
the only address with `number` <= 2 has `country` = `us`, and the only address 
with `country` = `ca` has `number` = 3. An ingestion transform cannot 
reproduce this behavior. Note that the range index is just one example; 
ideally all index types would be supported for this kind of semi-structured 
JSON data with a predictable schema. 
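
To make the difference concrete, here is a small Python sketch contrasting the two evaluation strategies on the example record. It assumes the ingestion transform would extract `number` and `country` into two independent multi-value columns; the function names are hypothetical and this is only a model of the semantics, not Pinot code.

```python
record = {
    "name": "adam",
    "age": 30,
    "country": "us",
    "addresses": [
        {"number": 112, "street": "main st", "country": "us"},
        {"number": 2, "street": "second st", "country": "us"},
        {"number": 3, "street": "third st", "country": "ca"},
    ],
}

def match_with_context(rec):
    """Both predicates must hold for the SAME address element."""
    return any(
        addr["number"] <= 2 and addr["country"] == "ca"
        for addr in rec["addresses"]
    )

def match_flattened(rec):
    """Each predicate is checked against its own extracted column
    (assumed shape of the ingestion-transform output), so the link
    between a number and its country is lost."""
    numbers = [addr["number"] for addr in rec["addresses"]]
    countries = [addr["country"] for addr in rec["addresses"]]
    return any(n <= 2 for n in numbers) and any(c == "ca" for c in countries)

with_context = match_with_context(record)
flattened = match_flattened(record)
```

Here `with_context` is `False` while `flattened` is `True`: the flattened columns report a match because one address satisfies the range predicate and a different address satisfies the country predicate.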
   
   [1] https://docs.pinot.apache.org/basics/indexing/json-index#chained-key-lookup


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


