pdeva opened a new issue #8446: flatten json to multiple rows
URL: https://github.com/apache/incubator-druid/issues/8446
 
 
   ### Description
   
   If I have an object like:
   
   ```js
   {
     host: "myhost",
     metrics: [
       { name: "metric1", value: 100},
       { name: "metric2", value: 200}
     ]
   }
   ```
   
   Druid should flatten it into 2 rows like:
   
   ```
   host: "myhost", metric: "metric1", value: 100
   host: "myhost", metric: "metric2", value: 200
   ```
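
To make the requested behavior concrete, here is a minimal sketch of the flattening in plain JavaScript. `flattenToRows` is a hypothetical helper written only for illustration; it is not a Druid API, and it assumes the array field is always called `metrics` and that each element has `name` and `value`.

```javascript
// Hypothetical helper (not part of Druid): expands one nested record into
// one flat row per element of its `metrics` array.
function flattenToRows(record) {
  // Everything except `metrics` is treated as a shared column and copied
  // onto every output row. A record without `metrics` yields no rows.
  const { metrics = [], ...shared } = record;
  return metrics.map(({ name, value }) => ({ ...shared, metric: name, value }));
}

const input = {
  host: "myhost",
  metrics: [
    { name: "metric1", value: 100 },
    { name: "metric2", value: 200 },
  ],
};

const rows = flattenToRows(input);
console.log(rows);
// [ { host: 'myhost', metric: 'metric1', value: 100 },
//   { host: 'myhost', metric: 'metric2', value: 200 } ]
```

The shared columns (`host` here) appear once in the input and are repeated only when the rows are produced.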
   
   ### Motivation
   
   Benefits of this approach:
   
   1. Ability to transfer multiple rows of data in a single atomic unit
   
   Currently, achieving this requires Kafka transactional topics, which add 
even more overhead. With the proposed approach, a single piece of JSON, if 
transferred successfully, gives 'all or nothing' semantics for multiple rows.
   
   2. Less data transfer
   
   If a piece of data like this is currently received, it needs to be converted 
to multiple rows, transferred to Kafka, then read as multiple separate rows by 
Druid.
   This results in memory/CPU overhead from processing all the redundant columns 
of data across those multiple rows. 
   With the approach described here, they are defined just once, resulting in an 
overall smaller payload.
   
   3. Decreased complexity
   
   Currently you need code in a transformer to translate the JSON above 
into multiple rows. All of that code would no longer be required.
   
   
   
