mgahan opened a new issue, #34762:
URL: https://github.com/apache/arrow/issues/34762

   ### Describe the enhancement requested
   
   Stack Overflow Reference:
   
   
https://stackoverflow.com/questions/75861075/is-there-a-way-to-read-a-nested-column
   
   I have a bunch of newline-delimited JSON files that I want to read into R using the arrow package.
   
   One of the fields in each record is nested. The nested values can be quite big and messy, so I would prefer to select only the nested fields I actually need.
   
   Here is an example of the data I am working with:
   
   ```
   # Bring in libraries
   suppressMessages(library(arrow))
   suppressMessages(library(data.table))
   
   # Make data
   tf <- tempfile()
   on.exit(unlink(tf))
   writeLines('
       { "hello": 3.5, "world": false, "yo":{"param1":"duck1","param2":"duck2"} }
       { "hello": 3.25, "world": null, "yo":{"param1":"duck3","param2":"duck4"} }
       { "hello": 0.0, "world": true, "yo":{"param1":"duck5","param2":"duck6"} }
     ', tf, useBytes = TRUE)
   df <- read_json_arrow(tf)
   ```
   
   This is the result of what I just read in:
   
   
![image](https://user-images.githubusercontent.com/6619822/228276217-6173715b-614b-40a5-9331-6d6a06670897.png)
   
   I can also read in just the "yo" column:
   
   ```
   read_json_arrow(tf, col_select = "yo")
   ```
   
   The result is below:
   
   
![image](https://user-images.githubusercontent.com/6619822/228276398-3c37c153-0995-429a-ac32-11f41d8bdc90.png)
   
   But I am having trouble reading in the "yo.param1" data element:
   
   
![image](https://user-images.githubusercontent.com/6619822/228276472-8832281c-ce3d-4fbc-8563-c33f8358e287.png)
   
   Any ideas on how I might read in just this nested field and avoid reading in the entire column?
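   
   For reference, here is a minimal sketch of a possible fallback. It does not solve the problem above, because the whole "yo" struct is still parsed; it only drops the unwanted nested fields after reading. It assumes that arrow hands a struct column back to R as a nested data frame column, so that `$param1` can be applied to it. The names `yo_only` and `param1` are just illustrative.
   
   ```
   library(arrow)
   
   # Write three newline-delimited JSON records, one record per line
   tf <- tempfile(fileext = ".json")
   writeLines(c(
     '{ "hello": 3.5, "world": false, "yo":{"param1":"duck1","param2":"duck2"} }',
     '{ "hello": 3.25, "world": null, "yo":{"param1":"duck3","param2":"duck4"} }',
     '{ "hello": 0.0, "world": true, "yo":{"param1":"duck5","param2":"duck6"} }'
   ), tf)
   
   # Read only the top-level "yo" column (the entire struct is still parsed)
   yo_only <- read_json_arrow(tf, col_select = "yo")
   
   # Assumption: the struct column is a nested data frame column in R,
   # so the nested field can be extracted after the read
   param1 <- yo_only$yo$param1
   
   # Clean up the temporary file
   unlink(tf)
   ```
   
   This keeps the select-then-extract step in plain R, but the cost of parsing "param2" and any other nested fields is still paid at read time.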
   
   
   
   ### Component(s)
   
   R


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

