mofeiatwork commented on issue #12459:
URL: https://github.com/apache/arrow/issues/12459#issuecomment-1045610272


   After reading the Arrow documentation and code, I have implemented a basic 
row-based converter from Arrow arrays to JSON. It handles most primitive types 
and nested types, including struct, list, and map.
   
   Of course, it is less efficient than a columnar-style builder, and I will 
improve it further.
   
   I think the key question is how to convert nested types to tree-structured 
JSON with a row-oriented JSON builder, since most JSON builder implementations 
(like rapidjson and simdjson) are row-oriented and build JSON documents one at 
a time. A basic idea is to iterate over the nested data recursively and build 
the JSON tree at the same time. The limitation is that iterating over nested 
data involves a lot of branching, which reduces performance.
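
   A minimal sketch of the recursive, row-at-a-time idea, assuming plain Python 
values stand in for Arrow data. The tuple-based schema encoding and the 
`value_to_json` helper are hypothetical and are not part of the Arrow API. Note 
how the type check is repeated for every node of every row:

```python
import json

# Hypothetical schema encoding (an assumption for illustration only):
#   ("int",), ("str",), ("list", child), ("struct", {name: child})
def value_to_json(schema, value):
    """Recursively convert one nested value, branching on the type at each node."""
    kind = schema[0]
    if value is None:
        return None
    if kind == "struct":
        return {name: value_to_json(child, value[name])
                for name, child in schema[1].items()}
    if kind == "list":
        return [value_to_json(schema[1], item) for item in value]
    return value  # primitive leaf, emitted as-is

schema = ("struct", {"id": ("int",), "tags": ("list", ("str",))})
rows = [{"id": 1, "tags": ["a", "b"]}, {"id": 2, "tags": []}]

# One JSON document is built per row, re-walking the schema each time.
docs = [json.dumps(value_to_json(schema, row)) for row in rows]
```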
   
   So my idea is to separate JSON building into two stages:
   - Schema building: build the JSON tree structure according to the Arrow schema
   - Leaf filling: fill the leaf nodes of the JSON tree from the Arrow arrays
   
   This way, most branches can be eliminated, and access to the Arrow arrays 
will be cache-friendly.
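
   The two stages could be sketched as follows, assuming nested structs only 
and plain Python lists standing in for Arrow leaf arrays. The names 
`build_skeleton`, `leaf_paths`, and `fill_leaves` are hypothetical; lists and 
maps would additionally need offset handling, which is omitted here. The 
sketch shows only the structure of the approach, not its performance:

```python
import json

def build_skeleton(schema):
    """Stage 1: build the JSON tree shape for one row from the schema alone."""
    return {name: build_skeleton(child) if isinstance(child, dict) else None
            for name, child in schema.items()}

def leaf_paths(schema, prefix=()):
    """Yield the key path to every leaf field in the schema."""
    for name, child in schema.items():
        path = prefix + (name,)
        if isinstance(child, dict):
            yield from leaf_paths(child, path)
        else:
            yield path

def fill_leaves(docs, path, column):
    """Stage 2: scan one leaf column in order and fill that leaf in every row."""
    *parents, leaf = path
    for doc, value in zip(docs, column):
        node = doc
        for key in parents:
            node = node[key]
        node[leaf] = value

# Hypothetical schema (nested dicts are structs) and columnar leaf data.
schema = {"id": "int", "pos": {"x": "float", "y": "float"}}
columns = {("id",): [1, 2], ("pos", "x"): [0.5, 1.5], ("pos", "y"): [2.0, 3.0]}

docs = [build_skeleton(schema) for _ in range(2)]   # schema building
for path in leaf_paths(schema):                     # leaf filling, per column
    fill_leaves(docs, path, columns[path])
result = [json.dumps(doc) for doc in docs]
```

   Type dispatch happens once per column instead of once per value, and each 
leaf column is read sequentially from start to end.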
   
   What do you think about it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

