joosthooz commented on PR #33738:
URL: https://github.com/apache/arrow/pull/33738#issuecomment-1400316462

   Hi, I gave this branch a spin, and it seems that the nesting has become 
inconsistent: 
   
![image](https://user-images.githubusercontent.com/1442581/214045948-eb3f077b-da26-457c-ab42-b5c6b5166f55.png)
   There are 2 ReadBatch spans under InitialTask. One of them has all the 
FragmentsToBatches spans as its children (these were nested under the SourceNode 
before). The other keeps recursively nesting more ReadBatch spans. Each has a 
ProcessMorsel that has the filter, project and sink spans nested under each 
other. The dataset writer likewise keeps nesting WriteAndCheckBackpressure spans.
   
![image](https://user-images.githubusercontent.com/1442581/214047110-cfd52d55-8e6c-413a-a3e3-3fbc23f949cb.png)
   Is there a way to go back to making most of these spans siblings again?
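   To make the nested-versus-sibling distinction concrete, here is a toy model 
in Python. It is not the OpenTelemetry API and not Arrow code, and it is not a 
diagnosis of this branch: `ToyTracer`, `start_span`, `end_span` and `scoped` 
are hypothetical names. It only assumes that a new span is parented to whichever 
span is currently active, so a span that is never deactivated becomes the parent 
of the next one.

```python
from contextlib import contextmanager


class Span:
    """Minimal span: a name, a parent, and a list of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def depth(self):
        # Number of ancestors between this span and the root.
        d, node = 0, self
        while node.parent is not None:
            d += 1
            node = node.parent
        return d


class ToyTracer:
    """Hypothetical tracer: new spans attach to the currently active span."""

    def __init__(self, root_name):
        self.root = Span(root_name)
        self._active = [self.root]

    def start_span(self, name):
        span = Span(name, parent=self._active[-1])
        self._active.append(span)  # the new span becomes the active one
        return span

    def end_span(self):
        self._active.pop()  # the previous span becomes active again

    @contextmanager
    def scoped(self, name):
        span = self.start_span(name)
        try:
            yield span
        finally:
            self.end_span()


# Spans are started but never deactivated: each one nests under the last.
chained = ToyTracer("InitialTask")
chained_spans = [chained.start_span("ReadBatch") for _ in range(3)]

# Each span is deactivated before the next starts: they end up as siblings.
sibling = ToyTracer("InitialTask")
sibling_spans = []
for _ in range(3):
    with sibling.scoped("ReadBatch") as span:
        sibling_spans.append(span)

chained_depths = [s.depth() for s in chained_spans]
sibling_depths = [s.depth() for s in sibling_spans]
print(chained_depths, sibling_depths)
```

   In this model the first tracer produces a chain of increasing depth, while 
the second keeps every ReadBatch directly under InitialTask.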
   Do we want to change the organization of the spans in this PR from having 1 
span for each node in the graph, each having a span for every chunk of data it 
processes (how it was before), to having a ProcessMorsel for each chunk of 
data, each having a span for each node it traverses through?
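
   The two organizations in that question can be sketched as plain nested 
dictionaries. This is only an illustration: the function names and the 
`batch N` / `ProcessMorsel N` labels are made up for the example, and the node 
list is reduced to the filter, project and sink spans mentioned above.

```python
def per_node_layout(nodes, num_batches):
    """One span per node, each with one child span per batch it processes."""
    return {
        node: {f"batch {i}": {} for i in range(num_batches)}
        for node in nodes
    }


def per_morsel_layout(nodes, num_batches):
    """One ProcessMorsel span per batch, with node spans nested in traversal order."""
    tree = {}
    for i in range(num_batches):
        chain = {}
        # Build the chain from the last node outwards so the first node is outermost.
        for node in reversed(nodes):
            chain = {node: chain}
        tree[f"ProcessMorsel {i}"] = chain
    return tree


def max_depth(tree):
    """Depth of the deepest span in a nested-dict span tree."""
    if not tree:
        return 0
    return 1 + max(max_depth(child) for child in tree.values())


def render(tree, indent=0):
    """Return the tree as indented lines, one span per line."""
    lines = []
    for name, child in tree.items():
        lines.append("  " * indent + name)
        lines.extend(render(child, indent + 1))
    return lines


nodes = ["filter", "project", "sink"]
by_node = per_node_layout(nodes, 2)
by_morsel = per_morsel_layout(nodes, 2)

print("\n".join(render(by_node)))
print("\n".join(render(by_morsel)))
```

   With the per-node layout the depth stays fixed no matter how many nodes the 
plan has, while with the per-morsel layout the depth grows with the number of 
nodes a batch passes through.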
   I think I can help in a follow-up PR, especially for the dataset writer.

