> I'm not really sure what you're talking about here, since I did not suggest a "shared data structure", and I'm not really sure what that means in this context.
What you described is just monitoring/updating the state with a single thread, *given* that all applications are already there. To implement this, you need to make the state of all applications accessible to this thread, so you need some data structure storing that (maybe pointers to the Session objects).

Then how do you put a new object into this data structure? Through the Servlet directly? Then this data structure would be shared by different threads. Make that thread something like Spark's eventLoop thread? We are seeing scalability issues there...

> Yes. While there are applications that need monitoring, you poll YARN at a constant frequency. Basically what would be done by multiple threads, but there's a single one.

Did you find the bulk API?

> Why not. The expensive part is not parsing results, I'll bet, but having a whole bunch of different tasks opening and closing YARN connections.

First, YARNClient is thread safe and can be shared by multiple threads...

Second, if I have 1000 applications, what's your expectation in the following cases?

1. YARN processed the requests for 999 and failed on the last one for some reason
2. Livy received 999 well-formatted responses but got 1 malformed response

On Tue, Aug 15, 2017 at 5:54 PM, Marcelo Vanzin <[email protected]> wrote:

> On Tue, Aug 15, 2017 at 2:20 PM, Nan Zhu <[email protected]> wrote:
> > The key design consideration here is that how you model the state of
> > applications, if in actor, then there will be no synchronization involved
> > and yielding a cleaner design; if in a shared data structure, you will have
> > to be careful about coordinating threads here (we actually have a design
> > based on shared data structure and we eventually discard to pursue a
> > cleaner one).
>
> I'm not really sure what you're talking about here, since I did not
> suggest a "shared data structure", and I'm not really sure what that
> means in this context.
>
> > I think bulk API can make life easier comparing to the shared data
> > structure, but it raises up two questions
> >
> > 1. Are we going to update all applications in the uniform pace, even they
> > are submitted in different time?
>
> Yes. While there are applications that need monitoring, you poll YARN
> at a constant frequency. Basically what would be done by multiple
> threads, but there's a single one.
>
> > 2. Are we going to use a single thread for everything, including send/recv
> > req/res and parse, etc.
>
> Why not. The expensive part is not parsing results, I'll bet, but
> having a whole bunch of different tasks opening and closing YARN
> connections.
>
> --
> Marcelo
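
To make the trade-offs in this thread concrete, here is a rough, language-neutral sketch (in Python, not Livy code) of the single-poller idea. Everything in it is an assumption for illustration: `AppMonitor`, `parse_state`, and the `bulk_fetch` callable are hypothetical names, and `bulk_fetch` merely stands in for whatever bulk query YARN offers. It shows the two points raised above: request threads register new applications into a structure that the poller also reads (so it is shared and needs a lock), and each entry of a bulk response is parsed on its own so that one missing or malformed result does not discard the others.

```python
import threading
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical set of states; a real YARN client reports its own enum.
VALID_STATES = {"RUNNING", "FINISHED", "FAILED", "KILLED"}


def parse_state(raw: str) -> str:
    """Reject anything that is not a known state (a 'malformed' response)."""
    if raw not in VALID_STATES:
        raise ValueError(f"malformed state: {raw!r}")
    return raw


class AppMonitor:
    """One polling thread for all applications (hypothetical sketch)."""

    def __init__(self, bulk_fetch: Callable[[Iterable[str]], Dict[str, str]]):
        self._bulk_fetch = bulk_fetch      # stand-in for a bulk YARN query
        self._lock = threading.Lock()      # guards the shared state map
        self._states: Dict[str, Optional[str]] = {}
        self._stop = threading.Event()

    def register(self, app_id: str) -> None:
        """Called from request-handling threads when a session is created."""
        with self._lock:
            self._states.setdefault(app_id, None)

    def state(self, app_id: str) -> Optional[str]:
        with self._lock:
            return self._states.get(app_id)

    def poll_once(self) -> List[str]:
        """Refresh every known application; return the ids that failed."""
        with self._lock:
            ids = list(self._states)       # snapshot, then release the lock
        try:
            raw = self._bulk_fetch(ids)
        except Exception:
            return ids                     # whole request failed; keep old states
        failed = []
        for app_id in ids:
            try:
                new_state = parse_state(raw[app_id])
            except (KeyError, ValueError):
                failed.append(app_id)      # missing or malformed: skip only this one
                continue
            with self._lock:
                self._states[app_id] = new_state
        return failed

    def run(self, interval_s: float) -> None:
        """Poll at a constant frequency until stop() is called."""
        while not self._stop.wait(interval_s):
            self.poll_once()

    def stop(self) -> None:
        self._stop.set()
```

In this sketch a failed entry simply keeps its previous state until the next poll; whether that is the right behaviour for the 999-of-1000 cases is exactly the open question in the thread.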
