Hello Jianfeng,

Thanks for getting back to me. I considered using "udata64" too, but it didn't
work for me when a single packet is fanned out to multiple slave processes.
More importantly, it looks like if a slave process crashes in the middle of
getting packets from or putting packets into a pool, we could end up with a
deadlock. So I guess I'll have to either think about a different design or be
ready to bounce all of the processes if one of them fails.
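
A rough sketch of one possible direction for the fan-out case, assuming a small
per-buffer table in shared memory that records how many references each process
holds, instead of a single owner flag. Everything here (MAX_PROCS, struct
buf_owners, the buf_* helpers) is hypothetical and not part of DPDK. A real
version would need atomic updates, and it still has the window where a process
crashes after taking a reference but before recording it. It also does nothing
about a crash in the middle of a pool get/put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_PROCS 8 /* hypothetical upper bound on cooperating processes */

/* Hypothetical bookkeeping stored alongside each buffer in shared memory:
 * one reference counter per process rather than one shared flag. */
struct buf_owners {
    uint16_t refs[MAX_PROCS];
};

/* Clear all per-process counters. */
static void buf_owners_init(struct buf_owners *o)
{
    memset(o, 0, sizeof(*o));
}

/* Sum of the references held by all processes. */
static unsigned buf_total(const struct buf_owners *o)
{
    unsigned total = 0;
    for (unsigned i = 0; i < MAX_PROCS; i++)
        total += o->refs[i];
    return total;
}

/* Process `proc` records one more reference on the buffer. */
static void buf_hold(struct buf_owners *o, unsigned proc)
{
    assert(proc < MAX_PROCS);
    o->refs[proc]++;
}

/* Process `proc` drops one reference.
 * Returns true when nobody references the buffer any more. */
static bool buf_release(struct buf_owners *o, unsigned proc)
{
    assert(proc < MAX_PROCS && o->refs[proc] > 0);
    o->refs[proc]--;
    return buf_total(o) == 0;
}

/* Called by the primary once it has decided that `dead` is gone:
 * forget every reference that process held.
 * Returns true when the buffer can go back to the pool. */
static bool buf_reclaim_dead(struct buf_owners *o, unsigned dead)
{
    assert(dead < MAX_PROCS);
    o->refs[dead] = 0;
    return buf_total(o) == 0;
}
```

With this layout, if processes 1 and 2 both hold a packet and process 2 dies,
buf_reclaim_dead(&o, 2) drops only process 2's references, and the buffer is
returned to the pool only after process 1 releases its own.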

Thanks,
Vlad

> -----Original Message-----
> From: Tan, Jianfeng [mailto:[email protected]]
> Sent: Thursday, March 01, 2018 3:20 AM
> To: Lazarenko, Vlad (WorldQuant); '[email protected]'
> Subject: RE: Multi-process recovery (is it even possible?)
> 
> 
> 
> > -----Original Message-----
> > From: users [mailto:[email protected]] On Behalf Of Lazarenko, Vlad (WorldQuant)
> > Sent: Thursday, March 1, 2018 2:54 AM
> > To: '[email protected]'
> > Subject: [dpdk-users] Multi-process recovery (is it even possible?)
> >
> > Guys,
> >
> > I am looking for possible solutions for the following problems that
> > come along with asymmetric multi-process architecture...
> >
> > Given that multiple processes share the same RX/TX queue(s) and packet
> > pool(s), and that one packet from an RX queue may be fanned out to
> > multiple slave processes, is there a way to recover from a slave
> > crashing (or exiting without cleaning up properly)? In theory it could
> > have incremented an mbuf reference count more than once, and unless
> > everything is restarted, I don't see a reliable way to release those
> > mbufs back to the pool.
> 
> Recycling an element is too difficult; from what I know, it's next to
> impossible.
> Recycling a memzone/mempool is easier. So in your case, you might want to
> use different pools for different queues (processes).
> 
> If you really want to recycle an element, rte_mbuf in your case, it might be
> doable by:
> 1. Setting up an rx callback for each process, and in the callback, storing
>    a special flag at rte_mbuf->udata64.
> 2. When the primary detects that a secondary is down, iterating over all
>    elements with the special flag and putting them back into the ring.
> 
> There is a small chance that this fails: an mbuf is allocated by a secondary
> process, and the process crashes before the mbuf is flagged.
> 
> Thanks,
> Jianfeng
> 
> 
> >
> > Also, if a spinlock is involved and either master or slave crashes,
> > everything simply gets stuck. Is there any way to detect this (i.e.
> > outside of the data path)..?
> >
> > Thanks,
> > Vlad
> >


