On Thu, Jun 09, 2011 at 11:31:31AM -0700, Maxim Sobolev wrote:
> >Sorry, but I think your patch is wrong. If even it fixes the issue for you,
> >actually I think it does not fix but hides a real problem we have to address.
> >
> >Receiving the whole chunk at once should be more effectively because we do
> >one
> >syscall instead of several. Also, if you receive in smaller chunks no need to
> >set MSG_WAITALL at all.
> >
> >Besides, with your patch I am observing hangs on primary startup in
> >
> >init_remote->primary_connect->proto_connection_recv->proto_common_recv
> >
> >The primary worker process asks the parent to connect to the secondary. After
> >establishing the connection the parent sends connection protocol name and
> >descriptor to the worker (proto_connection_send/proto_connection_recv). The
> >issue here is that in proto_connection_recv() the size of protoname is
> >unknown, so it calls proto_common_recv() with size = 127, larger than
> >protoname ("tcp").
> >
> >It worked previously because after sending protoname proto_connection_send()
> >sends the descriptor calling sendmsg(). This is data of different type and it
> >makes recv() return although only 4 bytes of 127 requested were received.
> >
> >With your patch, after receiving these 4 bytes it returns back to recv()
> >waiting for rest 123 bytes and gets stuck forever. Don't you observe this?
> >It
> >is strange, because for me it hangs on every start up. I am seeing this on
> >yesterday current.
>
> Yes, you are right. It appears that I did not test new code on
> primary, only on secondary. Which explains why I did not see that
> issue. Can you please try the following patch and let me know if it
> solves the issue for you?
>
> http://sobomax.sippysoft.com/hastd.diffI strongly disagree with this patch. The whole proto API is now clean and elegant. Hopefully this polution is not needed and I'd prefer to give time to Mikolaj to investigate it. I also plan to use it in other projects. HAST currently doesn't support 'async' mode, so working over WAN is not common use case for now, which makes your changes low priority. High priority for HAST is working over LAN. In the future it would be nice if you could send patches for review before you commit them with 3 days MFC time. HAST is actively maintained. -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com
pgpewSqtm2flF.pgp
Description: PGP signature
