Hi,

First of all, sorry if this is not the right channel to report a bug. I saw
that the Developer Documentation mentions JIRA. However, I was not sure if
non-developers can create new issues.


Description of the issue:

Gremlin Go hangs when the connection to the database is dropped in the middle
of a long-running traversal. For context, the database I'm using is AWS
Neptune, which I access through local port-forwarding. I noticed that Gremlin
Go hung whenever the port-forwarding died or became unstable while a traversal
was executing.

How to reproduce the issue:

1. Execute an expensive traversal.
2. Drop the connection to the DB in the middle of the execution.
3. Gremlin Go hangs.

Root cause analysis:

After debugging the issue, it seems to be caused by a deadlock that happens in
gremlinServerWSProtocol.readLoop() when protocol.transporter.Read() returns an
error.

The following simplified call graph shows what is happening:

gremlinServerWSProtocol.readLoop()
        gorillaTransporter.Read()
        readErrorHandler()
                synchronizedMap.synchronizedRange()
                        synchronizedMap.syncLock.Lock()
                        channelResultSet.Close()
                                channelResultSet.container.delete()
                                        synchronizedMap.syncLock.Lock()

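As you can see, synchronizedMap.syncLock.Lock() is called a second time, in
the same call chain, while the mutex is still held. Go mutexes are not
reentrant, so the second call blocks forever, which causes the deadlock.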
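
The double-lock pattern in the call graph above can be reproduced in isolation.
The following is only a minimal sketch, not the driver's code: syncMap,
resultSet, closeAllUnderLock and closeAllFromSnapshot are hypothetical
stand-ins I made up for illustration. It uses TryLock (Go 1.18+) instead of
Lock so that the problem is reported rather than hanging the program, and it
shows one possible way around it (copying the entries and releasing the lock
before closing them). I don't know if that is the right fix for Gremlin Go.

```go
package main

import (
	"fmt"
	"sync"
)

// syncMap is a hypothetical stand-in for the driver's synchronizedMap.
type syncMap struct {
	mu      sync.Mutex
	entries map[string]*resultSet
}

// resultSet is a hypothetical stand-in for channelResultSet.
type resultSet struct {
	id        string
	container *syncMap
}

// newMap builds a map holding one result set per id.
func newMap(ids ...string) *syncMap {
	m := &syncMap{entries: make(map[string]*resultSet)}
	for _, id := range ids {
		m.entries[id] = &resultSet{id: id, container: m}
	}
	return m
}

// close removes the result set from its container, which needs the lock.
// TryLock is used only so this demo can report failure; with a plain
// Lock() the call would block forever if the lock were already held.
func (r *resultSet) close() bool {
	if !r.container.mu.TryLock() {
		return false
	}
	defer r.container.mu.Unlock()
	delete(r.container.entries, r.id)
	return true
}

// closeAllUnderLock mirrors the problematic pattern: it closes every entry
// while still holding the map lock. It returns how many closes succeeded.
func closeAllUnderLock(m *syncMap) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	closed := 0
	for _, rs := range m.entries {
		if rs.close() {
			closed++
		}
	}
	return closed
}

// closeAllFromSnapshot copies the entries under the lock, releases the
// lock, and only then closes them, so close can take the lock itself.
func closeAllFromSnapshot(m *syncMap) int {
	m.mu.Lock()
	pending := make([]*resultSet, 0, len(m.entries))
	for _, rs := range m.entries {
		pending = append(pending, rs)
	}
	m.mu.Unlock()

	closed := 0
	for _, rs := range pending {
		if rs.close() {
			closed++
		}
	}
	return closed
}

func main() {
	m := newMap("a", "b")
	fmt.Println("closed while holding the lock:", closeAllUnderLock(m))
	fmt.Println("closed from a snapshot:", closeAllFromSnapshot(m))
}
```

In the first call no entry can be closed, because the lock is already held by
the caller. In the second call every entry is closed, because the lock is
released before close is invoked.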


I'm not familiar with the code base so bear with me if the information I'm
providing is not totally accurate.

Thanks for the hard work! So far, the driver looks amazing!

        Roi Martin
