[ https://issues.apache.org/jira/browse/TINKERPOP-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308150#comment-17308150 ]
David Edey commented on TINKERPOP-2197:
---------------------------------------

In case this is useful for anyone else in the future: we had this issue, which plays particularly badly with AWS Lambda freezing and killing sockets. When AWS Lambda unfroze to service the next request, the event emitter would emit the error event, which would cause the Node.js runtime to exit because the event had no listener (see [https://nodejs.org/api/events.html#events_error_events]).

This would return:
{noformat}
START RequestId: be28ed85-7065-4bd5-9bd2-c7ffe179b655 Version: $LATEST
... (Redacted details about the request)
2021-03-24T12:18:29.283Z be28ed85-7065-4bd5-9bd2-c7ffe179b655 ERROR Uncaught Exception
{
  "errorType": "Error",
  "errorMessage": "read ECONNRESET",
  "code": "ECONNRESET",
  "errno": "ECONNRESET",
  "syscall": "read",
  "stack": [
    "Error: read ECONNRESET",
    "    at TLSWrap.onStreamRead (internal/stream_base_commons.js:209:20)"
  ]
}
END RequestId: be28ed85-7065-4bd5-9bd2-c7ffe179b655
{noformat}

h3. The fix

The good news is that this is now fixed in version 3.4.4 and above of the gremlin npm library, through [this commit|https://github.com/apache/tinkerpop/commit/78819c3efa97f87ddc55da8e9e4543d661f4d1ec] in TINKERPOP-2290, so you might just need to update the version you're using.

If for whatever reason you need to stay on a lower version, we had success with handling the error in user land with the following, although updating is surely better.
{code:javascript}
driverRemoteConnection?._client?._connection.on("error", e => {
  /* Ignore this as it's already handled in the connection - possibly log an error */
});
{code}

> gremlin javascript - Error: read ECONNRESET at TLSWrap.onStreamRead - websocket error
> --------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2197
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2197
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: javascript
>    Affects Versions: 3.4.0
>         Environment: windows 10 ent, 10.016299
>                      nodejs 11.10.1
>                      gremlin javascript 3.4.0
>                      express 4.16.4
>            Reporter: Thomas Mahringer
>            Priority: Major
>         Attachments: gremlin-ws-error.png
>
>
> *Environment*
>
> I'm running a nodejs express app and connect to MSFT azure gremlin through the js driver:
>
> {code:java}
> connect() {
>   this.authenticator = new Gremlin.driver.auth.PlainTextSaslAuthenticator(
>     `/dbs/${this.config.database}/colls/${this.config.collection}`,
>     this.config.primaryKey
>   );
>   this.client = new Gremlin.driver.Client(
>     this.config.endpoint,
>     {
>       authenticator: this.authenticator,
>       traversalsource: "g",
>       rejectUnauthorized: true,
>       mimeType: "application/vnd.gremlin-v2.0+json"
>     }
>   );
> }
> {code}
>
>
> The app calls various gremlin commands through the string + "query parameter" syntax, e.g.
> {code:java}
> await this.client.submit("g.V().hasLabel(label)", {label: "Person"});{code}
>
> All the queries work fine but when the app is idling for about 5-10 minutes, the nodejs process exits with the above error. None of my error handlers are hit.
> So I debugged the gremlin js code and found out where the error is thrown: it's the error handler in gremlin/lib/driver/connection.js, which gets installed in "open"
> {code:java}
> open() {
>   if (this.isOpen) {
>     return Promise.resolve();
>   }
>   if (this._openPromise) {
>     return this._openPromise;
>   }
>   this.emit('log', `ws open`);
>   this._ws = new WebSocket(this.url, {
>     headers: this.options.headers,
>     ca: this.options.ca,
>     cert: this.options.cert,
>     pfx: this.options.pfx,
>     rejectUnauthorized: this.options.rejectUnauthorized
>   });
>   this._ws.on('message', (data) => this._handleMessage(data));
>   // ******* Install error handler
>   this._ws.on('error', (err) => this._handleError(err));
>   ...
> }
>
> _handleError(err) {
>   this.emit('log', `ws error ${err}`);
>   console.error("***************** Added log to improve debug ****************")
>   this._cleanupWebsocket();
>   this.emit('error', err); // Error is thrown here
> }{code}
>
> In the debugger I can see that the error is caused in *stream_base_commons.js:*
> {code:java}
> function onStreamRead(arrayBuffer) {
>   const nread = streamBaseState[kReadBytesOrError];
>   const handle = this;
>   const stream = this[owner_symbol];
>   stream[kUpdateTimer]();
>   if (nread > 0 && !stream.destroyed) {
>     const offset = streamBaseState[kArrayBufferOffset];
>     const buf = new FastBuffer(arrayBuffer, offset, nread);
>     if (!stream.push(buf)) {
>       handle.reading = false;
>       if (!stream.destroyed) {
>         const err = handle.readStop();
>         if (err)
>           stream.destroy(errnoException(err, 'read'));
>       }
>     }
>     return;
>   }
>   if (nread === 0) {
>     return;
>   }
>   if (nread !== UV_EOF) {
>     /*Happens here >>>> */ return stream.destroy(errnoException(nread, 'read'));
>   }
>   ...
> }
> {code}
>
> So my questions are:
> * Why is the error thrown in the first place? (It happens after the app idles for 5-10 minutes.)
> ** What happens in "this.emit('error', err);" in "Connection._handleError"?
> * What is the right place to catch the error?
> I've of course wrapped the "submit" code above in try/catch (in case of async/await) or as "then(...).catch()..." (in case of using the promise directly). Since the error causes nodejs to exit, it's quite bad. :)
> ** As it all happens asynchronously, I would need something like an "onError" handler. But the web socket and its handler (this._ws.on('error')...) are within the "Connection" class.
> The attached image shows the stack trace in Visual Studio Code Debugger.
> Any help is appreciated!
> Thanks
> Thomas

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
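
For anyone wondering why none of the try/catch handlers in the issue were hit: the behaviour follows from how Node's EventEmitter treats the 'error' event. The sketch below is only an illustration under stated assumptions. {{FakeConnection}} and {{simulateSocketReset}} are hypothetical stand-ins, not part of the gremlin driver; the only real API used is Node's built-in {{events}} module. It shows that emitting 'error' with no listener throws the error, while attaching a listener (as in the user-land workaround above) delivers the error to the listener instead.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the driver's Connection class: it re-emits a
// socket failure as its own 'error' event, like _handleError in the issue.
class FakeConnection extends EventEmitter {
  simulateSocketReset() {
    const err = new Error('read ECONNRESET');
    err.code = 'ECONNRESET';
    this.emit('error', err);
  }
}

// With no 'error' listener registered, emit('error', err) throws err.
// Here the emit is synchronous, so try/catch can observe it.
function emitWithoutListener() {
  const conn = new FakeConnection();
  try {
    conn.simulateSocketReset();
    return null;
  } catch (err) {
    return err.code;
  }
}

// With a listener registered, the error is passed to it and nothing is thrown.
function emitWithListener() {
  const conn = new FakeConnection();
  const seen = [];
  conn.on('error', (err) => seen.push(err.code));
  conn.simulateSocketReset();
  return seen;
}

console.log(emitWithoutListener()); // prints ECONNRESET
console.log(emitWithListener());    // prints [ 'ECONNRESET' ]
```

In the real driver the emit happens inside a socket callback, long after any {{submit}} call has returned, so a try/catch or {{.catch()}} around {{submit}} never sees the throw and it surfaces as an uncaught exception. That is why a listener on the connection, or upgrading to a fixed version, is needed.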