[ https://issues.apache.org/jira/browse/TOREE-409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16220449#comment-16220449 ]
Nicholas Marion commented on TOREE-409:
---------------------------------------

This appears to occur when file-encoding is set to ISO8859-1 (a single-byte, ASCII-compatible encoding). Changing file-encoding to UTF-8 has fixed the problem.

> Signature Mismatch between jupyter_client and Apache Toree
> ----------------------------------------------------------
>
>                 Key: TOREE-409
>                 URL: https://issues.apache.org/jira/browse/TOREE-409
>             Project: TOREE
>          Issue Type: Bug
>          Components: Kernel
>         Environment: On an x86 system, I have a Jupyter notebook server
> running. On a z/OS system, I have a Jupyter Kernel Gateway running. The
> notebook server connects to the kernel gateway, which gives the server access
> to Apache Toree (what is currently in the master branch from about two weeks
> ago). Apache Toree then interacts with the Spark cluster (2.0.2). The Scala
> version is 2.11.8. The Python version I am using is Python 3.6 (although I
> also tried this with Python 2.7 and ran into the same problem).
>            Reporter: Yunli Tang
>
> I am encountering a signature mismatch that is caused when using Unicode
> characters greater than 128.
>
> The environment I am currently in:
> On an x86 system, I have a Jupyter notebook server running. On a z/OS
> system, I have a Jupyter Kernel Gateway running. The notebook server connects
> to the kernel gateway, which gives the server access to Apache Toree (what is
> currently in the master branch from about two weeks ago). Apache Toree then
> interacts with the Spark cluster (2.0.2). The Scala version is 2.11.8.
> The Python version I am using is Python 3.6 (although I also tried this with
> Python 2.7 and ran into the same problem).
>
> This error occurs when I use either Python 2.7 or Python 3.6.
>
> In a Jupyter notebook I try running the following:
>
>     print("蒄")
>
> The notebook hangs and the following is produced in the kernel gateway logs:
>
> [E 170413 15:16:27 web:1548] Uncaught exception GET /api/kernels/e6f0c109-d3b2-4254-85a6-1eea95f7175b/channels (9.12.41.240)
> HTTPServerRequest(protocol='http', host='9.12.41.72:9099', method='GET', uri='/api/kernels/e6f0c109-d3b2-4254-85a6-1eea95f7175b/channels', version='HTTP/1.1', remote_ip='9.12.41.240', headers={'Upgrade': 'websocket', 'Accept-Encoding': 'gzip', 'Sec-Websocket-Version': '13', 'Connection': 'Upgrade', 'Sec-Websocket-Key': 'evzOnn7Up3BD/6Grb87mCQ==', 'Host': '9.12.41.72:9099', 'Authorization': 'token commander'})
> Traceback (most recent call last):
>   File "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/tornado/web.py", line 1425, in _stack_context_handle_exception
>     raise_exc_info((type, value, traceback))
>   File "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/tornado/stack_context.py", line 314, in wrapped
>     ret = fn(*args, **kwargs)
>   File "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 191, in <lambda>
>     self.on_recv(lambda msg: callback(self, msg), copy=copy)
>   File "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/jupyter_kernel_gateway-1.2.1-py2.7.egg/kernel_gateway/services/kernels/handlers.py", line 172, in _on_zmq_reply
>     super(ZMQChannelsHandler, self)._on_zmq_reply(stream, msg_list)
>   File "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/notebook/services/kernels/handlers.py", line 296, in _on_zmq_reply
>     msg = self.session.deserialize(fed_msg_list)
>   File "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/jupyter_client/session.py", line 859, in deserialize
>     raise ValueError("Invalid Signature: %r" % signature)
> ValueError: Invalid Signature: '4324e46ac9c58336e781be2bff631fb7e3019f1ce58f5795544a8d54cdfa0f0a'
>
> Upon further investigation, I wanted to see the messages that were being
> received by the zmq socket and what was being sent to the zmq socket. Here is
> what I found when running the cell with print("蒄"):
>
> The CONTENT STRING that is received by the zmq socket, in hexadecimal:
> 7b22636f6465223a227072696e74285c223f5c2229222c22657865637574696f6e5f636f756e74223a317d
> which is: {"code":"print(\"?\")","execution_count":1}
> Notice the "3f": 3f is "?" in UTF-8 encoding.
>
> The CONTENT STRING that is being SIGNED by Apache Toree over the zmq socket,
> in hexadecimal:
> 7b22636f6465223a227072696e74285c22e892845c2229222c22657865637574696f6e5f636f756e74223a317d
>
> Now, if you compare this hexadecimal string with the one that is being
> received by the zmq socket, they are different. The difference is the "3f" in
> what is being received versus the "e89284" in what is being signed. Note that
> e89284 is 蒄 in UTF-8 encoding.
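The comparison of the two hex dumps can be reproduced with a short Python sketch. It decodes both content strings from the report and checks that re-encoding the signed text with ISO-8859-1, substituting "?" for characters that encoding cannot represent, yields exactly the received bytes. The variable names are illustrative, and the use of `errors="replace"` to mimic what a non-UTF-8 default encoding does on the JVM side is an assumption, not something confirmed from the Toree code.

```python
# Hex dumps copied verbatim from the report.
RECEIVED_HEX = (
    "7b22636f6465223a227072696e74285c223f5c2229222c22"
    "657865637574696f6e5f636f756e74223a317d"
)
SIGNED_HEX = (
    "7b22636f6465223a227072696e74285c22e892845c2229222c22"
    "657865637574696f6e5f636f756e74223a317d"
)

received = bytes.fromhex(RECEIVED_HEX)
signed = bytes.fromhex(SIGNED_HEX)

# The signed bytes are valid UTF-8; e8 92 84 is the single code point U+8484.
signed_text = signed.decode("utf-8")
non_ascii = [ch for ch in signed_text if ord(ch) > 127]

# Assumption: the sending side encodes with ISO-8859-1 and replaces any
# character it cannot represent with "?" (0x3f). errors="replace" mimics that.
lossy = signed_text.encode("iso-8859-1", errors="replace")

print("non-ASCII code points:", [hex(ord(ch)) for ch in non_ascii])
print("lossy re-encode equals received bytes:", lossy == received)
```

If the last check holds, the "3f" is consistent with the character being lost during a charset conversion after signing, rather than with a different message being sent.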
>
> The STRING that is being SENT by Apache Toree over the zmq socket, in
> hexadecimal:
> 536f6d65285b2033666438623739382d303663312d343634322d616130392d6635636232633664326636662c203c4944537c4d53473e2c20343763333766326264366161316663353335636230343466313331663838313861343462343164383066306463643332316239343934386239333561303135642c207b226d73675f6964223a2266633934623732612d646466642d343263372d386230662d643034626561386533616530222c22757365726e616d65223a2254414e4759222c2273657373696f6e223a2231663061366439622d656431642d343132312d623566342d386330366163613939323261222c226d73675f74797065223a22657865637574655f696e707574222c2276657273696f6e223a22352e30227d2c207b226d73675f6964223a224342463035463446323945443430323538423835323336373644383937393130222c22757365726e616d65223a22757365726e616d65222c2273657373696f6e223a224346443534443639324334303446463138333330324142424238433431323533222c226d73675f74797065223a22657865637574655f72657175657374222c2276657273696f6e223a22352e30227d2c207b2274696d657374616d70223a2231343934383736343036353534227d2c207b22636f6465223a227072696e74285c223f5c2229222c22657865637574696f6e5f636f756e74223a317d205d29
>
> In English that equates to:
>
> Some([ 3fd8b798-06c1-4642-aa09-f5cb2c6d2f6f,
>   <IDS|MSG>,
>   47c37f2bd6aa1fc535cb044f131f8818a44b41d80f0dcd321b94948b935a015d,
>   {"msg_id":"fc94b72a-ddfd-42c7-8b0f-d04bea8e3ae0","username":"TANGY","session":"1f0a6d9b-ed1d-4121-b5f4-8c06aca9922a","msg_type":"execute_input","version":"5.0"},
>   {"msg_id":"CBF05F4F29ED40258B8523676D897910","username":"username","session":"CFD54D692C404FF183302ABBB8C41253","msg_type":"execute_request","version":"5.0"},
>   {"timestamp":"1494876406554"},
>   {"code":"print(\"?\")","execution_count":1} ])
>
> The part that is interesting to me is the content string. I parsed out the
> content string of the hexadecimal message above:
> 7b22636f6465223a227072696e74285c223f5c2229222c22657865637574696f6e5f636f756e74223a317d
>
> This is where I'm guessing the signature mismatch occurs.
> The content string that Apache Toree is signing is different from the
> content string that it is sending. Notice that the content string that is
> being sent is exactly the same as the content string that is being received
> by the zmq socket (both have the invalid 3f).
>
> This is where I put my debug statements, in case it matters:
>
> communication/src/main/scala/org/apache/toree/communication/socket/ZeroMQSocketRunnable.scala:
>
>   /**
>    * Sends the next outbound message from the outbound message queue.
>    *
>    * @param socket The socket to use when sending the message
>    *
>    * @return True if a message was sent, otherwise false
>    */
>   protected def processNextOutboundMessage(socket: ZMQ.Socket): Boolean = {
>     val message = Option(outboundMessages.poll())
>
>     // Debug logging: dump the outbound message as hex and as text
>     if (message.isDefined) {
>       logger.warn(s"Message that is SENT IN HEX:" + String.format("%040x",
>         new BigInteger(1, s"${message}".getBytes(StandardCharsets.UTF_8))))
>       logger.warn(s"Message that is SENT:" + s"${message}")
>     }
>
>     message.foreach(_.send(socket))
>     message.nonEmpty
>   }
>
> And then also in:
>
> communication/src/main/scala/org/apache/toree/communication/security/SignatureProducerActor.scala
>
>   class SignatureProducerActor(
>     private val hmac: Hmac
>   ) extends Actor with LogLike with OrderedSupport {
>     override def receive: Receive = {
>       case message: KernelMessage => withProcessing {
>         // Debug logging: dump each part that goes into the signature
>         logger.warn(s"Message that is being signed (HEADER):" +
>           s"${message.header}")
>         logger.warn(s"Message that is being signed (PARENT HEADER):" +
>           s"${message.parentHeader}")
>         logger.warn(s"Message that is being signed (METADATA):" +
>           s"${message.metadata}")
>         logger.warn(s"Message that is being signed IN HEX (CONTENT STRING):" +
>           String.format("%040x", new BigInteger(1,
>             s"${message.contentString}".getBytes(StandardCharsets.UTF_8))))
>         logger.warn(s"Message that is being signed (CONTENT STRING):" +
>           s"${message.contentString}")
>
>         val signature = hmac(
>           Json.stringify(Json.toJson(message.header)),
>           Json.stringify(Json.toJson(message.parentHeader)),
>           Json.stringify(Json.toJson(message.metadata)),
>           message.contentString
>         )
>         sender ! signature
>       }
>     }
>   }
>
> Something else I noticed: when I ran Jupyter notebook/Toree from source
> (make dev), the hexadecimal representation of the content string gets sent
> over, as opposed to the string itself, i.e.:
>
> Some([ 9FF3E30DB4AD4ED2B0C6795A5AF321A6,
>   <IDS|MSG>,
>   fd19b14775db834185f1fafd1d22061a903898db98b25582700de5230a85c9c4,
>   {"msg_id":"4c18b424-4d18-4f3e-bddb-b035c638ab7e","username":"root","session":"2142f120-8287-4723-9d0e-05d85260fb0b","msg_type":"execute_input","version":"5.0"},
>   {"msg_id":"D872D7431AF941C4865B9D255CB01A5A","username":"username","session":"9FF3E30DB4AD4ED2B0C6795A5AF321A6","msg_type":"execute_request","version":"5.0"},
>   {"timestamp":"1494877223980"},
>   7B22636F6465223A227072696E74285C22E892845C2229222C22657865637574696F6E5F636F756E74223A317D ])
>
> In my case I see this:
>
> Some([ 3fd8b798-06c1-4642-aa09-f5cb2c6d2f6f,
>   <IDS|MSG>,
>   47c37f2bd6aa1fc535cb044f131f8818a44b41d80f0dcd321b94948b935a015d,
>   {"msg_id":"fc94b72a-ddfd-42c7-8b0f-d04bea8e3ae0","username":"TANGY","session":"1f0a6d9b-ed1d-4121-b5f4-8c06aca9922a","msg_type":"execute_input","version":"5.0"},
>   {"msg_id":"CBF05F4F29ED40258B8523676D897910","username":"username","session":"CFD54D692C404FF183302ABBB8C41253","msg_type":"execute_request","version":"5.0"},
>   {"timestamp":"1494876406554"},
>   {"code":"print(\"?\")","execution_count":1} ])
>
> Thanks for the help

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
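
To connect the encoding diagnosis in the comment with the `Invalid Signature` error in the traceback, here is a minimal Python sketch of an HMAC-SHA256 check over the four message parts (header, parent header, metadata, content). The key and the abbreviated header values are hypothetical placeholders, and the `sign` helper is written for illustration only; it is not the code of jupyter_client or Toree.

```python
import hashlib
import hmac
from typing import Iterable


def sign(key: bytes, frames: Iterable[bytes]) -> str:
    """Return the hex HMAC-SHA256 digest over the frames, in order."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for frame in frames:
        mac.update(frame)
    return mac.hexdigest()


# Hypothetical values, for illustration only.
key = b"hypothetical-connection-key"
header = b'{"msg_type":"execute_input","version":"5.0"}'
parent_header = b'{"msg_type":"execute_request","version":"5.0"}'
metadata = b'{"timestamp":"1494876406554"}'

# Content string with the non-ASCII character U+8484 from the report.
content_text = '{"code":"print(\\"\u8484\\")","execution_count":1}'

# The sender signs the UTF-8 bytes ...
signed_content = content_text.encode("utf-8")
# ... but (assumption) the bytes on the wire went through ISO-8859-1,
# which turns the unrepresentable character into "?".
sent_content = content_text.encode("iso-8859-1", errors="replace")

sender_signature = sign(key, [header, parent_header, metadata, signed_content])
receiver_signature = sign(key, [header, parent_header, metadata, sent_content])

# The receiver recomputes the HMAC over what it received and compares.
signatures_match = hmac.compare_digest(sender_signature, receiver_signature)
print("signatures match:", signatures_match)
```

Under these assumptions the two digests differ, which is the condition under which a receiver would reject the message. Encoding the content once as UTF-8 and using those same bytes for both signing and sending removes the discrepancy.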