[ 
https://issues.apache.org/jira/browse/TOREE-409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16220449#comment-16220449
 ] 

Nicholas Marion commented on TOREE-409:
---------------------------------------

This appears to occur when file-encoding is set to ISO8859-1, (ASCII). Changing 
file-encoding to UTF-8 has fixed the problem.

> Signature Mismatch between jupyter_client and Apache Toree
> ----------------------------------------------------------
>
>                 Key: TOREE-409
>                 URL: https://issues.apache.org/jira/browse/TOREE-409
>             Project: TOREE
>          Issue Type: Bug
>          Components: Kernel
>         Environment: On an x/86 system, I have a jupyter notebook server 
> running. On a z/OS system, I have a Jupyter Kernel Gateway running. The 
> notebook server connects to the kernel gateway which gives the server access 
> to Apache Toree (what is currently in master branch from about two weeks ago) 
> . Apache Toree then interacts with the Spark cluster (2.0.2) The scala 
> version is 2.11.8. The python version I am using is Python 3.6 (although I 
> also tried this with python 2.7 and I also ran into the same problem)
>            Reporter: Yunli Tang
>
> I am encountering a signature mismatch that is caused when using unicode 
> characters > 128 
> The environment I am currently in:
> On an x/86 system, I have a jupyter notebook server running. On a z/OS 
> system, I have a Jupyter Kernel Gateway running. The notebook server connects 
> to the kernel gateway which gives the server access to Apache Toree (what is 
> currently in master branch from about two weeks ago) . Apache Toree then 
> interacts with the Spark cluster (2.0.2) The scala version is 2.11.8. The 
> python version I am using is Python 3.6 (although I also tried this with 
> python 2.7 and I also ran into the same problem)
> This error occurs when I use either python 2.7 and python 3.6
> In a jupyter notebook I try running the following:
> print("่’„")
> The notebook hangs up and the following is produced from the kernel gateway 
> logs:
> [E 170413 15:16:27 web:1548] Uncaught exception GET 
> /api/kernels/e6f0c109-d3b2-4254-85a6-1eea95f7175b/channels (9.12.41.240)
> HTTPServerRequest(protocol='http', host='9.12.41.72:9099', method='GET', 
> uri='/api/kernels/e6f0c109-d3b2-4254-85a6-1eea95f7175b/channels', 
> version='HTTP/1.1', remote_ip='9.12.41.240', headers={'Upgrade': 'websocket', 
> 'Accept-Encoding': 'gzip', 'Sec-Websocket-Version': '13', 'Connection': 
> 'Upgrade', 'Sec-Websocket-Key': 'evzOnn7Up3BD/6Grb87mCQ==', 'Host': 
> '9.12.41.72:9099', 'Authorization': 'token commander'})
> Traceback (most recent call last):
> File 
> "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/tornado/web.py",
>  line 1425, in _stack_context_handle_exception
> raise_exc_info((type, value, traceback))
> File 
> "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/tornado/stack_context.py",
>  line 314, in wrapped
> ret = fn(*args, **kwargs)
> File 
> "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py",
>  line 191, inย 
> self.on_recv(lambda msg: callback(self, msg), copy=copy)
> File 
> "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/jupyter_kernel_gateway-1.2.1-py2.7.egg/kernel_gateway/services/kernels/handlers.py",
>  line 172, in _on_zmq_reply
> super(ZMQChannelsHandler, self)._on_zmq_reply(stream, msg_list)
> File 
> "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/notebook/services/kernels/handlers.py",
>  line 296, in _on_zmq_reply
> msg = self.session.deserialize(fed_msg_list)
> File 
> "/Voyager/Hamlet/python/python-2017-04-12-py27/python27/lib/python2.7/site-packages/jupyter_client/session.py",
>  line 859, in deserialize
> raise ValueError("Invalid Signature: %r" % signature)
> ValueError: Invalid Signature: 
> '4324e46ac9c58336e781be2bff631fb7e3019f1ce58f5795544a8d54cdfa0f0a'
> Upon further investigation, I wanted to see the messages that were being 
> received by the zmq socket and what was being sent to the zmq socket. Here is 
> what I found when running the cell with print("่’„"):
> The CONTENT STRING that is received by zmq socket in hexadecimal: 
> 7b22636f6465223a227072696e74285c223f5c2229222c22657865637574696f6e5f636f756e74223a317d
> which is : {"code":"print(\\"?\\")","execution_count":1}
> Notice the "3f" 3f is "?" in utf-8 encoding.
> The CONTENT STRING that is being SIGNED by Apache Toree over the zmq socket 
> in hexadecimal:
> 7b22636f6465223a227072696e74285c22e892845c2229222c22657865637574696f6e5f636f756e74223a317d
> Now, if you compare both this hexadecimal string and the one that is being 
> received by zmq socket it is different! The difference is the "3f" in what is 
> being received and the "e89284" in what is being signed. Note that e89284 
> equates to ่’„ in utf-8 encoding. 
> The STRING that is being SENT by Apache Toree over the zmq socket in 
> hexadecimal:
> 536f6d65285b2033666438623739382d303663312d343634322d616130392d6635636232633664326636662c203c4944537c4d53473e2c20343763333766326264366161316663353335636230343466313331663838313861343462343164383066306463643332316239343934386239333561303135642c207b226d73675f6964223a2266633934623732612d646466642d343263372d386230662d643034626561386533616530222c22757365726e616d65223a2254414e4759222c2273657373696f6e223a2231663061366439622d656431642d343132312d623566342d386330366163613939323261222c226d73675f74797065223a22657865637574655f696e707574222c2276657273696f6e223a22352e30227d2c207b226d73675f6964223a224342463035463446323945443430323538423835323336373644383937393130222c22757365726e616d65223a22757365726e616d65222c2273657373696f6e223a224346443534443639324334303446463138333330324142424238433431323533222c226d73675f74797065223a22657865637574655f72657175657374222c2276657273696f6e223a22352e30227d2c207b2274696d657374616d70223a2231343934383736343036353534227d2c207b22636f6465223a227072696e74285c223f5c2229222c22657865637574696f6e5f636f756e74223a317d205d29
> In english that equates to: 
> Some([ 
> 3fd8b798-06c1-4642-aa09-f5cb2c6d2f6f,<IDS|MSG>,47c37f2bd6aa1fc535cb044f131f8818a44b41d80f0dcd321b94948b935a015d,
>  
> {"msg_id":"fc94b72a-ddfd-42c7-8b0f-d04bea8e3ae0","username":"TANGY","session":"1f0a6d9b-ed1d-4121-b5f4-8c06aca9922a","msg_type":"execute_input","version":"5.0"},{"msg_id":"CBF05F4F29ED40258B8523676D897910","username":"username","session":"CFD54D692C404FF183302ABBB8C41253","msg_type":"execute_request","version":"5.0"},
>  {"timestamp":"1494876406554"}, {"code":"print(\"?\")","execution_count":1} ])
> The part that is interesting to me is the content string, I parsed out the 
> content string of the hexadecimal message above:
> 7b22636f6465223a227072696e74285c223f5c2229222c22657865637574696f6e5f636f756e74223a317d
> This is where I'm guessing the invalid mismatch occurs. The content string 
> that apache toree is signing off on is different from the content string that 
> is is sending over. Notice that the content string that is being sent over is 
> exactly the same as the content string that is being received by zmq socket 
> (both have the invalid 3f)
> This is where I put my debug statements in case it matters:
> communication/src/main/scala/org/apache/toree/communication/socket/ZeroMQSocketRunnable.scala:
>   /**
>    * Sends the next outbound message from the outbound message queue.
>    *
>    * @param socket The socket to use when sending the message
>    *
>    * @return True if a message was sent, otherwise false
>    */
>   protected def processNextOutboundMessage(socket: ZMQ.Socket): Boolean = {
>     val message = Option(outboundMessages.poll())
>     if (message != None){
>         logger.warn(s"Message that is SENT IN HEX:" + String.format("%040x", 
> new BigInteger(1, s"${message}".getBytes(StandardCharsets.UTF_8))))
>         logger.warn(s"Message that is SENT:" + s"${message}")
>     }
>     message.foreach(_.send(socket))
>     message.nonEmpty
>   }
> And then also in:
> communication/src/main/scala/org/apache/toree/communication/security/SignatureProducerActor.scala
> class SignatureProducerActor(
>   private val hmac: Hmac
> ) extends Actor with LogLike with OrderedSupport {
>   override def receive: Receive = {
>     case message: KernelMessage => withProcessing {
>       logger.warn(s"Message that is being signed (HEADER):" + 
> s"${message.header}")
>       logger.warn(s"Message that is being signed (PARENT HEADER):" + 
> s"${message.parentHeader}")
>       logger.warn(s"Message that is being signed (METADATA):" +  
> s"${message.metadata}")
>       logger.warn(s"Message that is being signed IN HEX (CONTENT STRING):" + 
> String.format("%040x", new BigInteger(1, 
> s"${message.contentString}".getBytes(StandardCharsets.UTF_8))))
>       logger.warn(s"Message that is being signed (CONTENT STRING):" + 
> s"${message.contentString}")
>       val signature = hmac(
>         Json.stringify(Json.toJson(message.header)),
>         Json.stringify(Json.toJson(message.parentHeader)),
>         Json.stringify(Json.toJson(message.metadata)),
>         message.contentString
>       )
>       sender ! signature
>     }
>   }
> Also something else I noticed was when I ran jupyter notebook/toree from 
> source (make dev) I noticed that in the message, the hexadecimal 
> representation of the content string gets sent over as opposed to the string 
> itself i.e.
> Some([ 9FF3E30DB4AD4ED2B0C6795A5AF321A6, 
> <IDS|MSG>,fd19b14775db834185f1fafd1d22061a903898db98b25582700de5230a85c9c4,{"msg_id":"4c18b424-4d18-4f3e-bddb-b035c638ab7e","username":"root","session":"2142f120-8287-4723-9d0e-05d85260fb0b","msg_type":"execute_input","version":"5.0"},
>  
> {"msg_id":"D872D7431AF941C4865B9D255CB01A5A","username":"username","session":"9FF3E30DB4AD4ED2B0C6795A5AF321A6","msg_type":"execute_request","version":"5.0"},
>  {"timestamp":"1494877223980"}, 
> 7B22636F6465223A227072696E74285C22E892845C2229222C22657865637574696F6E5F636F756E74223A317D
>  ])
> In my case I see this:
> Some([ 
> 3fd8b798-06c1-4642-aa09-f5cb2c6d2f6f,<IDS|MSG>,47c37f2bd6aa1fc535cb044f131f8818a44b41d80f0dcd321b94948b935a015d,
>  
> {"msg_id":"fc94b72a-ddfd-42c7-8b0f-d04bea8e3ae0","username":"TANGY","session":"1f0a6d9b-ed1d-4121-b5f4-8c06aca9922a","msg_type":"execute_input","version":"5.0"},{"msg_id":"CBF05F4F29ED40258B8523676D897910","username":"username","session":"CFD54D692C404FF183302ABBB8C41253","msg_type":"execute_request","version":"5.0"},
>  {"timestamp":"1494876406554"}, {"code":"print(\"?\")","execution_count":1} ])
> Thanks for the help



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to