Zameer Manji created AURORA-1743:
------------------------------------
Summary: Support Thrift Binary Protocol
Key: AURORA-1743
URL: https://issues.apache.org/jira/browse/AURORA-1743
Project: Aurora
Issue Type: Task
Components: Scheduler
Reporter: Zameer Manji
Currently the scheduler serves its Thrift API over the TJSON protocol. This
has some benefits: tools can consume the data without understanding the IDL,
the wire format is easy to read, and it can power our AJAX UI.
However, the performance of the TJSON protocol in Python is abysmal. On a small
cluster (~2k tasks) which uses Docker and stores some metadata per task, an
unscoped {{getTasksWithoutConfigs}} returns about 22MB of JSON. In Python,
using the TJSON protocol, it takes more than 20 seconds to deserialize this
data. The equivalent data encoded in the binary protocol takes about 3 seconds
to deserialize. I suspect that similar performance issues occur in Thrift
implementations for other programming languages.
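For anyone who wants to reproduce this kind of comparison, a minimal timing
sketch might look like the following. It uses only the Python standard library.
The helper {{time_deserialize}} and the synthetic payload are hypothetical, and
{{json.loads}} is only a stand-in: to compare protocols you would pass in a
callable that runs the real Thrift deserializer for each one.

```python
import json
import time


def time_deserialize(deserialize, payload, repeats=3):
    """Run deserialize(payload) `repeats` times.

    Returns (best_seconds, result), where best_seconds is the fastest run.
    `deserialize` is any callable; it is an assumption of this sketch, not
    part of the Aurora or Thrift APIs.
    """
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = deserialize(payload)
        best = min(best, time.perf_counter() - start)
    return best, result


# Synthetic stand-in data; a real test would use a captured API response.
tasks = [{"id": i, "status": "RUNNING"} for i in range(1000)]
payload = json.dumps(tasks).encode("utf-8")

seconds, decoded = time_deserialize(json.loads, payload)
print("decoded %d tasks, best run %.6f s" % (len(decoded), seconds))
```

Taking the fastest of several runs reduces noise from unrelated system activity.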
I think the scheduler should support both the TJSON protocol and the binary
protocol over HTTP.
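One way this could work is for the scheduler to pick the protocol from the
request's Content-Type header. The sketch below is only an illustration in
Python, not scheduler code: the function {{choose_protocol}} is hypothetical,
and the media type strings are assumptions rather than values this ticket
specifies.

```python
# Media types are assumptions for illustration, not values from this ticket.
JSON_TYPES = {"application/json", "application/vnd.apache.thrift.json"}
BINARY_TYPES = {"application/x-thrift", "application/vnd.apache.thrift.binary"}


def choose_protocol(content_type, default="json"):
    """Map a Content-Type header value to "json" or "binary".

    Falls back to `default` when the header is missing or unrecognized.
    """
    if not content_type:
        return default
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in BINARY_TYPES:
        return "binary"
    if media_type in JSON_TYPES:
        return "json"
    return default


print(choose_protocol("application/x-thrift"))
print(choose_protocol("application/json; charset=utf-8"))
print(choose_protocol(None))
```

Defaulting to JSON keeps existing clients, including the AJAX UI, working
unchanged while new clients opt in to the binary protocol.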