Revised, now with Ihsan's schedule and abstract.

With NSDI being local, we have a number of talks queued up this week, and
most of the visitors will have time after/before to meet with us.  Please
sign up!   Note that several of the talks have non-standard start times.
Abstracts collected below.

Wednesday, CSE 305, 1:30pm
Vijay Chidambaram, UT Austin
PebblesDB: Building Key-Value Stores Using Fragmented Log-Structured Merge

Thursday, CSE 305, 11am
Mothy Roscoe, ETH Zurich
Enzian: Research Computer

Thursday, CSE 305, 12pm
Ihsan Qazi, LUMS
Understanding Internet Access in the Developing World

Friday, CSE 305, 12:00pm
Keith Winstein, Stanford
Tiny Functions for Codecs, Compilation, and (maybe) Everything

PebblesDB: Building Key-Value Stores Using Fragmented Log-Structured Merge

Key-value stores such as LevelDB and RocksDB have become a fundamental part
of the systems infrastructure. However, these stores suffer from high write
amplification: for example, 45 GB of data written to RocksDB results in 1.8
TB (28x) written to storage. In this talk, I show that the write
amplification problem is fundamental to the Log-Structured Merge Tree data
structure that underlies these stores. I present a novel data structure
that is inspired by Skip Lists, termed Fragmented Log-Structured Merge
Trees (FLSM). FLSM introduces the notion of guards to organize logs, and
avoids rewriting data in the same level. I will describe PebblesDB, a new
key-value store that we built by modifying HyperLevelDB to use the FLSM
data structure. I will briefly present our evaluation which shows that
PebblesDB increases write throughput by 6.7x (compared to RocksDB and
LevelDB) while simultaneously reducing write amplification by 2.4-3x.
PebblesDB is open-source, and I hope to convince some of you to incorporate
it into new systems you build :)

Enzian: Research Computer

Academic research in rack-scale and datacenter computing
today is hamstrung by lack of hardware.  Cloud providers and hardware
vendors build custom accelerators, interconnects, and networks for
commercially important workloads, but university researchers are stuck
with commodity, off-the-shelf parts.

Enzian is a research computer developed at ETH Zurich in collaboration
with Cavium and Xilinx which addresses this problem.  An Enzian board
consists of a server-class ARMv8 SoC tightly coupled and coherent with
a large FPGA (eliminating PCIe), with about 0.5 TB DDR4 and about 600
Gb/s of network I/O either to the CPU (over Ethernet) or directly to
the FPGA (potentially over custom protocols).  Enzian runs both
Barrelfish and Linux operating systems.

Many Enzian boards can be connected in a rack-scale machine (either
with or without a discrete switch), and the design is intended to
allow many different research use-cases: zero-overhead run-time
verification of software invariants, novel interconnect protocols for
remote memory access, hardware enforcement of access control in a
large machine, high-performance streaming analytics using a
combination of software and configurable hardware, and much more.
By providing a powerful and flexible platform for computer systems
research, Enzian aims to enable more relevant and far-reaching work on
future compute platforms.

Understanding Internet Access in the Developing World

In this talk, I will present my recent research on Internet access in
developing countries. In the first half of my talk, I will present a study
on the characteristics of mobile devices in developing regions. Using a
dataset of 0.5 million subscribers from one of the largest cellular
operators in Pakistan, I will present an analysis of cell phones being used
based on different features (e.g., CPU, memory, and cellular interface).
Our analysis reveals potential device-level bottlenecks for Internet
access, which can inform infrastructure design for improving mobile web
performance. (This work appeared in ACM IMC 2016.) Another accessibility
challenge in developing countries is the rise in Internet censorship
events, which can have a substantial impact on various stakeholders in the
Internet ecosystem (e.g., users, content providers, ISPs, and advertisers).
In the second half of my talk, I will discuss how Internet censorship poses
an economic threat to online advertising, which plays an essential role in
enabling the free Web by allowing publishers to monetize their services.
Then I will describe a system we designed that enables relevant ads while
retaining the effectiveness of censorship resistance tools (e.g., Tor).
(This work appeared in ACM HotNets 2017.)

Tiny Functions for Codecs, Compilation, and (maybe) Soon Everything

Networks, applications, and media codecs frequently treat one another as
strangers. By expressing large systems as compositions of small, pure
functions, we've found it's possible to achieve tighter couplings between
these components, improving performance without giving up modularity or the
ability to debug. I'll discuss our experience with systems that demonstrate
this basic idea: ExCamera (NSDI 2017) parallelizes video encoding into
thousands of tiny tasks, each handling a fraction of a second of video,
much shorter than the interval between key frames, and executing in
parallel on AWS Lambda. This was the first system to demonstrate
"burst-parallel" thousands-way computation on functions-as-a-service
infrastructure. Salsify (NSDI 2018) is a low-latency network video system
that uses a purely functional video codec to explore execution paths of the
encoder without committing to them, allowing it to closely match the
capacity estimates from a video-aware transport protocol. This architecture
outperforms more loosely-coupled applications -- Skype, FaceTime, Hangouts,
WebRTC -- in delay and visual quality, and suggests that while improvements
in video codecs may have reached the point of diminishing returns, video
systems still have low-hanging fruit. Lepton (NSDI 2017) uses a purely
functional JPEG/VP8 transcoder to compress images in parallel across a
distributed network filesystem with arbitrary block boundaries. This
free-software system is in production at Dropbox and has compressed, by
23%, more than 200 petabytes of user JPEGs.

Based on our experience, we propose an intermediate representation for
interactive lambda computing, called cloud "thunks" -- stateless closures
that describe their data-dependencies by content-hash, separating the
specification of an algorithm from its schedule and execution. We have
created a tool that extracts this IR from off-the-shelf software build
systems, letting the user treat a FaaS service like a 5,000-core build farm
with global memoization of results. Expressing systems and protocols as
compositions of small, pure functions has the potential to lead to a wave
of "general-purpose" lambda computing, permitting us to transform everyday
time-consuming operations into large numbers of functions executing with
massive parallelism for short durations in the cloud.
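
For readers who have not met the "thunk" idea before, here is a minimal
Python sketch of it. It is not taken from the talk or from the speaker's
tool: the names store, memo, put, make_thunk, force, and FUNCTIONS are all
hypothetical, and an in-memory dict stands in for whatever shared storage a
real system would use. It only illustrates the two properties named in the
abstract: inputs are referenced by content hash, and results are memoized.

```python
import hashlib
import json

# Hypothetical content-addressed store: SHA-256 hex digest -> bytes.
store = {}
# Hypothetical memo table: thunk hash -> hash of that thunk's result.
memo = {}
# Counts how many times a function body actually ran (for illustration).
stats = {"executions": 0}

# Pure functions a thunk may name; each maps a list of blobs to one blob.
FUNCTIONS = {
    "concat": lambda blobs: b"".join(blobs),
    "upper": lambda blobs: blobs[0].upper(),
}


def put(data: bytes) -> str:
    """Store a blob and return its content hash."""
    digest = hashlib.sha256(data).hexdigest()
    store[digest] = data
    return digest


def make_thunk(func_name: str, input_hashes: list) -> str:
    """Describe a computation by function name and input content hashes.

    The description is itself stored by content hash, so two identical
    descriptions always yield the same thunk hash.
    """
    desc = json.dumps({"func": func_name, "inputs": input_hashes},
                      sort_keys=True)
    return put(desc.encode())


def force(thunk_hash: str) -> str:
    """Execute a thunk, or reuse a memoized result; return the result hash."""
    if thunk_hash in memo:
        return memo[thunk_hash]
    desc = json.loads(store[thunk_hash])
    blobs = [store[h] for h in desc["inputs"]]
    stats["executions"] += 1
    result_hash = put(FUNCTIONS[desc["func"]](blobs))
    memo[thunk_hash] = result_hash
    return result_hash


# Usage: build a two-step pipeline, then request the same work twice.
a = put(b"hello ")
b = put(b"world")
joined = force(make_thunk("concat", [a, b]))
shouted = force(make_thunk("upper", [joined]))
again = force(make_thunk("upper", [joined]))  # memo hit, nothing re-runs
```

Because a thunk names only a function and the hashes of its inputs, where
and when it runs is left to the scheduler, and any worker that has already
computed it can supply the answer.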