I think the way I phrased things earlier may be leading to some confusion
here. When I said "don't bring down my application", I was referring to the
application not meeting its end-to-end SLA, not to the app server crashing.

The groups I've talked to already isolate their front-end systems from
their back-end systems using solutions such as message queues and
key-value stores. But some applications require a complex automated
decision based on information that is not available until the moment of
the decision. Good examples are credit decisions, decisions about whether
to blacklist a hostile IP address, and product recommendations based on
the user's location and the link they just clicked. In all these cases,
an action needs to happen in the real world, that action has a deadline,
and a model must be scored in time to meet that deadline.

In a typical enterprise IT environment, the analytics tier is a much more
convenient place to run compute- and memory-intensive scoring tasks. The
hardware, software, and toolchain are tuned for the workload, and the data
science department has much more administrative control. So the question
naturally comes up: "Can I score my [credit/security/recommender/...]
models on the same infrastructure that I use to build them?"


On Thu, Oct 13, 2016 at 9:59 AM, Holden Karau <hol...@pigscanfly.ca> wrote:

> This is a thing I often have people ask me about, and then I do my best
> to dissuade them from using Spark in the "hot path", which is something
> most people eventually accept. Fred might have more information for
> people for whom this is a hard requirement, though.

> On Thursday, October 13, 2016, Cody Koeninger <c...@koeninger.org> wrote:
>> I've always been confused as to why it would ever be a good idea to
>> put any streaming query system on the critical path for synchronous
>> <100 msec requests. It seems to make a lot more sense to have a
>> streaming system do async updates of a store that has better latency
>> and quality-of-service characteristics for multiple users. Then your
>> only latency concerns are event to update, not request to response.
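
The pattern described just above can be sketched like this; a plain dict
stands in for the low-latency store, and all names are illustrative:

```python
# Sketch of the async-update pattern: a streaming job refreshes a
# low-latency store in the background, and the synchronous request path
# is reduced to a fast lookup. The dict is a stand-in for a real
# key-value store; all names are illustrative.
import threading

store = {}
store_lock = threading.Lock()

def streaming_update(event):
    # Heavyweight work (feature extraction, model scoring) happens here,
    # off the request path, so only event-to-update latency matters.
    result = event["value"] * 2  # placeholder for real scoring
    with store_lock:
        store[event["key"]] = result

def handle_request(key, default=None):
    # Request-to-response latency is just the cost of one store lookup.
    with store_lock:
        return store.get(key, default)
```

The request handler never waits on the streaming system; if an update
has not arrived yet, it serves a default rather than blocking.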
>> On Thu, Oct 13, 2016 at 10:39 AM, Fred Reiss <freiss....@gmail.com>
>> wrote:
>> > On Tue, Oct 11, 2016 at 11:02 AM, Shivaram Venkataraman
>> > <shiva...@eecs.berkeley.edu> wrote:
>> >>
>> >> >
>> >> Could you expand a little bit more on stability? Is it just bursty
>> >> workloads in terms of peak vs. average throughput? Also, what level of
>> >> latencies do you find users care about? Is it on the order of 2-3
>> >> seconds vs. 1 second vs. 100s of milliseconds?
>> >> >
>> >
>> >
>> > Regarding stability, I've seen two levels of concrete requirements.
>> >
>> > The first is "don't bring down my Spark cluster". That is to say,
>> > regardless of the input data rate, Spark shouldn't thrash or crash
>> > outright. Processing may lag behind the data arrival rate, but the
>> > cluster should stay up and remain fully functional.
>> >
>> > The second level is "don't bring down my application". A common use for
>> > streaming systems is to handle heavyweight computations that are part
>> > of a larger application, like a web application, a mobile app, or a
>> > plant control system. For example, an online application for car
>> > insurance might need to do some pretty involved machine learning to
>> > produce an accurate quote and suggest good upsells to the customer. If
>> > the heavyweight portion times out, the whole application times out, and
>> > you lose a customer.
>> >
>> > In terms of bursty vs. non-bursty, the "don't bring down my Spark
>> > cluster" case is more about handling bursts, while the "don't bring
>> > down my application" case is more about delivering acceptable
>> > end-to-end response times under typical load.
>> >
>> > Regarding latency: One group I talked to mentioned requirements in the
>> > 100-200 msec range, driven by the need to display a web page on a
>> > browser or mobile device. Another group in the Internet of Things space
>> > mentioned times ranging from 5 seconds to 30 seconds throughout the
>> > conversation. But most people I've talked to have been pretty vague
>> > about specific numbers.
>> >
>> > My impression is that these groups are not motivated by anxiety about
>> > meeting a particular latency target for a particular application.
>> > Rather, they want to make low latency the norm so that they can stop
>> > having to think about latency. Today, low latency is a special
>> > requirement of special applications. But that policy imposes a lot of
>> > hidden costs. IT architects have to spend time estimating the latency
>> > requirements of every application and lobbying for special treatment
>> > when those requirements are strict. Managers have to spend time
>> > engineering business processes around latency. Data scientists have to
>> > spend time packaging up models and negotiating how those models will be
>> > shipped over to the low-latency serving tier. And customers who are
>> > accustomed to Google and smartphones end up with an experience that is
>> > functional but unsatisfying.
>> >
>> > It's best to think of latency as a sliding scale. A given level of
>> > latency imposes a given level of cost enterprise-wide. Someone who is
>> > making a decision on middleware policy will balance this cost against
>> > other costs: How much does it cost to deploy the middleware? How much
>> > does it cost to train developers to use the system? The winner will be
>> > the system that minimizes the overall cost.
>> >
>> > Fred
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
