Richard Monson-Haefel wrote:
>
> Charlie Alfred wrote:
>
> >
> > Would you also say that this analysis holds true if the EJB Server supports
> > in-memory caching of read-mostly Entity Beans? Or, if it uses an EJB-side
> > hot-cache with multi-version concurrency control, like eXcelon Corp's Javlin
> > product?
>
> Nice collection of buzz words, but most of them don't mean much in the context of
> this discussion.
>
> Relational databases also maintain a very fast and robust cache, which direct JDBC
> access can leverage with each query. Front ending the database cache with an Object
> Cache may have advantages when the database is only accessed by one EJB server, but
> otherwise the performance gains are questionable. If your database is accessed by
> several EJB servers or by legacy systems and servers, then an object cache presents
> a HUGE problem, cache coherence. How do you make sure the object cache is in sync
> with changes made to the database by other systems? You don't. You can't.
>
I'm sorry that all you took away from my question was a list of buzzwords.
Let me try to make my point more clearly, because I still think that the
original question was relevant to the thread.
If I'm not misrepresenting your last post, you said that high-volume read-only
operations (i.e. product catalogs, etc.) should be performed with Servlets and
JDBC, and EJB should be reserved for the update-style transactions (i.e.
making purchases).
In general, I agree with your advice; however, there are several cases where
things aren't that simple. You allude to these cases near the end of your post,
but I'd like to take a few minutes to explore one. Consider an airline reservations/
customer service system. Some transactions (reservation purchase, passenger
check-in, etc.) fall into the high-volume update category, and would be a
natural fit for EJB.
Other transactions, such as flight availability search (e.g. city pair, departure
date/time interval, return date/time interval), are not as obvious. You seem
to suggest that an operation like this would be better performed as a Servlet
using JDBC, than through a Session Bean. I'm not certain I agree, and the
buzzwords that motivate my conclusion are "EJB hot-cache", "multi-version
concurrency control" and "impedance mismatch."
First, let's examine some of the important characteristics of the flight
availability search:
a. involves "read mostly" data. For travel requests 2 or more days out, you
probably can get away with using the published schedule (which changes
very infrequently). For travel requests within 2 days, you may need to use
the operational schedule (which changes more frequently, but still not
continually).
b. takes a while to process a transaction. Even if tables of connect-through
cities are used to reduce the size of the travel path graph, this query
still produces a large number of alternatives. And in the case where
travel uses the operational schedule, connections have to be validated by
examining the arrival and departure times and planned gate locations to
determine if there's sufficient time to make the connection.
c. wants some level of transaction isolation (to ensure that the data set is
consistent during the time period while the list of best itineraries is being
compiled). However, this isolation *cannot* block concurrent updates to
the records that are being examined.
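To make point (c) concrete, here is a minimal sketch of the idea: a search pins one immutable version of the schedule while updates publish a new version without waiting. It is a much simplified stand-in for multi-version concurrency control, not a description of how any product implements it. The names ScheduleSnapshot and VersionedSchedule are hypothetical, and the code assumes a current Java release.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical: one immutable version of the schedule data.
final class ScheduleSnapshot {
    final long version;
    final List<String> flightLegs; // immutable copy, safe to read without locks

    ScheduleSnapshot(long version, List<String> flightLegs) {
        this.version = version;
        this.flightLegs = List.copyOf(flightLegs);
    }
}

// Hypothetical: holds the current version and swaps it atomically.
final class VersionedSchedule {
    private final AtomicReference<ScheduleSnapshot> current =
            new AtomicReference<>(new ScheduleSnapshot(0, List.of()));

    // A reader pins whatever version is current and keeps using it,
    // so its view stays consistent for the whole search.
    ScheduleSnapshot snapshot() {
        return current.get();
    }

    // A writer publishes a new version; readers holding older
    // snapshots are neither blocked nor disturbed.
    void publish(List<String> newLegs) {
        current.updateAndGet(old -> new ScheduleSnapshot(old.version + 1, newLegs));
    }
}
```

A search would call snapshot() once at the start and work only from that object, so a publish() that arrives midway does not change what the search sees.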
This is a situation where App-Server side caching can make sense, especially
when the App-Servers are running on different network host(s) than the
RDBMS server. The benefit occurs in avoiding the network traffic between
the App-Server and the RDBMS. This benefit has to be weighed against the
cost of keeping the client-side cache(s) current. Effective use of triggers
in the RDBMS can keep the overhead to a minimum (especially since the update
frequency is relatively low). In summary, save hundreds or thousands of remote
SQL queries, pay for a few update triggers.
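The trade-off above can be sketched roughly as follows. This is an illustration only: ReadMostlyCache is a made-up class, the loader function stands in for a remote SQL query, and invalidate() stands in for whatever notification an RDBMS trigger would deliver to the App-Server (the delivery mechanism itself is not shown).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical App-Server side cache for read-mostly data.
final class ReadMostlyCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;            // assumption: wraps the remote query
    private final AtomicInteger remoteLoads = new AtomicInteger();

    ReadMostlyCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Served from memory when present; goes to the database only on a miss.
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            remoteLoads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Called when an update notification arrives; the next get() reloads.
    void invalidate(K key) {
        entries.remove(key);
    }

    // Number of times the loader (the remote query) actually ran.
    int remoteLoads() {
        return remoteLoads.get();
    }
}
```

With this shape, a thousand lookups of the same city pair cost one remote query, and each committed update costs one invalidation plus one reload.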
The downside is that there is some latency between the time when an update
is committed to the RDBMS and the time when the notification reaches the
App-Server(s). However, this type of operation can tolerate latencies of as
much as a few seconds (especially considering the fact that once the App
Server comes up with its list of 'n' best flights, it can pass them through a
second validation transaction that bypasses the cache and goes directly
against the RDBMS).
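A rough sketch of that second validation pass, under the assumption that the cache has already produced the candidate list: TwoPhaseSearch is a hypothetical name, and the stillValid predicate stands in for a direct check against the RDBMS that bypasses the cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical second phase: confirm cached candidates against the database.
final class TwoPhaseSearch {
    // cachedCandidates: the 'n' best itineraries computed from possibly stale data.
    // stillValid: assumption, a check that goes directly to the RDBMS.
    static <T> List<T> confirm(List<T> cachedCandidates, Predicate<T> stillValid) {
        List<T> confirmed = new ArrayList<>();
        for (T candidate : cachedCandidates) {
            if (stillValid.test(candidate)) {
                confirmed.add(candidate); // keeps the original ranking order
            }
        }
        return confirmed;
    }
}
```

The expensive search runs against the cache, and only the short list of survivors touches the database.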
Next, let's consider the impedance mismatch issue, since I think that goes
to the heart of Servlet vs Session Bean.
An RDBMS schema for an airline system might include tables for AIRPORT
and FLIGHT_LEG, and FL_AVAILABILITY among others. The top 10
airlines probably operate about 25,000 domestic flight legs per day between
airports in about 300-400 U.S. cities. Each flight leg probably has 15-25
different fare classes. With a 6 month reservation horizon, the database
contains about 500,000 flight legs, and about 10 million availability records.
Even with the help of a "through city" table, a typical flight availability
lookup will require 5-10 different complex SQL queries. For example,
a trip from Boston to L.A. might need to examine the direct flights, plus
connections in Chicago, St. Louis, Atlanta, Dallas, Cincinnati, Denver, etc.
Even with a heavily indexed FLIGHT_LEG table (origin, destination, depart
date/time), these are reasonably heavyweight queries.
By contrast, airline operations are a natural fit for a network-oriented model.
Rather than using one giant table of FLIGHT_LEG or FL_AVAILABILITY
records, each FlightLeg object belongs to the inbound collection of its
arrival airport and the outbound collection of its departure airport. And a
collection of 15-25 FlAvailability objects is hung off of the FlightLeg
object that owns them.
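Here is a minimal sketch of that network-oriented model. The class names follow the ones used above, but the fields, the ItinerarySearch helper, and the minimum-connect-time rule are my own assumptions for illustration. It finds direct and one-stop itineraries by walking object references rather than issuing one query per connecting city.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

final class Airport {
    final String code;
    final List<FlightLeg> inbound = new ArrayList<>();
    final List<FlightLeg> outbound = new ArrayList<>();
    Airport(String code) { this.code = code; }
}

final class FlAvailability {
    final String fareClass;
    final int seatsLeft;
    FlAvailability(String fareClass, int seatsLeft) {
        this.fareClass = fareClass;
        this.seatsLeft = seatsLeft;
    }
}

final class FlightLeg {
    final Airport origin;
    final Airport destination;
    final LocalDateTime departs;
    final LocalDateTime arrives;
    final List<FlAvailability> availability = new ArrayList<>(); // 15-25 per leg

    FlightLeg(Airport origin, Airport destination,
              LocalDateTime departs, LocalDateTime arrives) {
        this.origin = origin;
        this.destination = destination;
        this.departs = departs;
        this.arrives = arrives;
        // Register with both airports (fine for a sketch; a real model
        // would avoid publishing 'this' from a constructor).
        origin.outbound.add(this);
        destination.inbound.add(this);
    }
}

final class ItinerarySearch {
    // Returns direct flights and one-stop connections from 'from' to 'to'.
    // minConnect is an assumed minimum connection time.
    static List<List<FlightLeg>> find(Airport from, Airport to, Duration minConnect) {
        List<List<FlightLeg>> result = new ArrayList<>();
        for (FlightLeg first : from.outbound) {
            if (first.destination == to) {
                result.add(List.of(first));
                continue;
            }
            for (FlightLeg second : first.destination.outbound) {
                boolean enoughTime =
                        !second.departs.isBefore(first.arrives.plus(minConnect));
                if (second.destination == to && enoughTime) {
                    result.add(List.of(first, second));
                }
            }
        }
        return result;
    }
}
```

The connection check in the inner loop is the same validation described in (b) earlier, minus gate locations, and it runs entirely in memory.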
It would be difficult to convince me that a Servlet that queries the RDBMS
using JDBC would be the most effective solution for this problem. And I
don't think that this is a *unique* problem by any stretch of the imagination.
> Let's face it, relational databases have been working on their cache technology for a
> couple of decades (or more) while the EJB vendors have been developing it for a few
> years (at best).
So? United, Delta, and American Airlines have been operating for a lot
longer than Southwest Airlines. Does this mean that they are way ahead on
their flight planning, piloting, or maintenance technology learning curves?
Experience has a way of spreading. I'd be surprised if most of the EJB vendors
are trying to figure out how to design caching mechanisms, rather than leveraging
the current state of industry research.
> When I need to share a database across several systems (legacy and
> modern) I'll leverage a relational database cache so that all systems benefit from a
> consistent view of the data.
Agreed. Especially for data where the frequency of update (across all database
clients) exceeds the frequency of access.
> On the other hand, if I have a complex object model and
> only access the database through one platform (like Gemstone/J) then an object cache
> is a great performance gain. Gemstone for example can maintain something like 6 Gig of
> object cache in memory -- that's incredible and valuable under the right conditions.
>
This is essentially the point that I was making above with the airline example.
Gemstone provides this capability. Javlin provides it for WebLogic server.
I'm sure that other vendors do, as well.
> >
> >
> > However, unless I'm missing something, if the relevant info is cached then
> > this overhead is significantly reduced on the EJB case. Its performance
> > would be much closer to the straight JDBC case for the simple case (i.e.
> > select of fields from few tables with simple joins). In the case where
> > the read-mostly data takes on a more complex, object-oriented structure
> > (such as a product catalog with lots of cross-references) the EJB cache
> > might even have a much lower impedance mismatch for the Servlet.
> >
>
> There is no impedance mismatch issue with Servlets and JDBC. You access the data in
> the same way it's organized, relationally. There is, however, an impedance mismatch
> anytime you model your business in objects and persist them to a relational database.
>
This assumes that the Servlets want to "see" them organized relationally.
If the Servlets perform operations that require an object model (i.e. if
we did the flight availability search example from above in a Servlet)
then the impedance mismatch definitely exists.
> But this is beside the point. EJB is not objects in the purest sense of the
> concept, they are components.
I'm not sure why this discussion is relevant. Yes, components exist at a
higher level of granularity than objects. Yes, it's appropriate to decompose
an EJB component into a set of finer-grained objects. However, at some
level, any component (implemented using EJB or any other mechanism) has
a set of functionality and quality of service requirements. And the choice
of how to design it must be driven by them.
The Servlet vs. Session Bean question is an appropriate one. And while I'm
generally in favor of rules of thumb, I think that broad generalizations
such as "EJB is appropriate for high volume transactional sites like an
electronic trading site, but for sites that provide read-only access to
product catalogs or primarily textual material (documents, articles, etc.)
with no transactional requirements" can be misleading if taken out of
context.
Isn't it better to highlight the issues that should be considered in framing
the decision? I'm a big fan of Design Patterns, and what I particularly like
is the fact that the standard format covers:
a. a description of what the solution is and what problem it addresses,
b. what forces operate that make this solution attractive, and
c. what forces might also be present that reduce the attractiveness
of the solution.
> It should be noted that I am a very big fan of EJB, perhaps one of the biggest.
> Experience in actual large scale implementations has taught me the benefits that are
> inherent in the EJB platform. At the same time I'm not blinded by loyalty to the
> platform and believe that each technology has its place and that leveraging them
> appropriately is more important than using them as a silver bullet.
>
Couldn't agree more.
Charlie Alfred