Re: Patterns for massive amounts of data?

Assaf Arkin Sat, 26 Feb 2000 03:06:38 -0800
What you are looking for is a way to construct a finder that only
returns an interval group, say WHERE ID BETWEEN 1 AND 1000. That will
give you just the entity beans you are looking for.

If you are retrieving the information purely for display and potentially
updating one or two records, session beans will work faster. Way less
overhead. You construct a query, get a result set, work with that result
set. Entity beans add a separate layer.
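
A minimal sketch of the "construct a query, get a result set" route, assuming plain JDBC behind a session bean method. The class name RangeReader, the ITEMS table and its ID and NAME columns are hypothetical; only the BETWEEN predicate comes from the discussion above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical read-only helper that a session bean method could delegate to. */
class RangeReader {

    /** Builds the range query; the column names ID and NAME are assumptions. */
    static String rangeQuery(String table) {
        return "SELECT ID, NAME FROM " + table + " WHERE ID BETWEEN ? AND ? ORDER BY ID";
    }

    /** Runs the query and copies each row into a plain String[] {id, name}. */
    static List<String[]> readRange(Connection con, long low, long high) throws SQLException {
        List<String[]> rows = new ArrayList<>();
        // "ITEMS" is an assumed table name, fixed here rather than taken from user input.
        try (PreparedStatement ps = con.prepareStatement(rangeQuery("ITEMS"))) {
            ps.setLong(1, low);
            ps.setLong(2, high);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows.add(new String[] { Long.toString(rs.getLong(1)), rs.getString(2) });
                }
            }
        }
        return rows;
    }
}
```

The rows come back as plain arrays with no per-record bean instance, which is where the lower overhead would come from. The price is that any update has to be written as separate SQL.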

If you are retrieving the information and potentially updating most of
the data, then having it in memory (entity beans) is easier to work
with.

As for caching, I'm not sure it makes sense. Let's say you have 1 million
records, you load 1000 and cache them, then load another 1000 (different
set) and cache them, then load another 1000. At this point the EJB
server will dump the first 1000 to make room for the new 1000, so you
end up getting nothing from caching.

As a general rule, if you load the same set of records 80% of the time,
caching is much faster. If you load a different set of records 80% of
the time, the benefit of caching for the remaining 20% might be offset
by the cost of using entity beans, and session beans will run faster.
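
The two caching cases above can be illustrated with a toy simulation. This is only a sketch: it assumes a least-recently-used cache with room for 2000 records, which is an illustrative choice and not how any particular EJB server is known to evict. The class name LruSim and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy LRU cache that only counts hits and misses (an assumption-laden model). */
class LruSim {
    private final Map<Integer, Boolean> cache;
    int hits = 0;
    int misses = 0;

    LruSim(final int capacity) {
        // accessOrder = true keeps entries ordered least-recently-used first.
        this.cache = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > capacity; // evict once the cache is over capacity
            }
        };
    }

    /** Requests records first..first+count-1, loading any that are not cached. */
    void loadRange(int first, int count) {
        for (int id = first; id < first + count; id++) {
            if (cache.get(id) != null) {
                hits++;
            } else {
                misses++;
                cache.put(id, Boolean.TRUE);
            }
        }
    }

    /** Reads `blocks` distinct 1000-record blocks in order, `passes` times over. */
    static LruSim scan(int capacity, int blocks, int passes) {
        LruSim sim = new LruSim(capacity);
        for (int p = 0; p < passes; p++) {
            for (int b = 0; b < blocks; b++) {
                sim.loadRange(b * 1000, 1000);
            }
        }
        return sim;
    }
}
```

With these assumptions, LruSim.scan(2000, 3, 2) cycles through three different blocks and ends with 0 hits out of 6000 requests, while LruSim.scan(2000, 1, 5) rereads one block and hits on 4000 of 5000 requests (80%).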


Another possibility, which totally bypasses EJB (but you need to
evaluate it based on how the client works) is to use JDBC directly from
the client for reading. In that case you can benefit from fetch buffers
(i.e. you read 100 at a time, out of the 1000 results) and offload a lot
of network activity (client->RDBMS as opposed to client->EJB->RDBMS).
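
A rough sketch of the fetch-buffer idea, assuming a JDBC 2.0 driver that honours Statement.setFetchSize (it is only a hint, and drivers may ignore or adjust it). The class name BufferedRead, the ITEMS table and the NAME column are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical client-side reader that asks the driver for rows in batches. */
class BufferedRead {
    static final int FETCH_SIZE = 100; // rows per fetch, as in the example above

    /** Number of fetches needed if the driver follows the hint exactly. */
    static int roundTrips(int totalRows, int fetchSize) {
        return (totalRows + fetchSize - 1) / fetchSize; // ceiling division
    }

    /** Prints NAME for each row in the ID range; returns the number of rows seen. */
    static int printNames(Connection con, long low, long high) throws SQLException {
        int count = 0;
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT NAME FROM ITEMS WHERE ID BETWEEN ? AND ? ORDER BY ID")) {
            ps.setFetchSize(FETCH_SIZE); // a hint only; the driver decides
            ps.setLong(1, low);
            ps.setLong(2, high);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                    count++;
                }
            }
        }
        return count;
    }
}
```

If the hint is followed, 1000 results at 100 rows per fetch arrive in roundTrips(1000, 100) = 10 fetches, and the client can start displaying rows after the first one.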

arkin


Jonas Wallenius wrote:
>
> Hi folks,
>
> A question about a probably quite common issue in n-tier apps...
>
> Suppose I have a java client - session bean - entity bean(s) - DB.
>
> Suppose the data we're dealing with consist of huge lists (of names, or
> whatever) PK:ed on a running integer. Could contain millions of entries.
> The data could conceptually be seen as consisting of interval groups, where
> all entries in an interval belong together and are often retrieved and processed
> together.
>
> The client user wants to perform a bunch of filtered searches on a list:
> "Show items 1-1000 and 1000-1500, show all names containing an 'a', ..."
>
> I'm wondering about two things here:
>
> 1) Is there some "common/best practice" for deciding how and when to retrieve
>    the data from the DB to the client? I'm particularly interested in hearing
>    of experiences comparing session bean -> DB via direct SQL (stored
>    procedures, etc) vs session bean -> entity beans (hopefully cached).
>
> 2) Is there some common/best practice of representing such massive amounts of
>    data in a client? What about in the database? Could the interval group
>    property of the data be exploited in some clever way?
>
> As for 1), the natural answer seems to me to be to fetch each data interval as
> the user requests it. The downside to this might be that the server will need to
> go through the entire data list each time a filter operation is requested, which
> is why I'm asking for good ideas.
>
> As for 2), I don't mean graphically (though ideas for that are welcome too :-)
> as much as structurally, especially in the case where the data can be seen as a
> group of non-overlapping intervals (items 1-1000 and 2000-10000 and ...).
>
> We can assume that the entity beans have exclusive write access to the DB.
>
> Regards,
>
> Jonas
>
> ===========================================================================
> To unsubscribe, send email to [EMAIL PROTECTED] and include in the body
> of the message "signoff EJB-INTEREST".  For general help, send email to
> [EMAIL PROTECTED] and include in the body of the message "help".

--
----------------------------------------------------------------------
Assaf Arkin                                           www.exoffice.com
CTO, Exoffice Technologies, Inc.                        www.exolab.org
