I'm working with very large data sets. In my current problem, I have about 74,000 records that need to be converted, filtered and stored in an Oracle database. The initial run will have the largest query results. After this, the data will change only on a very limited basis.
The source database is Microsoft SQL Server 7 and the destination is an Oracle database running on Solaris. I've been writing a modest Java program to do the conversion. The current design uses a Row Gateway pattern from http://martinfowler.com/isa/index.html. This is basically a static class that creates a collection of value objects. I've been surprised at how fast this works. I have the following code in my main routine:

    start = System.currentTimeMillis();
    CDPubsList pubs = new CDPubsList();
    List pubsList = pubs.getCDPubList();
    for (Iterator iter = pubsList.iterator(); iter.hasNext();) {
        CDPub element = (CDPub) iter.next();
        System.out.println(element);
    }
    finish = System.currentTimeMillis();

It typically takes only about thirty seconds. I think some of that time is caused by the System.out.println and my log4j debug statements.

My problem is that I ran out of memory in Eclipse. I fixed that by increasing the memory available to Eclipse, but I know that this is one of my smaller data sets. I was going to replace the list with a CachedRowSet, but I just read in the JDBC API Tutorial and Reference, 2nd ed., that "CachedRowSet - ... [is] not suitable for very large data sets..."

I could switch my design to have the main loop read one record at a time. I have several reservations about this design. First, it is not very object-oriented: the core code knows all about my database. Since I have to create several of these conversion programs over the next year, I wanted the core code to be like a simple framework or harness that I could reuse over and over, and I would lose the reusable components that I'm creating. Second, I would either have to keep the connection open all the time or I would need to add a connection pool manager, such as DBCP from Apache.

I could grab portions of the data set (I think this would be called paging), but I just can't see how to do this without requiring an alteration to the source database schema.
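One way the one-record-at-a-time option might keep the core code ignorant of the database is to hide the ResultSet behind a plain Iterator. What follows is only a sketch under assumptions, not a tested design: ConversionHarness, RowMapper, rows and run are hypothetical names invented for illustration, and the code assumes Java 8 or later for generics and java.util.function.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical harness: the core loop sees only an Iterator, never JDBC. */
class ConversionHarness {

    /** Hypothetical mapper: turns the current ResultSet row into a value object. */
    public interface RowMapper<T> {
        T mapRow(ResultSet rs) throws SQLException;
    }

    /** Wraps an open ResultSet so that value objects are built one row at a time. */
    public static <T> Iterator<T> rows(final ResultSet rs, final RowMapper<T> mapper) {
        return new Iterator<T>() {
            private boolean advanced = false; // true once rs.next() was called for the pending row
            private boolean hasRow = false;   // result of that rs.next() call

            @Override
            public boolean hasNext() {
                if (!advanced) {
                    try {
                        hasRow = rs.next();
                    } catch (SQLException e) {
                        throw new IllegalStateException(e);
                    }
                    advanced = true;
                }
                return hasRow;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                advanced = false; // the pending row is consumed by this call
                try {
                    return mapper.mapRow(rs);
                } catch (SQLException e) {
                    throw new IllegalStateException(e);
                }
            }
        };
    }

    /** Filters and stores each record as it arrives; returns the number stored. */
    public static <T> int run(Iterator<T> source, Predicate<T> keep, Consumer<T> store) {
        int stored = 0;
        while (source.hasNext()) {
            T record = source.next();
            if (keep.test(record)) {
                store.accept(record);
                stored++;
            }
        }
        return stored;
    }
}
```

Only one value object is held at a time in this loop, so memory use on the Java side no longer grows with the list, although the connection and ResultSet still have to stay open until the loop finishes. Whether the driver itself buffers the whole result is a separate question; Statement.setFetchSize is only a hint to the driver.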
The paging could be done via a view, but I would rather not have to do that. I figure I must be missing something obvious. If you don't mind, I was hoping that someone might have some pointers...

_______________________________________________
MVC-Programmers mailing list
[EMAIL PROTECTED]
http://www.netbean.net/mailman/listinfo/mvc-programmers