On Tue, Jun 14, 2011 at 6:09 AM, Rick Hillegas <[email protected]>wrote:
> Hi Masood, > > Some comments inline... > > On 6/13/11 7:30 PM, Masood Mortazavi wrote: > >> On Mon, Jun 13, 2011 at 7:18 PM, Masood Mortazavi < >> [email protected] <mailto:[email protected]>> wrote: >> >> >> >> On Mon, Jun 13, 2011 at 6:39 PM, Masood Mortazavi >> <[email protected] <mailto:[email protected]>> wrote: >> >> I have a Derby table (T) with 10^7 records. >> This table has 5 columns. >> I have defined a Derby view (V) on one of those columns. >> To produce (V) all of (T) needs to be scanned. >> There are about 2000 records that the view (V) contains. >> >> Now, I "cross" (T) with (V), with an IN clause involving (V). >> I have three questions: >> [1] Is the optimizer smart enough to to scan only O( (T)*(V) ) >> times (with the V comparisons involving in-memory >> "materialization" of (V)), or will it be O( (T)*(T) ) or >> something else? >> >> I think this depends on the view, the query, and the statistics on the > underlying table. I'm sure that Derby's view materialization could be > improved. > > >> [2] What is the support that Derby 10.8.1.2 may be offering >> for materialized views? >> >> View materialization wasn't a focus of the 10.8.1.2 release. It is > possible that the automatic calculation of statistics may improve the > performance of materialized views in some cases. > > >> [3] If the answer to [1] is the larger number, would >> materializing (V) change things or improve them drastically, >> i.e. would Derby save the large (T) sized scan to produce (V), >> or even better, would it do that and also bring all of (V), >> which is rather small, into memory (in large enough chunks or >> completely) for the purposes of the cross involving IN? >> >> With best regards to all good Derby friends, >> - m. >> >> >> I solve the materialization issue by a simple exporting of the >> view's select query and then importing it. >> It works well. >> I will rerun my IN cross query, against the now materialized table >> to see how much better or worse it does in this particular case. >> The general gist of my questions above still remain. >> >> Regards, >> Masood >> >> >> >> OK. >> As described above, I used Derby's wonderful system utility procedures to >> produce a materialization of the view in question, above. >> The cross query became faster than a blizzard. >> > You may get better performance by using an INSERT ... SELECT statement > rather than exporting the view and re-importing it. That will eliminate the > redundant writing of the view to an external file. You may also investigate > using a temporary table to store the intermediate result (for details, see > the Reference Guide section on DECLARE GLOBAL TEMPORARY TABLE). > > Hope this helps, > -Rick This is all very helpful. Thank you very much. Yes, INSERT . . . SELECT would have been very helpful and more concise. We may have other uses for the extra data file, in a Hadoop environment. Given the statistics of the source data, doing an "IN" against a materialized view turned out to be a lot faster, as expected. Thanks again and with very best regards, m. P.S. By the way, not supporting materialized views is not a big deal. I'm not suggesting it should be supported.
