Hi,

The optimizer does some computations to estimate the number of scans. I don't know if any of this is placed in the tdb. The compiler could be changed to propagate this info to the tdb.

Dave

-----Original Message-----
From: Eric Owhadi [mailto:[email protected]]
Sent: Monday, March 7, 2016 1:55 PM
To: [email protected]
Subject: Small Scanner and MDAM

Hi Trafodionneers,

I have experimented with the potential effect of SMALL SCANNER on MDAM, and the results are very interesting.

MDAM spends a lot of its activity on small scans (the MDAM probes), and the other 50% of its activity may or may not be a good candidate for small scan, depending on the quantity of data retrieved in each scan. My intuition that the 2 features fit well together made me validate it.

So I created a table of 50 million records, with a 2-column key to generate a predictable MDAM plan, and with a record length that makes each data scan fall below the 64K HBase block size.

Forcing SMALL_SCANNER using the cqd HBASE_SMALL_SCANNER 'ON' bypasses the system logic and forces all scans to use the small scanner. Results show a 1.39x speed improvement over the regular scanner. This is goodness. Along with this speed improvement, the fact that the small scanner uses non-blocking reads should also have a good impact on concurrency, allowing higher TPM rates on use cases using MDAM.

Now I am trying to productize this:

- I can always assume that probes should use the small scanner, since by definition they probe data and so retrieve a very small amount, lower than HBASE_BLOCK_SIZE.
- The non-probe scan traffic is where I am struggling. I was hoping to find somewhere on the tdb information about how many scans the MDAM disjunct would result in, and then apply a heuristic assumption of equal distribution of data across scans: knowing the total size of data expected for the scan, just divide it by the number of expected scans resulting from the MDAM disjunct… But I have not found anything on the tdb that would tell me how many data scans an MDAM would result in…

Any idea how to tackle this?

Thanks in advance for the help,
Eric
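
For reference, the heuristic Eric describes could be sketched roughly as below. This is only an illustration, not Trafodion code: the function and parameter names are hypothetical, `expected_scans` stands for an optimizer estimate that is not known to be available on the tdb today, and the fallback to the regular scanner when no estimate exists is an assumption of this sketch.

```python
# 64K HBase block size, as used in the experiment described above.
HBASE_BLOCK_SIZE = 64 * 1024  # bytes


def use_small_scanner(is_probe, total_bytes=0, expected_scans=0,
                      block_size=HBASE_BLOCK_SIZE):
    """Decide whether a scan issued by MDAM should use the small scanner.

    Hypothetical helper: none of these names exist in Trafodion.
    """
    # Probes retrieve a very small amount of data by definition,
    # so they always qualify.
    if is_probe:
        return True

    # Assumption of this sketch: without an estimate of the number
    # of scans, stay on the regular scanner.
    if expected_scans <= 0:
        return False

    # Assume data is equally distributed across the scans produced
    # by the MDAM disjunct, then compare one scan's share to a block.
    bytes_per_scan = total_bytes / expected_scans
    return bytes_per_scan < block_size
```

With these assumptions, 1,000,000 bytes spread over 100 scans (10,000 bytes per scan) would select the small scanner, while 10,000,000 bytes over 10 scans would not.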
