In local cluster, it looks good to me. But when I move the code to actual vm cluster. query returning incorrect data.
Thanks On 2 January 2017 at 18:15, Anil <[email protected]> wrote: > Hi Nikolay, > > Thanks for the response. The use case is very simple. > > Each Person is associated with number of person details which holds > address, start and end dates. > > *Scenario* - Get person information with latest person detail for given > person ids. Person information must include duplicate person with same > equivalent Id > > 1. To get the duplicate persons with same equivalentid - > > join (select equivalentId, count(*) dupCount from PERSON_CACHE.PERSON > group by equivalentId ) dupPerson on p.equivalentId = dupPerson.equivalentId > > 2. To get the latest person details - based on recent end date > > join DETAILS_CACHE.PersonDetail pd on p.equivalentId = pd.equivalentId > join (select equivalentId, max(enddate) as enddate from > DETAILS_CACHE.PersonDetail group by equivalentId) maxPd on p.equivalentId > = maxPd.equivalentId and maxPd.endDate = pd.endDate > > To support group sorted index, i have created index on equivalentId in > DETAILS_CACHE. > > Please let me know if you have any questions. > Thanks >
