Michael J. Carey created ASTERIXDB-1639:
-------------------------------------------

             Summary: Need a spatial test case with a high point dup factor
                 Key: ASTERIXDB-1639
                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1639
             Project: Apache AsterixDB
          Issue Type: Test
          Components: AsterixDB, Hyracks, Storage
            Reporter: Michael J. Carey
            Assignee: Taewoo Kim


We need a few LSM R-tree test cases where many leaves' worth of data 
(which could be achieved with an artificially small NC config?) share 
the same key, to make sure that we handle that case properly.  (After 
talking with Wail, I'm wondering whether that was the root of his problems 
a few weeks ago - he had a high duplicate rate.)  E.g., we should try 
inserting a ton of data all at the same 2-3 unique spatial points.  It would 
be good for there to be enough data that multi-level Hilbert sorting is 
required as well.  This is likely to be a time-consuming test, so it should 
be in our periodic (not per-checkin) tests.  We should actually do this 
extreme-dup-case test for all index types, but R-trees are the ones 
suspected of handling it incorrectly.  Who would be best placed to 
write/run this test w/o much effort?
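
As a starting point, here is a minimal sketch of a generator for this kind 
of data.  Everything in it is an assumption made for illustration: the 
three coordinates, the record shape ("id" plus a two-element "loc"), and 
the function names are hypothetical and would need to be adapted to the 
actual dataset type and load path used by the test.

```python
import random
from collections import Counter

# Hypothetical coordinates chosen for illustration; not taken from the issue.
UNIQUE_POINTS = [(10.0, 20.0), (30.5, 40.5), (-73.9, 40.7)]


def generate_records(count, points=UNIQUE_POINTS, seed=0):
    """Yield `count` records whose spatial key is one of a few unique points."""
    rng = random.Random(seed)  # fixed seed so a failing run can be reproduced
    for record_id in range(count):
        x, y = rng.choice(points)
        # Assumed record shape: a unique id plus an [x, y] location.
        yield {"id": record_id, "loc": [x, y]}


def duplicate_counts(records):
    """Return how many records share each distinct location."""
    return Counter(tuple(record["loc"]) for record in records)
```

For example, duplicate_counts(generate_records(1000000)) should report at 
most three distinct keys, each shared by a large number of records.  How 
many records are needed to span many leaves and force multi-level sorting 
depends on the page size and memory budget configured for the NC, so the 
count is left as a parameter.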



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
