[ https://issues.apache.org/jira/browse/ASTERIXDB-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15480237#comment-15480237 ]
Taewoo Kim edited comment on ASTERIXDB-1639 at 9/10/16 6:45 PM:
----------------------------------------------------------------

You assigned this to the right guy even if he lacks some knowledge. :-) We can discuss more details next week.

was (Author: wangsaeu):
You assigned to this to right guy even if he lacks some knowledge. :-) We can discuss more details next week.

> Need a spatial test case with a high point dup factor
> -----------------------------------------------------
>
>                 Key: ASTERIXDB-1639
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1639
>             Project: Apache AsterixDB
>          Issue Type: Test
>          Components: AsterixDB, Hyracks, Storage
>            Reporter: Michael J. Carey
>            Assignee: Taewoo Kim
>              Labels: soon
>
> We need a few LSM R-tree test cases where we have many leaves' worth of data
> (which could be achieved by making an artificially small NC config?) that
> have the same key - to make sure that we can handle that case properly. (I'm
> wondering after talking with Wail if that's the root of his problems a few
> weeks ago - he had a high duplicate rate.) E.g., we should try inserting
> a ton of data, all at one of the same 2-3 unique spatial points. It would be
> good for there to be enough data that multi-level Hilbert sorting is required
> as well. This is likely to be a time-consuming test, so it should be in our
> periodic (not per-checkin) tests. We should actually do this extreme-dup-case
> test for all index types, but R-trees are suspected of maybe doing this
> wrong. Who would be best to write/run this test w/o much effort?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)