Thanks for all the suggestions. Data issues were one of the first things we looked for, and we were able to verify absolutely that there are no 'hidden' characters causing the creation of separate nodes. The nodes actually have exact duplicate IDs.
The file being indexed is actually a transaction log. Here is a partial listing of the transactions, in creation order (the last two numeric columns are the internal date and internal time):

 Trans.Log.Id  MainDB.ID  Trans.Dt.  Trans.Time  Int.Dt.  Int.Time..  Program
 763038240     DBID12345  29 SEP 10  05:14:10pm  15613    62050.6937  NUS009
 763038241     DBID12345  29 SEP 10  05:14:10pm  15613    62050.6954  NUS009
 763038242     DBID12345  29 SEP 10  05:14:10pm  15613    62050.6971  NUS009
 763038243     DBID12345  29 SEP 10  05:14:10pm  15613    62050.6988  NUS009
 763038244     DBID12345  29 SEP 10  05:14:10pm  15613    62050.7006  NUS009
 763038245     DBID12345  29 SEP 10  05:14:10pm  15613    62050.7024  NUS009
 763038246     DBID12345  29 SEP 10  05:14:10pm  15613    62050.7041  NUS009
 763038247     DBID12345  29 SEP 10  05:14:10pm  15613    62050.7057  NUS009
 763038248     DBID12345  29 SEP 10  05:14:10pm  15613    62050.7075  NUS009
 763038249     DBID12345  29 SEP 10  05:14:10pm  15613    62050.7095  NUS009
 763038250     DBID12345  29 SEP 10  05:14:10pm  15613    62050.7112  NUS009
 763038251     DBID12345  29 SEP 10  05:14:10pm  15613    62050.7129  NUS009
 763038252     DBID12345  29 SEP 10  05:14:10pm  15613    62050.7146  NUS009
 763038253     DBID12345  29 SEP 10  05:14:10pm  15613    62050.8737  NUS009
 763038254     DBID12345  29 SEP 10  05:14:10pm  15613    62050.8761  NUS009
 763038255     DBID12345  29 SEP 10  05:14:10pm  15613    62050.8777  NUS009
 763038256     DBID12345  29 SEP 10  05:14:10pm  15613    62050.8793  NUS009
 763038257     DBID12345  29 SEP 10  05:14:10pm  15613    62050.8811  NUS009
 763038258     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9003  NUS009
 763038259     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9017  NUS009
 763038260     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9034  NUS009
 763038261     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9053  NUS009
 763038262     DBID12345  29 SEP 10  05:14:10pm  15613    62050.907   NUS009
 763038263     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9494  NUS009
 763038264     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9509  NUS009
 763038265     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9524  NUS009
 763038266     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9539  NUS009
 763038267     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9553  NUS009
 763038268     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9568  NUS009
 763038269     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9584  NUS009
 763038270     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9755  NUS009
 763038271     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9769  NUS009
 763038272     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9784  NUS009
 763038273     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9799  NUS009
 763038274     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9814  NUS009
 763038275     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9829  NUS009
 763038276     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9843  NUS009
 763038277     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9858  NUS009
 763038278     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9873  NUS009
*763038279     DBID12345  29 SEP 10  05:14:10pm  15613    62050.9943  NUS009
 763038280     DBID12345  29 SEP 10  05:14:11pm  15613    62051.0247  NUS009
 763038281     DBID12345  29 SEP 10  05:14:11pm  15613    62051.0264  NUS009
 763038282     DBID12345  29 SEP 10  05:14:11pm  15613    62051.0279  NUS009
 763038283     DBID12345  29 SEP 10  05:14:11pm  15613    62051.1511  NUS009
 763038284     DBID12345  29 SEP 10  05:14:11pm  15613    62051.161   NUS009

I have placed an asterisk at the beginning of the line with the transaction that was indexed in the duplicate node (in case the red highlighting doesn't carry through). _All_ of the other transactions for DBID12345 were indexed together, and all of them were created by the same instance of the same program, at essentially the same point in time, as you can see from the date and time stamps. I've included the internal time so that you can see, down to the millisecond level, how close together these records were created (and indexed). There is no correlative at all in the dictionary item, and we have already thought that we should probably change it to a D-style dict.
item, but we're trying to reliably reproduce the problem first, preferably on a smaller database, before we make that change, so we can tell whether or not it actually fixes anything.

We're looking into determining exactly how long a rebuild of this particular index will take, as that seems like the best work-around for us at the moment. We are also in the process of formalizing the steps we use to identify the problem, and codifying them into a process that can be run against every index we have, so that we can at least identify the scope of our issues and, if rebuild is the only remedy, make plans to schedule rebuilds regularly.

If you would like to check your indexes, here are the steps we have found to be reliable:

1. Create an F-pointer to the index directly, named indexfile, with the Dict pointing to the dict of the data/source file.
2. SSELECT indexfile
3. SAVE.LIST XX
4. Create a DICT fname DUP dictionary item like this:
   0001: I
   0002: @2;@ID;@1...@2
   0003:
   0004:
   0005: 3R
   0006: S
5. GET.LIST XX
6. LIST indexfile WITH DUP # "0" DUP
7. Note the IDs listed (dup_id1, dup_id2, etc.). Use them as listing criteria as follows:
   LIST indexfile WITH @ID = "dup_id1]" "dup_id2]" (etc.) F1 F2

Building a program to perform these tasks will be one of our next immediate steps. Any further suggestions or comments are most welcome!

Best Regards,
Richard

On Fri, Oct 1, 2010 at 7:39 AM, David Wolverton <dwolv...@flash.net> wrote:
> This has been our issue in the past -- unless you look at that data with a
> HexEditor, it's not obvious. If you use the UniData "AE" Editor (now
> included with UniVerse) you can type a "^" (Shift 6 - Caret) and see
> non-printable characters. Once you know the culprit, you fix the source
> data and the index heals itself.
>
> David W.
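For anyone who wants to sanity-check the internal stamps in the listing above: UniVerse internal dates count days from 31 December 1967 (day 0), and internal times are seconds past midnight. This is not UniVerse code, just a small Python illustration of the arithmetic, showing how tightly spaced these transactions really are:

```python
from datetime import datetime, timedelta

PICK_EPOCH = datetime(1967, 12, 31)  # UniVerse/Pick internal day 0

def decode_internal(day, secs):
    """Convert an internal date (day number) and internal time
    (seconds past midnight) to a conventional timestamp."""
    return PICK_EPOCH + timedelta(days=day, seconds=secs)

# First two transactions from the listing above
t0 = decode_internal(15613, 62050.6937)
t1 = decode_internal(15613, 62050.6954)

print(t0)  # 2010-09-29 17:14:10.693700 -- i.e. 29 SEP 10 05:14:10pm
print("gap: %.4f ms" % ((t1 - t0).total_seconds() * 1000))  # about 1.7 ms
```

The decoded value matches the external stamps in the log, and consecutive transactions are only a millisecond or two apart.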
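On the program we said we plan to build: the core of the DUP check in step 6 above is simply "does the same key occur more than once in the sorted index listing". The real version will of course run against the index file in UniVerse; this is just a rough, hypothetical sketch of that logic in Python:

```python
from collections import Counter

def find_duplicate_keys(index_ids):
    """Return keys that occur more than once in the index listing --
    these correspond to the duplicate nodes we are hunting for."""
    counts = Counter(index_ids)
    return sorted(k for k, n in counts.items() if n > 1)

# Toy data shaped like the ZIPCODE example: '12345' has two nodes
ids = ["12345", "12345-6789", "12345", "23456"]
print(find_duplicate_keys(ids))  # ['12345']
```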
>
> -----Original Message-----
> From: u2-users-boun...@listserver.u2ug.org
> [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Richard Brown
> Sent: Friday, October 01, 2010 6:02 AM
> To: U2 Users List
> Subject: Re: [U2] UV index with duplicate nodes
>
> Look for a space or non-printing character in the data. That would make it
> look the same but actually be unique.
>
> ----- Original Message -----
> From: "Richard Lewis" <rbl...@gmail.com>
> To: <U2-Users@listserver.u2ug.org>
> Sent: Thursday, September 30, 2010 8:47 PM
> Subject: [U2] UV index with duplicate nodes
>
> > We've just uncovered a rather unusual and unsettling situation. We have a
> > file with a single index that has somehow gotten nodes with duplicate keys.
> > A simple example would be having an index on ZIPCODE in an address
> > database, and finding that there are _two_ nodes (records) in the index
> > for ZIPCODE 12345, for example. The source records referred to in the
> > nodes are not duplicated, but since most operations find the 'first' node,
> > any source records referred to in the duplicate node appear to not exist
> > in the index.
> >
> > >LIST.INDEX fname ALL
> > Alternate Key Index Summary for file fname
> > File........... fname
> > Indices........ 1 (1 A-type, 0 C-type, 0 D-type, 0 I-type, 0 SQL, 0 S-type)
> > Index Updates.. Enabled, No updates pending
> >
> > Index name  Type  Build     Nulls  In DICT  S/M  Just  Unique  Field num/I-type
> > fieldname   A     Not Reqd  Yes    Yes      M    L     N       2
> >
> > The file contains 6,539,233 records, with 574,547 unique values in
> > fieldname (which is actually a single-valued field, and it has been
> > verified that each record's fieldname contains one and only one value).
> > We found that 9 source records appeared to have not been included in the
> > index, but upon further research found the nodes with duplicate keys.
> > We created an F-pointer to the index file itself (not normally
> > recommended, but useful), then got results like the following:
> >
> > LIST indexfile WITH @ID = "12345]" F1 F2
> >
> > fname.....  F1........  F2........
> > 12345       987654      876543
> > 12345-6789  765432      543219
> > 12345       654321
> >
> > We are having our UniVerse administrator ask our dealer for assistance,
> > but were interested if any other users have had any recent similar
> > experiences, or advice.
>
> _______________________________________________
> U2-Users mailing list
> U2-Users@listserver.u2ug.org
> http://listserver.u2ug.org/mailman/listinfo/u2-users

_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users