Sounds reasonable. Is there anything you need from me, then, or do you have what you need?
Hi Grant,

as promised, I am currently looking through your patch, so please be patient for a few more days.

I stumbled over something in the current implementation that took me some hours to understand and test. In the txd-file you store field numbers, using difference encoding (storing the differences between field numbers, not their absolute values) and variable-length integers. The problem is that FieldInfos does not necessarily store fields in alphabetical order. No order is guaranteed at all, and both the order and the field numbers themselves can change from segment to segment. This means that the field numbers you write into the txd-file are not necessarily in increasing order, so the difference encoding can produce negative entries. According to their specification (e.g. IndexInput.readVInt()), variable-length integers only work for positive numbers.

All this was difficult to test, ... The result is: it really is as described above, but luckily variable-length integers also work for negative numbers, so term vectors work as they should.

However, I will change from difference encoding to normal encoding for the field numbers. I think one usually does not have more than 256 different fields, so difference encoding is not necessary. Furthermore, a negative number always takes 5 bytes as a variable-length integer (32 bits written in 7-bit groups), so difference encoding actually needs more space than normal encoding here. Note that difference encoding for positions of course remains unchanged, since it is definitely very effective there.

Christoph

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
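The size argument in the message above can be checked with a small sketch. It assumes a variable-length encoding that writes an int in 7-bit groups, low-order group first, using an unsigned shift, in the style of Lucene's writeVInt/readVInt. The class VIntSketch, its methods, and the sample field numbers are hypothetical and only illustrate the idea; this is not the Lucene code itself.

```java
import java.io.ByteArrayOutputStream;

final class VIntSketch {
    // Encode an int in 7-bit groups, low-order group first. The high bit of
    // each byte signals that another byte follows.
    static byte[] writeVInt(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7; // unsigned shift, so negative values also terminate
        }
        out.write(value);
        return out.toByteArray();
    }

    // Decode a single variable-length int from the start of the array.
    static int readVInt(byte[] bytes) {
        int value = 0;
        for (int i = 0, shift = 0; i < bytes.length; i++, shift += 7) {
            value |= (bytes[i] & 0x7F) << shift;
            if ((bytes[i] & 0x80) == 0) {
                break; // high bit clear: this was the last byte
            }
        }
        return value;
    }

    // Total encoded size of a sequence of field numbers, either as
    // differences to the previous number or as absolute values.
    static int encodedSize(int[] fieldNumbers, boolean delta) {
        int size = 0;
        int previous = 0;
        for (int n : fieldNumbers) {
            size += writeVInt(delta ? n - previous : n).length;
            previous = n;
        }
        return size;
    }

    public static void main(String[] args) {
        int[] unsorted = {5, 2, 9, 1}; // hypothetical, not in increasing order
        System.out.println("round trip of -3: " + readVInt(writeVInt(-3)));
        System.out.println("delta bytes:      " + encodedSize(unsorted, true));
        System.out.println("absolute bytes:   " + encodedSize(unsorted, false));
    }
}
```

In this sketch a negative difference still round-trips correctly, but each one occupies five bytes, while every small absolute field number fits into a single byte.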