Michael McCandless wrote:
> "Yonik Seeley" <[EMAIL PROTECTED]> wrote:
>> On Nov 19, 2007 6:52 PM, Michael Busch <[EMAIL PROTECTED]> wrote:
>>> Yonik Seeley wrote:
>>>> So I think we all agree to do payloads by reference (do not make a
>>>> copy of byte[] like termBuffer does), and to allow payload reuse.
>>>>
>>>> So now we still have 3 viable options still on the table I think:
>>>> Token{ byte[] payload, int payloadLength, ...}
>>>> Token{ byte[] payload, int payloadOffset, int payloadLength,...}
>>>> Token{ Payload p, ... }
>>>>
>>> I'm for option 2. I agree that it is worthwhile to allow filters to
>>> modify the payloads. And I'd like to optimize for the case where lots
>>> of tokens have payloads, and option 2 seems therefore the way to go.
>> Just to play devil's advocate, it seems like adding the byte[]
>> directly to Token gains less than we might have been thinking if we
>> have reuse in any case. A TokenFilter could reuse the same Payload
>> object for each term in a Field, so the CPU allocation savings is
>> closer to a single Payload per field using payloads.
>>
>> If we used a Payload object, it would save 8 bytes per Token for
>> fields not using payloads.
>> Besides an initial allocation per field, the additional cost to using
>> a Payload field would be an additional dereference (but that should be
>> really minor).
>
> These are excellent points. I guess I would lean [back] towards
> keeping the separate Payload object and extending its API to allow
> re-use and modification of its byte[]?
>
+1

-Michael
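
For anyone reading this thread later, here is a rough sketch of what the third option (a separate Payload object that a filter reuses) might look like. None of this is the actual Lucene API: the class names, the fields, the setData signature and the tagWithPositions helper are all assumptions made up for illustration. It only shows the idea discussed above: one Payload per field, passed by reference, re-pointed at a shared buffer for each term.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical payload: a (data, offset, length) view over a caller-owned array. */
final class Payload {
    byte[] data;
    int offset;
    int length;

    /** Re-point this payload at a slice of an existing array; nothing is copied. */
    void setData(byte[] data, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > data.length) {
            throw new IllegalArgumentException("slice out of bounds");
        }
        this.data = data;
        this.offset = offset;
        this.length = length;
    }
}

/** Hypothetical token: a single reference field, left null when a field has no payloads. */
final class Token {
    String term;
    Payload payload;
}

final class PayloadReuseSketch {
    /**
     * Mimics a filter that tags each term with its position while reusing
     * one Token, one Payload and one buffer for the whole field.
     * Returns the payload byte the consumer saw for each term.
     */
    static List<Integer> tagWithPositions(String[] terms) {
        Token token = new Token();
        Payload shared = new Payload();   // the only Payload allocated for this field
        byte[] buffer = new byte[1];      // the only payload buffer allocated
        List<Integer> seen = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            token.term = terms[i];
            buffer[0] = (byte) i;             // overwrite the shared bytes in place
            shared.setData(buffer, 0, 1);     // by reference, no copy
            token.payload = shared;
            // The consumer has to read the bytes before the next term overwrites them.
            seen.add((int) token.payload.data[token.payload.offset]);
        }
        return seen;
    }
}
```

The catch with reuse by reference is the one in the last comment: anything that needs the payload bytes after the next token is produced has to copy them first.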