>> We have methods to convert to both UTF-16 and UTF-32 in our engine,
>> which don't need a fixed length buffer, so I would like to replace:
>>
>> lucene_utf8towcs(wcharBuffer, content, MAX_CONV_SIZE);
>>
>> with a call to our code, if we can nail down exactly what clucene wants
>> in the resultant wcharBuffer

lucene_utf8towcs calls lucene_utf8towc for every character; the comment
on that function is this:

/**
 * lucene_utf8towc:
 * @p: a pointer to Unicode character encoded as UTF-8
 *
 * Converts a sequence of bytes encoded as UTF-8 to a Unicode character.
 * If @p does not point to a valid UTF-8 encoded character, results are
 * undefined. If you are not sure that the bytes are complete
 * valid Unicode characters, you should use lucene_utf8towc_validated()
 * instead.
 *
 * Return value: the resulting character
 **/

The call to doc->Add actually expects a TCHAR, so if your UTF-8 to
UTF-16 conversion can produce a TCHAR, then that's all that would be
necessary, I think.

Matthew
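
For illustration, here is a rough sketch of a conversion that needs no fixed-length buffer. It is not CLucene or SWORD code: utf8ToWide is a made-up name, and it assumes TCHAR is wchar_t, which should be checked against the actual build. Since wchar_t is 16 bits on some platforms and 32 bits on others, the sketch emits UTF-16 surrogate pairs only when wchar_t is 16 bits wide, and one code point per element otherwise. Malformed input becomes U+FFFD instead of being undefined.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of CLucene or SWORD).
// Assumption: TCHAR is wchar_t in the build being used.
// Decodes UTF-8 into a growable wide string; malformed sequences,
// surrogate code points and values above U+10FFFF become U+FFFD.
// Overlong encodings are not rejected in this sketch.
std::wstring utf8ToWide(const std::string& in) {
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i++]);
        std::uint32_t cp = 0xFFFD;   // default for an invalid lead byte
        std::size_t extra = 0;       // number of continuation bytes expected
        if (b < 0x80)                { cp = b; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }

        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            // Each continuation byte must look like 10xxxxxx.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            cp = 0xFFFD;
        }

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 16-bit wchar_t: encode as a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            // 32-bit wchar_t, or a BMP character: one element per code point.
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
}
```

The result's c_str() is a NUL-terminated wchar_t string of whatever length the input needs, so MAX_CONV_SIZE drops out. Whether that pointer can be handed to doc->Add unchanged depends on how TCHAR is defined in the CLucene build, which I have not verified.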