https://bugs.freedesktop.org/show_bug.cgi?id=44681
--- Comment #2 from Caolán McNamara <[email protected]> 2012-02-10 07:28:53 PST ---

oh shiny, that's *very* encouraging. With a bit of luck this could take hours off my multi-language build times :-)

Well, first off, I reckon it's best to mail your code-to-date and your questions again to the general development list [email protected] to get better and wider feedback, but here are my guesses.

> * Lucene 2.3 is used, but the CLucene stable version is Lucene 1.9.1 compatible. The developers recommend using the Git version, which *is* compatible with 2.3. Is that OK?

probably yeah

> * How exactly is HelpIndexerTool used? I believe both as a command-line tool (as part of a ??? -> HelpLinker -> HelpIndexer -> ???) chain in the build process, and as a run-time component to index help for extensions. Is it desired to keep HelpIndexer as a stand-alone command-line tool, or is that just because it is a Java component currently?

I think it's desirable for it to be a standalone command-line tool. It gets used when building the "helpcontent2" module, which is really slow when lots of languages are enabled.

> Currently, I ported most of the first part (for Japanese there is a special Analyzer, which I don't know how to test, and there are a bunch of options to check certain things, of which I'm not sure whether they're ever used)

The CJKAnalyzer comes with the Java Lucene, so it's a special one, but not a custom one belonging to us. *Presumably* this means that the Lucene world knows any potential gotchas with trying to convert uses of it to CLucene. I'm not exactly sure what it does over the generic one, but we've got some Japanese readers who should be able to read the final output of a conversion to see if the quality is sufficient.

> Question: does creating the ZIP need to be part of this? If so, what is the best way to create the archive?

Back in the day, the last time I converted the original Java HelpLinker to C++, I *cough* just spawned off perl to do the zipping, e.g. see JarOutputStream::JarOutputStream in http://people.redhat.com/caolanm/ooocvs/workspace.helplinker01.patch; you could grab and re-use that. In the longer run we might expose some more stuff from package/inc to export a simple zip API, but using (silly) JarOutputStream would do for now.

> * I'm assuming that CLucene will *always* be compiled with TCHAR defined as wchar_t. This is because of my ignorance of how one does portable wide strings in LibreOffice. Please enlighten me.

Presumably this will "just work". FWIW, we have an 8-bit code unit "rtl::OString" class and a UTF-16 "rtl::OUString" class in LibreOffice. I'm not sure if we need to bridge from these to whatever CLucene uses at any point, but I'm sure it's doable if necessary.

> * How to incorporate the CLucene dependency in the build process?

It's sort of tricky to do this, but there are plenty of examples, e.g. see the libwpd, libcdr or hunspell dirs, which are special modules that build extra dependencies. Basically, don't worry about this bit: get it converted to CLucene, and with some luck someone else will handle figuring out how to build CLucene itself as part of our build.
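
On the TCHAR / wchar_t question, here is a minimal sketch of what a bridge from UTF-16 to wide strings might look like if one does turn out to be needed. It is only an illustration: `utf16ToWide` is a hypothetical helper, not part of LibreOffice or CLucene, and `std::u16string` stands in for the UTF-16 buffer of an rtl::OUString so that the example stays self-contained. It assumes wchar_t is either 16 bits (where the code units can be copied as-is) or wider (where surrogate pairs are combined into single code points).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption for illustration, not LibreOffice or
// CLucene API): converts UTF-16 code units into a std::wstring.
std::wstring utf16ToWide(const std::u16string& in)
{
    std::wstring out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        const char16_t unit = in[i];
        if (sizeof(wchar_t) == 2)
        {
            // 16-bit wchar_t: the wide string is UTF-16 too, copy unit by unit.
            out.push_back(static_cast<wchar_t>(unit));
            continue;
        }
        // Wider wchar_t: combine a surrogate pair into one code point.
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        const bool hasLow = i + 1 < in.size()
                            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF;
        if (isHigh && hasLow)
        {
            const char32_t cp = 0x10000
                + ((static_cast<char32_t>(unit) - 0xD800) << 10)
                + (static_cast<char32_t>(in[i + 1]) - 0xDC00);
            out.push_back(static_cast<wchar_t>(cp));
            ++i; // the low surrogate has been consumed as well
        }
        else
        {
            // BMP character, or an unpaired surrogate passed through unchanged.
            out.push_back(static_cast<wchar_t>(unit));
        }
    }
    // Sanity check: the result never has more units than the input.
    assert(out.size() <= in.size());
    return out;
}
```

A caller would pass the UTF-16 text and hand the resulting wide string to whatever expects wchar_t, e.g. `std::wstring w = utf16ToWide(u"help");`. Whether CLucene actually needs such a conversion in this port is an open question.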
