RE: [docbook-apps] Japanese index
On 2018-04-25 Jirka Kosek wrote:
> On 25.4.2018 11:57, Tony Graham wrote:
> > Sorry, but I've never used it, so all that I know is what's on the website.
>
> I see. After digging some old emails I have been able to find this link:
>
> https://www.antennahouse.com/i18n-support-library-2/
>
> It contains open-source part of library that should be working with
> "kimber" method in the stylesheets. This could provide Jan with correct
> Japanese indexing.

Oops, I'd forgotten there are other indexing methods. That 'kimber' method looks very promising.

Thanks a lot for the link to the Saxon extension. The original link given at http://www.sagehill.net/docbookxsl/IndexIntl.html is broken now.

If I understand correctly, comparing the open-source version 1 with version 2, the latter brings enhancements in Chinese sorting and support for additional languages. Neither is directly related to Japanese, so I'll start with the open-source version.

Thanks, Jan

- To unsubscribe, e-mail: docbook-apps-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: docbook-apps-h...@lists.oasis-open.org
Re: [docbook-apps] Japanese index
I forgot to mention that Eliot pointed me to a new package of i18n support that he created for DITA. It is packaged for the DITA-OT, but it can be adapted for use outside of DITA. It is written for Saxon 9, however. It is available on GitHub at:

https://github.com/dita-community/org.dita-community.i18n

Bob Stayton
Sagehill Enterprises
b...@sagehill.net

On 4/25/2018 3:10 AM, Jirka Kosek wrote:
> On 25.4.2018 11:57, Tony Graham wrote:
>> Sorry, but I've never used it, so all that I know is what's on the website.
>
> I see. After digging some old emails I have been able to find this link:
>
> https://www.antennahouse.com/i18n-support-library-2/
>
> It contains open-source part of library that should be working with
> "kimber" method in the stylesheets. This could provide Jan with correct
> Japanese indexing.
Re: [docbook-apps] Japanese index
On 25/04/2018 11:10, Jirka Kosek wrote:
> On 25.4.2018 11:57, Tony Graham wrote:
>> Sorry, but I've never used it, so all that I know is what's on the website.
>
> I see. After digging some old emails I have been able to find this link:
>
> https://www.antennahouse.com/i18n-support-library-2/
>
> It contains open-source part of library that should be working with
> "kimber" method in the stylesheets. This could provide Jan with correct
> Japanese indexing.

As relayed to me:

Eliot Kimber originally developed the i18n Library for one of his customers and made it open source. Antenna House made some minor corrections and improvements and made those available under the open source license as the Support Library, with no formal support.

At the same time, Antenna House added Chinese sorting (both Traditional and Simplified), enhanced the library for DocBook, and offered official support. Over the years Antenna House has further enhanced and greatly improved the sorting module, added additional languages, and created stylesheets (and developed the PDF5-ML DITA plugin).

Regards,

Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
Skerries, Ireland
tgra...@antenna.co.jp
Re: [docbook-apps] Japanese index
Hello all,

Thanks for tracking down that package, Jirka. I haven't tested it, but that should work with Japanese using the kimber indexing method as described in my book:

http://www.sagehill.net/docbookxsl/IndexIntl.html#KimberIndexMethod

I contacted Eliot Kimber, the author of the i18n_support library for whom the "kimber" method was named. He informed me that the original library was under the GNU Lesser GPL license, and Antenna House took it in 2008, enhanced it by paying for the building of a complete Traditional Chinese dictionary, and made it their commercial product.

After that date, he made further enhancements, including locating an open source Chinese dictionary. I have a copy of a later version, but I don't think it has that dictionary. He says it is still under the GNU license and can be distributed. I will compare the two versions and will eventually put a package up on the DocBook Wiki for others to use. But for now, the Antenna House version should work.

Bob Stayton
Sagehill Enterprises
b...@sagehill.net

On 4/25/2018 3:10 AM, Jirka Kosek wrote:
> On 25.4.2018 11:57, Tony Graham wrote:
>> Sorry, but I've never used it, so all that I know is what's on the website.
>
> I see. After digging some old emails I have been able to find this link:
>
> https://www.antennahouse.com/i18n-support-library-2/
>
> It contains open-source part of library that should be working with
> "kimber" method in the stylesheets. This could provide Jan with correct
> Japanese indexing.
Re: [docbook-apps] Japanese index
On 25.4.2018 11:57, Tony Graham wrote:
> Sorry, but I've never used it, so all that I know is what's on the website.

I see. After digging some old emails I have been able to find this link:

https://www.antennahouse.com/i18n-support-library-2/

It contains open-source part of library that should be working with "kimber" method in the stylesheets. This could provide Jan with correct Japanese indexing.

--
Jirka Kosek    e-mail: ji...@kosek.cz    http://xmlguru.cz
Professional XML and Web consulting and training services
DocBook/DITA customization, custom XSLT/XSL-FO document processing
Bringing you XML Prague conference    http://xmlprague.cz
Re: [docbook-apps] Japanese index
On 25/04/2018 10:50, Jirka Kosek wrote:
...
>> It's probably not what you want to hear, but Antenna House does have a
>> commercial product for doing DocBook indexes:
>>
>> https://www.antennahouse.com/antenna1/i18n-index-library/
>
> Isn't this a newer version of the library that is needed for the "kimber"
> indexing method? I thought that Eliot intended to convince AH to make this
> library open-source, but it seems that my memory is wrong.

Sorry, but I've never used it, so all that I know is what's on the website.

Regards,

Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
Skerries, Ireland
tgra...@antenna.co.jp
Re: [docbook-apps] Japanese index
On 24.4.2018 21:53, Tony Graham wrote:
>> But it is still unclear how to tweak the index code to generate groups
>> from non-latin characters.
>
> I don't know, either.

The DocBook stylesheets support three methods of indexing, see:

http://www.sagehill.net/docbookxsl/IndexIntl.html

In the "kosek" method you can easily define groups based on the first or first two characters of indexed words. Unfortunately there is currently no suitable definition for Japanese, and my knowledge of Japanese is not enough to create one. But the internals of this method are described in the following paper:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.2069&rep=rep1&type=pdf

This might give you enough of a clue to adapt it to Japanese. If you are successful, it would be great if you could contribute the definitions back to the stylesheets. Feel free to contact me if you need more info.

> It's probably not what you want to hear, but Antenna House does have a
> commercial product for doing DocBook indexes:
>
> https://www.antennahouse.com/antenna1/i18n-index-library/

Isn't this a newer version of the library that is needed for the "kimber" indexing method? I thought that Eliot intended to convince AH to make this library open-source, but it seems that my memory is wrong.

--
Jirka Kosek    e-mail: ji...@kosek.cz    http://xmlguru.cz
Professional XML and Web consulting and training services
DocBook/DITA customization, custom XSLT/XSL-FO document processing
Bringing you XML Prague conference    http://xmlprague.cz
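The grouping step discussed in this message (mapping the first character of an entry to an index heading) can be illustrated outside the stylesheets. The Python sketch below is not the stylesheets' own mechanism and not a ready-made Japanese definition for them. The table `GOJUON_ROWS`, the function names, the placement of ヴ, ヵ and ヶ, and the choice to send everything else to "Symbols" are all assumptions made for illustration. It files a kana reading under its gojūon row.

```python
# Hypothetical grouping table: (index heading, katakana filed under it).
# Voiced and small kana are filed with their base row; putting ヴ with the
# ア row and ヵ/ヶ with the カ row is an assumption, not an established rule.
GOJUON_ROWS = [
    ("ア", "ァアィイゥウェエォオヴ"),
    ("カ", "カガキギクグケゲコゴヵヶ"),
    ("サ", "サザシジスズセゼソゾ"),
    ("タ", "タダチヂッツヅテデトド"),
    ("ナ", "ナニヌネノ"),
    ("ハ", "ハバパヒビピフブプヘベペホボポ"),
    ("マ", "マミムメモ"),
    ("ヤ", "ャヤュユョヨ"),
    ("ラ", "ラリルレロ"),
    ("ワ", "ヮワヰヱヲン"),
]


def to_katakana(text):
    """Map hiragana (U+3041..U+3096) to katakana; leave other characters alone."""
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


def index_group(reading):
    """Return the index heading for a kana reading, or "Symbols" as a fallback."""
    if not reading:
        return "Symbols"
    first = to_katakana(reading[0])
    for heading, members in GOJUON_ROWS:
        if first in members:
            return heading
    return "Symbols"


if __name__ == "__main__":
    for reading in ("さくいん", "ドキュメント", "XML"):
        print(reading, "->", index_group(reading))
```

The function expects a phonetic reading (for example a 'sortas' value), not the kanji spelling of the entry, because kanji alone do not determine the row.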
Re: [docbook-apps] Japanese index
On 24/04/2018 19:39, Jan Tosovsky wrote:
> has anybody any experience with generating Japanese back-of-the-book index
> from DocBook source?

More than 20 years ago.

> I am facing same issues discussed in this old thread (all entries end up
> in the Symbols section):
> https://lists.oasis-open.org/archives/docbook-apps/200605/msg00063.html
>
> If I understand correctly, indices in Japanese should be grouped
> phonetically:
> https://www.slideshare.net/k16shikano/imybp-light
>
> I've found promising Kuromoji library
> https://github.com/atilika/kuromoji
>
> I can imagine it could somehow pre-process all index entries and generate
> values for the 'sortas' attribute.

Slide 35 of those slides shows a corner case that a morphological analyzer could get wrong.

(I'm not able to test it, myself.) If you were using 'kuromoji', you could concatenate the values of the 'Reading' feature for all of the parts of speech of an index entry and use that as the 'sortas' value.

> But it is still unclear how to tweak the index code to generate groups
> from non-latin characters.

I don't know, either.

> Or are there better ways?

It's probably not what you want to hear, but Antenna House does have a commercial product for doing DocBook indexes:

https://www.antennahouse.com/antenna1/i18n-index-library/

Regards,

Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
Skerries, Ireland
tgra...@antenna.co.jp
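The pre-processing idea in this message (concatenating the analyzer's readings and using the result as the 'sortas' value) might look roughly like the Python sketch below. It is untested against any real analyzer. `fake_analyze` is a hypothetical stand-in for a morphological analyzer such as Kuromoji, its lookup table is made up for illustration, and the sample markup assumes un-namespaced DocBook `indexterm` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical analyzer output: term -> list of (surface, reading) tokens.
# A real pipeline would get these tokens from a morphological analyzer.
FAKE_TOKENS = {
    "日本語索引": [("日本語", "ニホンゴ"), ("索引", "サクイン")],
    "索引": [("索引", "サクイン")],
}


def fake_analyze(text):
    """Stand-in analyzer: return (surface, reading) pairs, or [] if unknown."""
    return FAKE_TOKENS.get(text, [])


def sort_key(tokens):
    """Concatenate the readings of all tokens into one sort key."""
    return "".join(reading for _surface, reading in tokens)


def add_sortas(doc_xml, analyze):
    """Set 'sortas' on index terms that lack it, using the analyzer's readings."""
    root = ET.fromstring(doc_xml)
    for tag in ("primary", "secondary", "tertiary"):
        for element in root.iter(tag):
            if element.get("sortas") is None and element.text:
                tokens = analyze(element.text.strip())
                if tokens:  # leave terms the analyzer cannot read untouched
                    element.set("sortas", sort_key(tokens))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "<para><indexterm><primary>日本語索引</primary></indexterm></para>"
    print(add_sortas(sample, fake_analyze))
```

Existing 'sortas' attributes are left alone, so readings the analyzer gets wrong (such as the corner case mentioned for slide 35) can still be corrected by hand in the source.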