Re: Highlighted field gets truncated

2008-04-22 Thread Christian Wittern
Mike Klaas wrote: On 19-Apr-08, at 3:02 AM, Christian Wittern wrote: So it could be that the match is not part of the fragment? This sounds a bit strange. Is there a way to make sure the fragment contains the match other than returning the whole field and do the fragmenting myself

Re: Highlighted field gets truncated

2008-04-19 Thread Christian Wittern
the whole field and do the fragmenting myself? Fragments are returned as an xml list; you can combine them together however you like in client code. Solr can merge adjacent fragments for you if you wish. I see. That is great. Thanks, Christian -- Christian Wittern Institute for Research

Highlighted field gets truncated

2008-04-18 Thread Christian Wittern
=solr-tei.xsl Any hint on how to debug this would be highly appreciated! All the best, Christian -- Christian Wittern Institute for Research in Humanities, Kyoto University 47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN

Re: Result based sorting for KWIC?

2008-03-11 Thread Christian Wittern
-- Christian Wittern Institute for Research in Humanities, Kyoto University 47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN

Result based sorting for KWIC?

2008-03-10 Thread Christian Wittern
implement this? Any ideas appreciated, Christian -- Christian Wittern Institute for Research in Humanities, Kyoto University 47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN

Re: invalid XML character

2008-03-01 Thread Christian Wittern
value. The easiest place to fix it is before the field values are serialized into XML. Indeed! All the best, Christian -- Christian Wittern Institute for Research in Humanities, Kyoto University 47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN

Re: no support for CJK characters from Extension B in Solr

2008-02-28 Thread Christian Wittern
Ken Krugler wrote: What was the actual format of the Extension B characters in the XML being posted? I tried both a binary (UTF-8) format and numeric character representation of the type #x2; -- the results were the same. Christian -- Christian Wittern Institute for Research

Re: no support for CJK characters from Extension B in Solr

2008-02-28 Thread Christian Wittern
the example directory -- I am just assuming that this is doing The Right Thing:-) The encoding is (also?) specified in the XML file itself as UTF-8. Christian -- Christian Wittern Institute for Research in Humanities, Kyoto University 47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265

no support for CJK characters from Extension B in Solr

2008-02-27 Thread Christian Wittern
sets, some of the characters in everyday use in Japan are now encoded in this area. It therefore seems highly desirable that this problem gets solved. I am testing this on a Mac OS X 10.5.2 system, with Java 1.5.0_13 and Solr 1.2.0. Any hints appreciated, Christian Wittern -- Christian

Re: no support for CJK characters from Extension B in Solr

2008-02-27 Thread Christian Wittern
Leonardo Santagada wrote: On 28/02/2008, at 00:23, Christian Wittern wrote: The documents I am trying to index with Solr contain characters from the CJK Extension B, which was added to Unicode in version 3.1 (March 2001). Just to give more information, does Java support this? I

help with using ngram analyser needed

2008-02-22 Thread Christian Wittern
-- Christian Wittern, Kyoto