Hi,

There are some serious issues with the encoding of data sent to Solr
in the released 3.6.0 version of SolrJ (HttpSolrServer), for example:
https://issues.apache.org/jira/browse/SOLR-3375

I believe your issue should already be fixed in the 3.6 branch. The
contents of that branch will eventually become Solr 3.6.1.

For now I recommend you use the Commons version of the Solr server
(CommonsHttpSolrServer) if you need to stay on a released version, or
check out the fixed version from the 3.6 branch.
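
To illustrate why a wrong request charset could leave the analyzer with
nothing to tokenize, here is a minimal, standalone Java sketch. It does
not use SolrJ at all; the class name CharsetLossDemo and the helper
roundTrip are made up for this example, and ISO-8859-1 is only an assumed
stand-in for whatever non-UTF-8 charset ends up being used on the wire.
Arabic characters cannot be represented in ISO-8859-1, so each one is
replaced with '?' when the string is encoded:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of SolrJ.
class CharsetLossDemo {

    // Encode the text with the given charset, then decode the bytes
    // again with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] wire = text.getBytes(charset);
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        // Arabic letters TAH, ALEF, LAM, BEH, written as escapes so the
        // encoding of this source file does not matter.
        String arabic = "\u0637\u0627\u0644\u0628";

        String viaUtf8 = roundTrip(arabic, StandardCharsets.UTF_8);
        String viaLatin1 = roundTrip(arabic, StandardCharsets.ISO_8859_1);

        System.out.println(viaUtf8.equals(arabic)); // true
        System.out.println(viaLatin1);              // ????
    }
}
```

If something similar happens to the field value on its way to Solr, the
analyzer only ever sees question marks, which would be consistent with
the zero tokens you are getting.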

--
 Sami Siren

On Thu, May 24, 2012 at 6:23 PM, Shane Perry <thry...@gmail.com> wrote:
> Hi,
>
> When I upgraded from 3.5 to 3.6, I saw that CommonsHttpSolrServer is
> deprecated in favor of HttpSolrServer.  After updating my code and
> running my unit tests, I have one test that fails.  Digging into it, I
> found that FieldAnalysisRequest was returning zero tokens for the test
> string (which is Arabic).  I am at a loss for what steps to take next
> and would appreciate any direction that could be given.  I have
> included a unit test which demonstrates the behavior.
>
> My field's schema is:
>
>    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>      <analyzer>
>        <charFilter class="solr.HTMLStripCharFilterFactory" />
>        <tokenizer class="solr.StandardTokenizerFactory" />
>        <filter class="solr.StandardFilterFactory" />
>        <filter class="solr.TrimFilterFactory" />
>        <filter class="solr.LowerCaseFilterFactory" />
>      </analyzer>
>    </fieldType>
>
> I've also tried using the LegacyHTMLStripCharFilterFactory but with
> the same results.
>
> Thanks,
>
> Shane
>
>
> =====================
>
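> import static org.junit.Assert.assertTrue;
>
> import java.net.MalformedURLException;
> import java.util.Iterator;
> import java.util.LinkedList;
> import java.util.List;
> import java.util.Map.Entry;
>
> import org.apache.solr.client.solrj.SolrRequest.METHOD;
> import org.apache.solr.client.solrj.SolrServer;
> import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
> import org.apache.solr.client.solrj.impl.HttpSolrServer;
> import org.apache.solr.client.solrj.impl.XMLResponseParser;
> import org.apache.solr.client.solrj.request.FieldAnalysisRequest;
> import org.apache.solr.client.solrj.response.AnalysisResponseBase.AnalysisPhase;
> import org.apache.solr.client.solrj.response.AnalysisResponseBase.TokenInfo;
> import org.apache.solr.client.solrj.response.FieldAnalysisResponse;
> import org.apache.solr.client.solrj.response.FieldAnalysisResponse.Analysis;
> import org.junit.Test;
>
> public class TestServer {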
>
>  private static final String HOST = "http://localhost:8080/junit-master";
>  private static final String ARABIC_TEXT = "ﺐﻃﺎﻠﺒﻳ";
>  private static final String ARABIC_FIELD = "text";
>
>  @Test
>  public void testArabicCommonsHttpServer() throws Exception {
>    CommonsHttpSolrServer server = null;
>    try {
>      server = new CommonsHttpSolrServer(HOST);
>
>      server.setParser(new XMLResponseParser());
>    } catch (MalformedURLException ex) {
>    }
>
>    assertTrue(server != null);
>
>    List<String> tokens = analysis(analysis(server, ARABIC_FIELD, ARABIC_TEXT));
>
>    assertTrue(!tokens.isEmpty());
>  }
>
>  @Test
>  public void testArabicHttpServer() throws Exception {
>    HttpSolrServer server = new HttpSolrServer(HOST);
>
>    server.setParser(new XMLResponseParser());
>
>    assertTrue(server != null);
>
>    List<String> tokens = analysis(analysis(server, ARABIC_FIELD, ARABIC_TEXT));
>
>    assertTrue(!tokens.isEmpty());
>  }
>
>  private static FieldAnalysisResponse analysis(SolrServer server,
>          String field, String text) {
>    FieldAnalysisResponse response = null;
>
>    FieldAnalysisRequest request = new FieldAnalysisRequest("/analysis/field").
>            addFieldName(field).
>            setFieldValue(text).
>            setQuery(text);
>
>    request.setMethod(METHOD.POST);
>
>    try {
>      response = request.process(server);
>    } catch (Exception ex) {
>    }
>
>    return response;
>  }
>
>  private static List<String> analysis(FieldAnalysisResponse response) {
>    List<String> token = new LinkedList<String>();
>
>    if (response == null) {
>      return token;
>    }
>
>    Iterator<Entry<String, Analysis>> iterator = response.
>            getAllFieldNameAnalysis().iterator();
>    if (iterator.hasNext()) {
>      Entry<String, Analysis> entry = iterator.next();
>      Iterator<AnalysisPhase> phaseIterator =
>              entry.getValue().getQueryPhases().iterator();
>
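>      // Start with an empty list so the loop below is safe when there are no phases
>      List<TokenInfo> tokens = new LinkedList<TokenInfo>();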
>      while (phaseIterator.hasNext()) {
>        tokens = phaseIterator.next().getTokens(); // Only need the last one
>      }
>
>      for (TokenInfo ti : tokens) {
>        token.add(ti.getText());
>      }
>    }
>
>    return token;
>  }
> }
