Hi, there are some serious issues with the encoding of the data sent to Solr in the released 3.6.0 version of SolrJ (HttpSolrServer), for example: https://issues.apache.org/jira/browse/SOLR-3375
I believe your issue should already be fixed in the 3.6 branch. The contents of that branch will eventually become Solr 3.6.1. For now I recommend that you use the Commons version of the Solr server (CommonsHttpSolrServer) if you need to stay on a released version, or else check out the fixed version from the 3.6 branch.

--
 Sami Siren

On Thu, May 24, 2012 at 6:23 PM, Shane Perry <thry...@gmail.com> wrote:
> Hi,
>
> Upgrading from 3.5 to 3.6, I found that CommonsHttpSolrServer was
> deprecated in favor of HttpSolrServer. After updating my code and
> running my unit tests, I have one test that fails. Digging into it I
> found that FieldAnalysisRequest was returning zero tokens for the test
> string (which is Arabic). I am at a loss for what steps to take next
> and would appreciate any direction that could be given. I have
> included a unit test which demonstrates the behavior.
>
> My field's schema is:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <charFilter class="solr.HTMLStripCharFilterFactory" />
>     <tokenizer class="solr.StandardTokenizerFactory" />
>     <filter class="solr.StandardFilterFactory" />
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.LowerCaseFilterFactory" />
>   </analyzer>
> </fieldType>
>
> I've also tried using the LegacyHTMLStripCharFilterFactory but with
> the same results.
>
> Thanks,
>
> Shane
>
>
> =====================
>
> public class TestServer {
>
>   private static final String HOST = "http://localhost:8080/junit-master";
>   private static final String ARABIC_TEXT = "ﺐﻃﺎﻠﺒﻳ";
>   private static final String ARABIC_FIELD = "text";
>
>   @Test
>   public void testArabicCommonsHttpServer() throws Exception {
>     CommonsHttpSolrServer server = null;
>     try {
>       server = new CommonsHttpSolrServer(HOST);
>
>       server.setParser(new XMLResponseParser());
>     } catch (MalformedURLException ex) {
>     }
>
>     assertTrue(server != null);
>
>     List<String> tokens = analysis(analysis(server, ARABIC_FIELD,
>         ARABIC_TEXT));
>
>     assertTrue(!tokens.isEmpty());
>   }
>
>   @Test
>   public void testArabicHttpServer() throws Exception {
>     HttpSolrServer server = new HttpSolrServer(HOST);
>
>     server.setParser(new XMLResponseParser());
>
>     assertTrue(server != null);
>
>     List<String> tokens = analysis(analysis(server, ARABIC_FIELD,
>         ARABIC_TEXT));
>
>     assertTrue(!tokens.isEmpty());
>   }
>
>   private static FieldAnalysisResponse analysis(SolrServer server,
>       String field, String text) {
>     FieldAnalysisResponse response = null;
>
>     FieldAnalysisRequest request = new FieldAnalysisRequest("/analysis/field").
>         addFieldName(field).
>         setFieldValue(text).
>         setQuery(text);
>
>     request.setMethod(METHOD.POST);
>
>     try {
>       response = request.process(server);
>     } catch (Exception ex) {
>     }
>
>     return response;
>   }
>
>   private static List<String> analysis(FieldAnalysisResponse response) {
>     List<String> token = new LinkedList<String>();
>
>     if (response == null) {
>       return token;
>     }
>
>     Iterator<Entry<String, Analysis>> iterator = response.
>         getAllFieldNameAnalysis().iterator();
>     if (iterator.hasNext()) {
>       Entry<String, Analysis> entry = iterator.next();
>       Iterator<AnalysisPhase> phaseIterator =
>           entry.getValue().getQueryPhases().iterator();
>
>       List<TokenInfo> tokens = null;
>       while (phaseIterator.hasNext()) {
>         tokens = phaseIterator.next().getTokens(); // Only need the last one
>       }
>
>       for (TokenInfo ti : tokens) {
>         token.add(ti.getText());
>       }
>     }
>
>     return token;
>   }
> }
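
For anyone who wants to see the general kind of failure without a Solr instance, here is a minimal sketch using only the JDK. It assumes (this is my assumption, not something taken from the SOLR-3375 patch) that the symptom comes from POST parameters being form-encoded on the client with a charset other than UTF-8 while the server decodes them as UTF-8. The class name EncodingSketch and the method roundTrip are made up for illustration and are not part of SolrJ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical illustration only; this is not SolrJ code.
class EncodingSketch {

    // Form-encode the text with the charset the client uses, then decode
    // the result as UTF-8, as a server expecting UTF-8 would.
    static String roundTrip(String text, String clientCharset) {
        try {
            String wire = URLEncoder.encode(text, clientCharset);
            return URLDecoder.decode(wire, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Four Arabic letters, written as escapes so the source file
        // encoding does not matter.
        String arabic = "\u0637\u0627\u0644\u0628";

        // Client and server agree on UTF-8: the text survives.
        System.out.println(roundTrip(arabic, "UTF-8").equals(arabic));      // true

        // Client encodes as ISO-8859-1, which cannot represent Arabic:
        // the text no longer matches after decoding.
        System.out.println(roundTrip(arabic, "ISO-8859-1").equals(arabic)); // false
    }
}
```

If the analyzer on the server receives text damaged in this way, getting zero tokens back for an Arabic test string would be consistent with what the unit test above reports, although I have not verified that this is the exact code path involved.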