[ 
https://issues.apache.org/jira/browse/SOLR-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823714#comment-13823714
 ] 

Vadym Lotar commented on SOLR-5364:
-----------------------------------

Interesting...

From the documentation:

solr.StandardTokenizerFactory
Creates org.apache.lucene.analysis.standard.StandardTokenizer.
A good general purpose tokenizer that strips many extraneous characters and 
sets token types to meaningful values. Token types are only useful for 
subsequent token filters that are type-aware of the same token types. There 
aren't any filters that use StandardTokenizer's types. Word boundary rules from 
Unicode standard annex UAX#29

So UAX#29 is also mentioned here; however, UAX29URLEmailTokenizer extends the 
abstract class Tokenizer and is not related to StandardTokenizer.

The problem is that we cannot actually remove it from the configuration because 
it does exactly what we need.

Do you think it's not related to the JVM parameter configuration or to a 
deadlock somewhere in the index writer?
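
One way to test the deadlock theory is to ask the JVM itself whether it sees a 
lock cycle. The following is only a minimal sketch, assuming it runs inside the 
JVM you want to inspect; the class name DeadlockCheck and its method are 
hypothetical and not part of Solr or Lucene. It relies on the standard 
java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is not from Solr or Lucene.
public class DeadlockCheck {

    // Returns one description per deadlocked thread in the current JVM,
    // or an empty list when the JVM reports no deadlock.
    public static List<String> describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Covers monitor locks and java.util.concurrent ownable synchronizers;
        // returns null when nothing is deadlocked.
        long[] ids = threads.findDeadlockedThreads();
        List<String> result = new ArrayList<>();
        if (ids == null) {
            return result;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have exited in the meantime
                result.add(info.getThreadName() + " waiting on " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> deadlocks = describeDeadlocks();
        if (deadlocks.isEmpty()) {
            System.out.println("No deadlocked threads reported");
        } else {
            for (String line : deadlocks) {
                System.out.println(line);
            }
        }
    }
}
```

An empty result would not rule out a stall: threads that are all waiting on 
I/O or on an exhausted pool do not form a lock cycle and would not be flagged 
by this check.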

> SolrCloud stops accepting updates
> ---------------------------------
>
>                 Key: SOLR-5364
>                 URL: https://issues.apache.org/jira/browse/SOLR-5364
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.4, 4.5, 4.6
>            Reporter: Chris
>            Priority: Blocker
>
> I'm attempting to import data into a SolrCloud cluster. After a certain 
> amount of time, the cluster stops accepting updates.
> I have tried numerous suggestions in IRC from Elyorag and others without 
> resolution.
> I have had this issue with 4.4 and understood there was a deadlock issue 
> fixed in 4.5, but that hasn't resolved the issue, and neither have the 4.6 
> snapshots.
> I've tried Tomcat, with various Tomcat threading configuration changes, and 
> Jetty. I also tried various index merging configurations, as I initially 
> thought there was a deadlock in the concurrent merge scheduler; however, the 
> same issue occurs with SerialMergeScheduler.
> The cluster stops accepting updates after some amount of time; this varies 
> and is inconsistent. Sometimes I manage to index 400k docs, other times 
> ~1 million. Querying the cluster continues to work. I can reproduce the 
> issue consistently, and it is currently blocking our transition to Solr.
> I can provide stack traces, thread dumps, jstack dumps as required.
> Here are two jstacks thus far:
> http://pastebin.com/1ktjBYbf
> http://pastebin.com/8JiQc3rb
> I got these jstacks from the latest 4.6 snapshot, also running the SolrJ 
> snapshot. The issue is also consistently reproducible with 
> BinaryRequestWriter.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
