[jira] [Resolved] (SOLR-9416) Exception writing document

Erick Erickson (JIRA) Tue, 16 Aug 2016 09:39:39 -0700

     [ 
https://issues.apache.org/jira/browse/SOLR-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Erick Erickson resolved SOLR-9416.
----------------------------------
    Resolution: Invalid

first, please raise issues like this on the user's list before raising a JIRA 
to be sure it's not pilot error.

In this case you can see the root cause of your problem by the line:
Caused by: java.lang.OutOfMemoryError: Java heap space

You're simply running with too little Java heap allocated for what you're 
trying to do. This is not a bug in Solr, but a problem with your environment.

> Exception writing document
> --------------------------
>
>                 Key: SOLR-9416
>                 URL: https://issues.apache.org/jira/browse/SOLR-9416
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: clients - java
>    Affects Versions: 5.2.1
>         Environment: Solr on AIX
>            Reporter: Kiran Chowdhury
>            Priority: Blocker
>
> We are getting following exception while writting into solr .. its not that 
> coming always.. but very freequent.. once we restart solr this problem go 
> away but come back again after couple of days .. 
> Stack trace from solr admin log ->
> org.apache.solr.common.SolrException: Exception writing document id 41184272 
> to the index; possible analysis error.
>       at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
>       at 
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>       at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>       at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>       at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>       at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>       at 
> org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
>       at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
>       at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
>       at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
>       at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>       at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>       at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>       at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>       at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:669)
>       at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:462)
>       at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:210)
>       at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
>       at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>       at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>       at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>       at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>       at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>       at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>       at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>       at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>       at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>       at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>       at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>       at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
>       at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>       at org.eclipse.jetty.server.Server.handle(Server.java:499)
>       at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>       at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
>       at 
> org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
>       at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>       at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>       at java.lang.Thread.run(Thread.java:785)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter 
> is closed
>       at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:719)
>       at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:733)
>       at 
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1471)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:163)
>       ... 37 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
>       at org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:354)
>       at 
> org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.writeField(CompressingStoredFieldsWriter.java:297)
>       at 
> org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:361)
>       at 
> org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:300)
>       at 
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:234)
>       at 
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:450)
>       at 
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1475)
>       ... 39 more
> schema.xml ->
> <?xml version="1.0" encoding="UTF-8" ?>
> <!--
>  Licensed to the Apache Software Foundation (ASF) under one or more
>  contributor license agreements.  See the NOTICE file distributed with
>  this work for additional information regarding copyright ownership.
>  The ASF licenses this file to You under the Apache License, Version 2.0
>  (the "License"); you may not use this file except in compliance with
>  the License.  You may obtain a copy of the License at
>      http://www.apache.org/licenses/LICENSE-2.0
>  Unless required by applicable law or agreed to in writing, software
>  distributed under the License is distributed on an "AS IS" BASIS,
>  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>  See the License for the specific language governing permissions and
>  limitations under the License.
> -->
> <!-- 
>      For more details about configurations options that may appear in
>      this file, see http://wiki.apache.org/solr/SolrConfigXml. 
> -->
> <config>
>   <!-- In all configuration below, a prefix of "solr." for class names
>        is an alias that causes solr to search appropriate packages,
>        including org.apache.solr.(search|update|request|core|analysis)
>        You may also specify a fully qualified Java classname if you
>        have your own custom plugins.
>     -->
>   <!-- Controls what version of Lucene various components of Solr
>        adhere to.  Generally, you want to use the latest version to
>        get all bug fixes and improvements. It is highly recommended
>        that you fully re-index after changing this setting as it can
>        affect both how text is indexed and queried.
>   -->
>   <luceneMatchVersion>5.2.1</luceneMatchVersion>
>   <!-- Data Directory
>        Used to specify an alternate directory to hold all index data
>        other than the default ./data under the Solr home.  If
>        replication is in use, this should match the replication
>        configuration.
>     -->
>   <dataDir>${solr.data.dir:}</dataDir>
>   <!-- The DirectoryFactory to use for indexes.
>        
>        solr.StandardDirectoryFactory is filesystem
>        based and tries to pick the best implementation for the current
>        JVM and platform.  solr.NRTCachingDirectoryFactory, the default,
>        wraps solr.StandardDirectoryFactory and caches small files in memory
>        for better NRT performance.
>        One can force a particular implementation via 
> solr.MMapDirectoryFactory,
>        solr.NIOFSDirectoryFactory, or solr.SimpleFSDirectoryFactory.
>        solr.RAMDirectoryFactory is memory based, not
>        persistent, and doesn't work with replication.
>     -->
>   <directoryFactory name="DirectoryFactory" 
>                     
> class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">
>   </directoryFactory> 
>   <!-- The CodecFactory for defining the format of the inverted index.
>        The default implementation is SchemaCodecFactory, which is the 
> official Lucene
>        index format, but hooks into the schema to provide per-field 
> customization of
>        the postings lists and per-document values in the fieldType element
>        (postingsFormat/docValuesFormat). Note that most of the alternative 
> implementations
>        are experimental, so if you choose to customize the index format, it's 
> a good
>        idea to convert back to the official format e.g. via 
> IndexWriter.addIndexes(IndexReader)
>        before upgrading to a newer version to avoid unnecessary reindexing.
>   -->
>   <codecFactory class="solr.SchemaCodecFactory"/>
>   <schemaFactory class="ClassicIndexSchemaFactory"/>
>   <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>        Index Config - These settings control low-level behavior of indexing
>        Most example settings here show the default value, but are commented
>        out, to more easily see where customizations have been made.
>        
>        Note: This replaces <indexDefaults> and <mainIndex> from older versions
>        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
> -->
>   <indexConfig>
>     <!-- LockFactory 
>          This option specifies which Lucene LockFactory implementation
>          to use.
>       
>          single = SingleInstanceLockFactory - suggested for a
>                   read-only index or when there is no possibility of
>                   another process trying to modify the index.
>          native = NativeFSLockFactory - uses OS native file locking.
>                   Do not use when multiple solr webapps in the same
>                   JVM are attempting to share a single index.
>          simple = SimpleFSLockFactory  - uses a plain file for locking
>          Defaults: 'native' is default for Solr3.6 and later, otherwise
>                    'simple' is the default
>          More details on the nuances of each LockFactory...
>          http://wiki.apache.org/lucene-java/AvailableLockFactories
>     -->
>     <lockType>${solr.lock.type:native}</lockType>
>     <!-- Lucene Infostream
>        
>          To aid in advanced debugging, Lucene provides an "InfoStream"
>          of detailed information when indexing.
>          Setting the value to true will instruct the underlying Lucene
>          IndexWriter to write its info stream to solr's log. By default,
>          this is enabled here, and controlled through log4j.properties.
>       -->
>      <infoStream>true</infoStream>
>   </indexConfig>
>   <!-- JMX
>        
>        This example enables JMX if and only if an existing MBeanServer
>        is found, use this if you want to configure JMX through JVM
>        parameters. Remove this to disable exposing Solr configuration
>        and statistics to JMX.
>        For more details see http://wiki.apache.org/solr/SolrJmx
>     -->
>   <jmx />
>   <!-- If you want to connect to a particular server, specify the
>        agentId 
>     -->
>   <!-- <jmx agentId="myAgent" /> -->
>   <!-- If you want to start a new MBeanServer, specify the serviceUrl -->
>   <!-- <jmx serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr"/>
>     -->
>   <!-- The default high-performance update handler -->
>   <updateHandler class="solr.DirectUpdateHandler2">
>     <!-- Enables a transaction log, used for real-time get, durability, and
>          and solr cloud replica recovery.  The log can grow as big as
>          uncommitted changes to the index, so use of a hard autoCommit
>          is recommended (see below).
>          "dir" - the target directory for transaction logs, defaults to the
>                 solr data directory.
>          "numVersionBuckets" - sets the number of buckets used to keep
>                 track of max version values when checking for re-ordered
>                 updates; increase this value to reduce the cost of
>                 synchronizing access to version buckets during high-volume
>                 indexing, this requires 8 bytes (long) * numVersionBuckets
>                 of heap space per Solr core.
>     -->
>     <updateLog>
>       <str name="dir">${solr.ulog.dir:}</str>
>       <int name="numVersionBuckets">${solr.ulog.numVersionBuckets:65536}</int>
>     </updateLog>
>  
>     <!-- AutoCommit
>          Perform a hard commit automatically under certain conditions.
>          Instead of enabling autoCommit, consider using "commitWithin"
>          when adding documents. 
>          http://wiki.apache.org/solr/UpdateXmlMessages
>          maxDocs - Maximum number of documents to add since the last
>                    commit before automatically triggering a new commit.
>          maxTime - Maximum amount of time in ms that is allowed to pass
>                    since a document was added before automatically
>                    triggering a new commit. 
>          openSearcher - if false, the commit causes recent index changes
>            to be flushed to stable storage, but does not cause a new
>            searcher to be opened to make those changes visible.
>          If the updateLog is enabled, then it's highly recommended to
>          have some sort of hard autoCommit to limit the log size.
>       -->
>      <autoCommit> 
>        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> 
>        <openSearcher>false</openSearcher> 
>      </autoCommit>
>     <!-- softAutoCommit is like autoCommit except it causes a
>          'soft' commit which only ensures that changes are visible
>          but does not ensure that data is synced to disk.  This is
>          faster and more near-realtime friendly than a hard commit.
>       -->
>      <autoSoftCommit> 
>        <maxTime>10</maxTime> 
>      </autoSoftCommit>
>   </updateHandler>
>   
>   <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>        Query section - these settings control query time things like caches
>        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
> -->
>   <query>
>     <!-- Max Boolean Clauses
>          Maximum number of clauses in each BooleanQuery,  an exception
>          is thrown if exceeded.
>          ** WARNING **
>          
>          This option actually modifies a global Lucene property that
>          will affect all SolrCores.  If multiple solrconfig.xml files
>          disagree on this property, the value at any given moment will
>          be based on the last SolrCore to be initialized.
>          
>       -->
>     <maxBooleanClauses>1024</maxBooleanClauses>
>     <!-- Solr Internal Query Caches
>          There are two implementations of cache available for Solr,
>          LRUCache, based on a synchronized LinkedHashMap, and
>          FastLRUCache, based on a ConcurrentHashMap.  
>          FastLRUCache has faster gets and slower puts in single
>          threaded operation and thus is generally faster than LRUCache
>          when the hit ratio of the cache is high (> 75%), and may be
>          faster under other scenarios on multi-cpu systems.
>     -->
>     <!-- Filter Cache
>          Cache used by SolrIndexSearcher for filters (DocSets),
>          unordered sets of *all* documents that match a query.  When a
>          new searcher is opened, its caches may be prepopulated or
>          "autowarmed" using data from caches in the old searcher.
>          autowarmCount is the number of items to prepopulate.  For
>          LRUCache, the autowarmed items will be the most recently
>          accessed items.
>          Parameters:
>            class - the SolrCache implementation LRUCache or
>                (LRUCache or FastLRUCache)
>            size - the maximum number of entries in the cache
>            initialSize - the initial capacity (number of entries) of
>                the cache.  (see java.util.HashMap)
>            autowarmCount - the number of entries to prepopulate from
>                and old cache.  
>       -->
>     <filterCache class="solr.FastLRUCache"
>                  size="512"
>                  initialSize="512"
>                  autowarmCount="0"/>
>     <!-- Query Result Cache
>         Caches results of searches - ordered lists of document ids
>         (DocList) based on a query, a sort, and the range of documents 
> requested.
>         Additional supported parameter by LRUCache:
>            maxRamMB - the maximum amount of RAM (in MB) that this cache is 
> allowed
>                       to occupy
>      -->
>     <queryResultCache class="solr.LRUCache"
>                      size="512"
>                      initialSize="512"
>                      autowarmCount="0"/>
>    
>     <!-- Document Cache
>          Caches Lucene Document objects (the stored fields for each
>          document).  Since Lucene internal document ids are transient,
>          this cache will not be autowarmed.  
>       -->
>     <documentCache class="solr.LRUCache"
>                    size="512"
>                    initialSize="512"
>                    autowarmCount="0"/>
>     
>     <!-- custom cache currently used by block join --> 
>     <cache name="perSegFilter"
>       class="solr.search.LRUCache"
>       size="10"
>       initialSize="0"
>       autowarmCount="10"
>       regenerator="solr.NoOpRegenerator" />
>     <!-- Lazy Field Loading
>          If true, stored fields that are not requested will be loaded
>          lazily.  This can result in a significant speed improvement
>          if the usual case is to not load all stored fields,
>          especially if the skipped fields are large compressed text
>          fields.
>     -->
>     <enableLazyFieldLoading>true</enableLazyFieldLoading>
>    <!-- Result Window Size
>         An optimization for use with the queryResultCache.  When a search
>         is requested, a superset of the requested number of document ids
>         are collected.  For example, if a search for a particular query
>         requests matching documents 10 through 19, and queryWindowSize is 50,
>         then documents 0 through 49 will be collected and cached.  Any further
>         requests in that range can be satisfied via the cache.  
>      -->
>    <queryResultWindowSize>20</queryResultWindowSize>
>    <!-- Maximum number of documents to cache for any entry in the
>         queryResultCache. 
>      -->
>    <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
>     <!-- Use Cold Searcher
>          If a search request comes in and there is no current
>          registered searcher, then immediately register the still
>          warming searcher and use it.  If "false" then all requests
>          will block until the first searcher is done warming.
>       -->
>     <useColdSearcher>false</useColdSearcher>
>     <!-- Max Warming Searchers
>          
>          Maximum number of searchers that may be warming in the
>          background concurrently.  An error is returned if this limit
>          is exceeded.
>          Recommend values of 1-2 for read-only slaves, higher for
>          masters w/o cache warming.
>       -->
>     <maxWarmingSearchers>2</maxWarmingSearchers>
>   </query>
>   <!-- Request Dispatcher
>        This section contains instructions for how the SolrDispatchFilter
>        should behave when processing requests for this SolrCore.
>        handleSelect is a legacy option that affects the behavior of requests
>        such as /select?qt=XXX
>        handleSelect="true" will cause the SolrDispatchFilter to process
>        the request and dispatch the query to a handler specified by the 
>        "qt" param, assuming "/select" isn't already registered.
>        handleSelect="false" will cause the SolrDispatchFilter to
>        ignore "/select" requests, resulting in a 404 unless a handler
>        is explicitly registered with the name "/select"
>        handleSelect="true" is not recommended for new users, but is the 
> default
>        for backwards compatibility
>     -->
>   <requestDispatcher handleSelect="false" >
>     <!-- Request Parsing
>          These settings indicate how Solr Requests may be parsed, and
>          what restrictions may be placed on the ContentStreams from
>          those requests
>          enableRemoteStreaming - enables use of the stream.file
>          and stream.url parameters for specifying remote streams.
>          multipartUploadLimitInKB - specifies the max size (in KiB) of
>          Multipart File Uploads that Solr will allow in a Request.
>          
>          formdataUploadLimitInKB - specifies the max size (in KiB) of
>          form data (application/x-www-form-urlencoded) sent via
>          POST. You can use POST to pass request parameters not
>          fitting into the URL.
>          
>          addHttpRequestToContext - if set to true, it will instruct
>          the requestParsers to include the original HttpServletRequest
>          object in the context map of the SolrQueryRequest under the 
>          key "httpRequest". It will not be used by any of the existing
>          Solr components, but may be useful when developing custom 
>          plugins.
>          
>          *** WARNING ***
>          The settings below authorize Solr to fetch remote files, You
>          should make sure your system has some authentication before
>          using enableRemoteStreaming="true"
>       --> 
>     <requestParsers enableRemoteStreaming="true" 
>                     multipartUploadLimitInKB="2048000"
>                     formdataUploadLimitInKB="2048"
>                     addHttpRequestToContext="false"/>
>     <!-- HTTP Caching
>          Set HTTP caching related parameters (for proxy caches and clients).
>          The options below instruct Solr not to output any HTTP Caching
>          related headers
>       -->
>     <httpCaching never304="true" />
>   </requestDispatcher>
>   <!-- Request Handlers 
>        http://wiki.apache.org/solr/SolrRequestHandler
>        Incoming queries will be dispatched to a specific handler by name
>        based on the path specified in the request.
>        Legacy behavior: If the request path uses "/select" but no Request
>        Handler has that name, and if handleSelect="true" has been specified in
>        the requestDispatcher, then the Request Handler is dispatched based on
>        the qt parameter.  Handlers without a leading '/' are accessed this way
>        like so: http://host/app/[core/]select?qt=name  If no qt is
>        given, then the requestHandler that declares default="true" will be
>        used or the one named "standard".
>        If a Request Handler is declared with startup="lazy", then it will
>        not be initialized until the first request that uses it.
>     -->
>   <!-- SearchHandler
>        http://wiki.apache.org/solr/SearchHandler
>        For processing Search Queries, the primary Request Handler
>        provided with Solr is "SearchHandler" It delegates to a sequent
>        of SearchComponents (see below) and supports distributed
>        queries across multiple shards
>     -->
>   <requestHandler name="/select" class="solr.SearchHandler">
>     <!-- default values for query parameters can be specified, these
>          will be overridden by parameters in the request
>       -->
>      <lst name="defaults">
>        <str name="echoParams">explicit</str>
>        <int name="rows">10</int>
>      </lst>
>     </requestHandler>
>   <!-- A request handler that returns indented JSON by default -->
>   <requestHandler name="/query" class="solr.SearchHandler">
>      <lst name="defaults">
>        <str name="echoParams">explicit</str>
>        <str name="wt">json</str>
>        <str name="indent">true</str>
>        <str name="df">contenttype</str>
>      </lst>
>   </requestHandler>
>   <!--
>     The export request handler is used to export full sorted result sets.
>     Do not change these defaults.
>   -->
>   <requestHandler name="/export" class="solr.SearchHandler">
>     <lst name="invariants">
>       <str name="rq">{!xport}</str>
>       <str name="wt">xsort</str>
>       <str name="distrib">false</str>
>     </lst>
>     <arr name="components">
>       <str>query</str>
>     </arr>
>   </requestHandler>
>   <initParams path="/update/**,/query,/select,/tvrh,/elevate,/spell">
>     <lst name="defaults">
>       <str name="df">contenttype</str>
>     </lst>
>   </initParams>
>   <!-- Field Analysis Request Handler
>        RequestHandler that provides much the same functionality as
>        analysis.jsp. Provides the ability to specify multiple field
>        types and field names in the same request and outputs
>        index-time and query-time analysis for each of them.
>        Request parameters are:
>        analysis.fieldname - field name whose analyzers are to be used
>        analysis.fieldtype - field type whose analyzers are to be used
>        analysis.fieldvalue - text for index-time analysis
>        q (or analysis.q) - text for query time analysis
>        analysis.showmatch (true|false) - When set to true and when
>            query analysis is performed, the produced tokens of the
>            field value analysis will be marked as "matched" for every
>            token that is produces by the query analysis
>    -->
>   <requestHandler name="/analysis/field" 
>                   startup="lazy"
>                   class="solr.FieldAnalysisRequestHandler" />
>   <!-- Document Analysis Handler
>        http://wiki.apache.org/solr/AnalysisRequestHandler
>        An analysis handler that provides a breakdown of the analysis
>        process of provided documents. This handler expects a (single)
>        content stream with the following format:
>        <docs>
>          <doc>
>            <field name="id">1</field>
>            <field name="name">The Name</field>
>            <field name="text">The Text Value</field>
>          </doc>
>          <doc>...</doc>
>          <doc>...</doc>
>          ...
>        </docs>
>     Note: Each document must contain a field which serves as the
>     unique key. This key is used in the returned response to associate
>     an analysis breakdown to the analyzed document.
>     Like the FieldAnalysisRequestHandler, this handler also supports
>     query analysis by sending either an "analysis.query" or "q"
>     request parameter that holds the query text to be analyzed. It
>     also supports the "analysis.showmatch" parameter which when set to
>     true, all field tokens that match the query tokens will be marked
>     as a "match". 
>   -->
>   <requestHandler name="/analysis/document" 
>                   class="solr.DocumentAnalysisRequestHandler" 
>                   startup="lazy" />
>   <!-- Echo the request contents back to the client -->
>   <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" >
>     <lst name="defaults">
>      <str name="echoParams">explicit</str> 
>      <str name="echoHandler">true</str>
>     </lst>
>   </requestHandler>
>   
>   <!-- Search Components
>        Search components are registered to SolrCore and used by 
>        instances of SearchHandler (which can access them by name)
>        
>        By default, the following components are available:
>        
>        <searchComponent name="query"     class="solr.QueryComponent" />
>        <searchComponent name="facet"     class="solr.FacetComponent" />
>        <searchComponent name="mlt"       class="solr.MoreLikeThisComponent" />
>        <searchComponent name="highlight" class="solr.HighlightComponent" />
>        <searchComponent name="stats"     class="solr.StatsComponent" />
>        <searchComponent name="debug"     class="solr.DebugComponent" />
>        
>      -->
>   <!-- Terms Component
>        http://wiki.apache.org/solr/TermsComponent
>        A component to return terms and document frequency of those
>        terms
>     -->
>   <searchComponent name="terms" class="solr.TermsComponent"/>
>   <!-- A request handler for demonstrating the terms component -->
>   <requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
>      <lst name="defaults">
>       <bool name="terms">true</bool>
>       <bool name="distrib">false</bool>
>     </lst>     
>     <arr name="components">
>       <str>terms</str>
>     </arr>
>   </requestHandler>
>   <!-- Legacy config for the admin interface -->
>   <admin>
>     <defaultQuery>*:*</defaultQuery>
>   </admin>
> </config>
> solrconfig.xml ->
> <?xml version="1.0" encoding="UTF-8" ?>
> <!--
>  Licensed to the Apache Software Foundation (ASF) under one or more
>  contributor license agreements.  See the NOTICE file distributed with
>  this work for additional information regarding copyright ownership.
>  The ASF licenses this file to You under the Apache License, Version 2.0
>  (the "License"); you may not use this file except in compliance with
>  the License.  You may obtain a copy of the License at
>      http://www.apache.org/licenses/LICENSE-2.0
>  Unless required by applicable law or agreed to in writing, software
>  distributed under the License is distributed on an "AS IS" BASIS,
>  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>  See the License for the specific language governing permissions and
>  limitations under the License.
> -->
> <!-- 
>      For more details about configurations options that may appear in
>      this file, see http://wiki.apache.org/solr/SolrConfigXml. 
> -->
> <config>
>   <!-- In all configuration below, a prefix of "solr." for class names
>        is an alias that causes solr to search appropriate packages,
>        including org.apache.solr.(search|update|request|core|analysis)
>        You may also specify a fully qualified Java classname if you
>        have your own custom plugins.
>     -->
>   <!-- Controls what version of Lucene various components of Solr
>        adhere to.  Generally, you want to use the latest version to
>        get all bug fixes and improvements. It is highly recommended
>        that you fully re-index after changing this setting as it can
>        affect both how text is indexed and queried.
>   -->
>   <luceneMatchVersion>5.2.1</luceneMatchVersion>
>   <!-- Data Directory
>        Used to specify an alternate directory to hold all index data
>        other than the default ./data under the Solr home.  If
>        replication is in use, this should match the replication
>        configuration.
>     -->
>   <dataDir>${solr.data.dir:}</dataDir>
>   <!-- The DirectoryFactory to use for indexes.
>        
>        solr.StandardDirectoryFactory is filesystem
>        based and tries to pick the best implementation for the current
>        JVM and platform.  solr.NRTCachingDirectoryFactory, the default,
>        wraps solr.StandardDirectoryFactory and caches small files in memory
>        for better NRT performance.
>        One can force a particular implementation via 
> solr.MMapDirectoryFactory,
>        solr.NIOFSDirectoryFactory, or solr.SimpleFSDirectoryFactory.
>        solr.RAMDirectoryFactory is memory based, not
>        persistent, and doesn't work with replication.
>     -->
>   <directoryFactory name="DirectoryFactory" 
>                     
> class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">
>   </directoryFactory> 
>   <!-- The CodecFactory for defining the format of the inverted index.
>        The default implementation is SchemaCodecFactory, which is the 
> official Lucene
>        index format, but hooks into the schema to provide per-field 
> customization of
>        the postings lists and per-document values in the fieldType element
>        (postingsFormat/docValuesFormat). Note that most of the alternative 
> implementations
>        are experimental, so if you choose to customize the index format, it's 
> a good
>        idea to convert back to the official format e.g. via 
> IndexWriter.addIndexes(IndexReader)
>        before upgrading to a newer version to avoid unnecessary reindexing.
>   -->
>   <codecFactory class="solr.SchemaCodecFactory"/>
>   <schemaFactory class="ClassicIndexSchemaFactory"/>
>   <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>        Index Config - These settings control low-level behavior of indexing
>        Most example settings here show the default value, but are commented
>        out, to more easily see where customizations have been made.
>        
>        Note: This replaces <indexDefaults> and <mainIndex> from older versions
>        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
> -->
>   <indexConfig>
>     <!-- LockFactory 
>          This option specifies which Lucene LockFactory implementation
>          to use.
>       
>          single = SingleInstanceLockFactory - suggested for a
>                   read-only index or when there is no possibility of
>                   another process trying to modify the index.
>          native = NativeFSLockFactory - uses OS native file locking.
>                   Do not use when multiple solr webapps in the same
>                   JVM are attempting to share a single index.
>          simple = SimpleFSLockFactory  - uses a plain file for locking
>          Defaults: 'native' is default for Solr3.6 and later, otherwise
>                    'simple' is the default
>          More details on the nuances of each LockFactory...
>          http://wiki.apache.org/lucene-java/AvailableLockFactories
>     -->
>     <lockType>${solr.lock.type:native}</lockType>
>     <!-- Lucene Infostream
>        
>          To aid in advanced debugging, Lucene provides an "InfoStream"
>          of detailed information when indexing.
>          Setting the value to true will instruct the underlying Lucene
>          IndexWriter to write its info stream to solr's log. By default,
>          this is enabled here, and controlled through log4j.properties.
>       -->
>      <infoStream>true</infoStream>
>   </indexConfig>
>   <!-- JMX
>        
>        This example enables JMX if and only if an existing MBeanServer
>        is found, use this if you want to configure JMX through JVM
>        parameters. Remove this to disable exposing Solr configuration
>        and statistics to JMX.
>        For more details see http://wiki.apache.org/solr/SolrJmx
>     -->
>   <jmx />
>   <!-- If you want to connect to a particular server, specify the
>        agentId 
>     -->
>   <!-- <jmx agentId="myAgent" /> -->
>   <!-- If you want to start a new MBeanServer, specify the serviceUrl -->
>   <!-- <jmx serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr"/>
>     -->
>   <!-- The default high-performance update handler -->
>   <updateHandler class="solr.DirectUpdateHandler2">
>     <!-- Enables a transaction log, used for real-time get, durability, and
>          and solr cloud replica recovery.  The log can grow as big as
>          uncommitted changes to the index, so use of a hard autoCommit
>          is recommended (see below).
>          "dir" - the target directory for transaction logs, defaults to the
>                 solr data directory.
>          "numVersionBuckets" - sets the number of buckets used to keep
>                 track of max version values when checking for re-ordered
>                 updates; increase this value to reduce the cost of
>                 synchronizing access to version buckets during high-volume
>                 indexing, this requires 8 bytes (long) * numVersionBuckets
>                 of heap space per Solr core.
>     -->
>     <updateLog>
>       <str name="dir">${solr.ulog.dir:}</str>
>       <int name="numVersionBuckets">${solr.ulog.numVersionBuckets:65536}</int>
>     </updateLog>
>  
>     <!-- AutoCommit
>          Perform a hard commit automatically under certain conditions.
>          Instead of enabling autoCommit, consider using "commitWithin"
>          when adding documents. 
>          http://wiki.apache.org/solr/UpdateXmlMessages
>          maxDocs - Maximum number of documents to add since the last
>                    commit before automatically triggering a new commit.
>          maxTime - Maximum amount of time in ms that is allowed to pass
>                    since a document was added before automatically
>                    triggering a new commit. 
>          openSearcher - if false, the commit causes recent index changes
>            to be flushed to stable storage, but does not cause a new
>            searcher to be opened to make those changes visible.
>          If the updateLog is enabled, then it's highly recommended to
>          have some sort of hard autoCommit to limit the log size.
>       -->
>      <autoCommit> 
>        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> 
>        <openSearcher>false</openSearcher> 
>      </autoCommit>
>     <!-- softAutoCommit is like autoCommit except it causes a
>          'soft' commit which only ensures that changes are visible
>          but does not ensure that data is synced to disk.  This is
>          faster and more near-realtime friendly than a hard commit.
>       -->
>      <autoSoftCommit> 
>        <maxTime>10</maxTime> 
>      </autoSoftCommit>
>   </updateHandler>
>   
>   <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>        Query section - these settings control query time things like caches
>        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
> -->
>   <query>
>     <!-- Max Boolean Clauses
>          Maximum number of clauses in each BooleanQuery,  an exception
>          is thrown if exceeded.
>          ** WARNING **
>          
>          This option actually modifies a global Lucene property that
>          will affect all SolrCores.  If multiple solrconfig.xml files
>          disagree on this property, the value at any given moment will
>          be based on the last SolrCore to be initialized.
>          
>       -->
>     <maxBooleanClauses>1024</maxBooleanClauses>
>     <!-- Solr Internal Query Caches
>          There are two implementations of cache available for Solr,
>          LRUCache, based on a synchronized LinkedHashMap, and
>          FastLRUCache, based on a ConcurrentHashMap.  
>          FastLRUCache has faster gets and slower puts in single
>          threaded operation and thus is generally faster than LRUCache
>          when the hit ratio of the cache is high (> 75%), and may be
>          faster under other scenarios on multi-cpu systems.
>     -->
>     <!-- Filter Cache
>          Cache used by SolrIndexSearcher for filters (DocSets),
>          unordered sets of *all* documents that match a query.  When a
>          new searcher is opened, its caches may be prepopulated or
>          "autowarmed" using data from caches in the old searcher.
>          autowarmCount is the number of items to prepopulate.  For
>          LRUCache, the autowarmed items will be the most recently
>          accessed items.
>          Parameters:
>            class - the SolrCache implementation LRUCache or
>                (LRUCache or FastLRUCache)
>            size - the maximum number of entries in the cache
>            initialSize - the initial capacity (number of entries) of
>                the cache.  (see java.util.HashMap)
>            autowarmCount - the number of entries to prepopulate from
>                and old cache.  
>       -->
>     <filterCache class="solr.FastLRUCache"
>                  size="512"
>                  initialSize="512"
>                  autowarmCount="0"/>
>     <!-- Query Result Cache
>         Caches results of searches - ordered lists of document ids
>         (DocList) based on a query, a sort, and the range of documents 
> requested.
>         Additional supported parameter by LRUCache:
>            maxRamMB - the maximum amount of RAM (in MB) that this cache is 
> allowed
>                       to occupy
>      -->
>     <queryResultCache class="solr.LRUCache"
>                      size="512"
>                      initialSize="512"
>                      autowarmCount="0"/>
>    
>     <!-- Document Cache
>          Caches Lucene Document objects (the stored fields for each
>          document).  Since Lucene internal document ids are transient,
>          this cache will not be autowarmed.  
>       -->
>     <documentCache class="solr.LRUCache"
>                    size="512"
>                    initialSize="512"
>                    autowarmCount="0"/>
>     
>     <!-- custom cache currently used by block join --> 
>     <cache name="perSegFilter"
>       class="solr.search.LRUCache"
>       size="10"
>       initialSize="0"
>       autowarmCount="10"
>       regenerator="solr.NoOpRegenerator" />
>     <!-- Lazy Field Loading
>          If true, stored fields that are not requested will be loaded
>          lazily.  This can result in a significant speed improvement
>          if the usual case is to not load all stored fields,
>          especially if the skipped fields are large compressed text
>          fields.
>     -->
>     <enableLazyFieldLoading>true</enableLazyFieldLoading>
>    <!-- Result Window Size
>         An optimization for use with the queryResultCache.  When a search
>         is requested, a superset of the requested number of document ids
>         are collected.  For example, if a search for a particular query
>         requests matching documents 10 through 19, and queryWindowSize is 50,
>         then documents 0 through 49 will be collected and cached.  Any further
>         requests in that range can be satisfied via the cache.  
>      -->
>    <queryResultWindowSize>20</queryResultWindowSize>
>    <!-- Maximum number of documents to cache for any entry in the
>         queryResultCache. 
>      -->
>    <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
>     <!-- Use Cold Searcher
>          If a search request comes in and there is no current
>          registered searcher, then immediately register the still
>          warming searcher and use it.  If "false" then all requests
>          will block until the first searcher is done warming.
>       -->
>     <useColdSearcher>false</useColdSearcher>
>     <!-- Max Warming Searchers
>          
>          Maximum number of searchers that may be warming in the
>          background concurrently.  An error is returned if this limit
>          is exceeded.
>          Recommend values of 1-2 for read-only slaves, higher for
>          masters w/o cache warming.
>       -->
>     <maxWarmingSearchers>2</maxWarmingSearchers>
>   </query>
>   <!-- Request Dispatcher
>        This section contains instructions for how the SolrDispatchFilter
>        should behave when processing requests for this SolrCore.
>        handleSelect is a legacy option that affects the behavior of requests
>        such as /select?qt=XXX
>        handleSelect="true" will cause the SolrDispatchFilter to process
>        the request and dispatch the query to a handler specified by the 
>        "qt" param, assuming "/select" isn't already registered.
>        handleSelect="false" will cause the SolrDispatchFilter to
>        ignore "/select" requests, resulting in a 404 unless a handler
>        is explicitly registered with the name "/select"
>        handleSelect="true" is not recommended for new users, but is the 
> default
>        for backwards compatibility
>     -->
>   <requestDispatcher handleSelect="false" >
>     <!-- Request Parsing
>          These settings indicate how Solr Requests may be parsed, and
>          what restrictions may be placed on the ContentStreams from
>          those requests
>          enableRemoteStreaming - enables use of the stream.file
>          and stream.url parameters for specifying remote streams.
>          multipartUploadLimitInKB - specifies the max size (in KiB) of
>          Multipart File Uploads that Solr will allow in a Request.
>          
>          formdataUploadLimitInKB - specifies the max size (in KiB) of
>          form data (application/x-www-form-urlencoded) sent via
>          POST. You can use POST to pass request parameters not
>          fitting into the URL.
>          
>          addHttpRequestToContext - if set to true, it will instruct
>          the requestParsers to include the original HttpServletRequest
>          object in the context map of the SolrQueryRequest under the 
>          key "httpRequest". It will not be used by any of the existing
>          Solr components, but may be useful when developing custom 
>          plugins.
>          
>          *** WARNING ***
>          The settings below authorize Solr to fetch remote files, You
>          should make sure your system has some authentication before
>          using enableRemoteStreaming="true"
>       --> 
>     <requestParsers enableRemoteStreaming="true" 
>                     multipartUploadLimitInKB="2048000"
>                     formdataUploadLimitInKB="2048"
>                     addHttpRequestToContext="false"/>
>     <!-- HTTP Caching
>          Set HTTP caching related parameters (for proxy caches and clients).
>          The options below instruct Solr not to output any HTTP Caching
>          related headers
>       -->
>     <httpCaching never304="true" />
>   </requestDispatcher>
>   <!-- Request Handlers 
>        http://wiki.apache.org/solr/SolrRequestHandler
>        Incoming queries will be dispatched to a specific handler by name
>        based on the path specified in the request.
>        Legacy behavior: If the request path uses "/select" but no Request
>        Handler has that name, and if handleSelect="true" has been specified in
>        the requestDispatcher, then the Request Handler is dispatched based on
>        the qt parameter.  Handlers without a leading '/' are accessed this way
>        like so: http://host/app/[core/]select?qt=name  If no qt is
>        given, then the requestHandler that declares default="true" will be
>        used or the one named "standard".
>        If a Request Handler is declared with startup="lazy", then it will
>        not be initialized until the first request that uses it.
>     -->
>   <!-- SearchHandler
>        http://wiki.apache.org/solr/SearchHandler
>        For processing Search Queries, the primary Request Handler
>        provided with Solr is "SearchHandler" It delegates to a sequent
>        of SearchComponents (see below) and supports distributed
>        queries across multiple shards
>     -->
>   <requestHandler name="/select" class="solr.SearchHandler">
>     <!-- default values for query parameters can be specified, these
>          will be overridden by parameters in the request
>       -->
>      <lst name="defaults">
>        <str name="echoParams">explicit</str>
>        <int name="rows">10</int>
>      </lst>
>     </requestHandler>
>   <!-- A request handler that returns indented JSON by default -->
>   <requestHandler name="/query" class="solr.SearchHandler">
>      <lst name="defaults">
>        <str name="echoParams">explicit</str>
>        <str name="wt">json</str>
>        <str name="indent">true</str>
>        <str name="df">contenttype</str>
>      </lst>
>   </requestHandler>
>   <!--
>     The export request handler is used to export full sorted result sets.
>     Do not change these defaults.
>   -->
>   <requestHandler name="/export" class="solr.SearchHandler">
>     <lst name="invariants">
>       <str name="rq">{!xport}</str>
>       <str name="wt">xsort</str>
>       <str name="distrib">false</str>
>     </lst>
>     <arr name="components">
>       <str>query</str>
>     </arr>
>   </requestHandler>
>   <initParams path="/update/**,/query,/select,/tvrh,/elevate,/spell">
>     <lst name="defaults">
>       <str name="df">contenttype</str>
>     </lst>
>   </initParams>
>   <!-- Field Analysis Request Handler
>        RequestHandler that provides much the same functionality as
>        analysis.jsp. Provides the ability to specify multiple field
>        types and field names in the same request and outputs
>        index-time and query-time analysis for each of them.
>        Request parameters are:
>        analysis.fieldname - field name whose analyzers are to be used
>        analysis.fieldtype - field type whose analyzers are to be used
>        analysis.fieldvalue - text for index-time analysis
>        q (or analysis.q) - text for query time analysis
>        analysis.showmatch (true|false) - When set to true and when
>            query analysis is performed, the produced tokens of the
>            field value analysis will be marked as "matched" for every
>            token that is produces by the query analysis
>    -->
>   <requestHandler name="/analysis/field" 
>                   startup="lazy"
>                   class="solr.FieldAnalysisRequestHandler" />
>   <!-- Document Analysis Handler
>        http://wiki.apache.org/solr/AnalysisRequestHandler
>        An analysis handler that provides a breakdown of the analysis
>        process of provided documents. This handler expects a (single)
>        content stream with the following format:
>        <docs>
>          <doc>
>            <field name="id">1</field>
>            <field name="name">The Name</field>
>            <field name="text">The Text Value</field>
>          </doc>
>          <doc>...</doc>
>          <doc>...</doc>
>          ...
>        </docs>
>     Note: Each document must contain a field which serves as the
>     unique key. This key is used in the returned response to associate
>     an analysis breakdown to the analyzed document.
>     Like the FieldAnalysisRequestHandler, this handler also supports
>     query analysis by sending either an "analysis.query" or "q"
>     request parameter that holds the query text to be analyzed. It
>     also supports the "analysis.showmatch" parameter which when set to
>     true, all field tokens that match the query tokens will be marked
>     as a "match". 
>   -->
>   <requestHandler name="/analysis/document" 
>                   class="solr.DocumentAnalysisRequestHandler" 
>                   startup="lazy" />
>   <!-- Echo the request contents back to the client -->
>   <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" >
>     <lst name="defaults">
>      <str name="echoParams">explicit</str> 
>      <str name="echoHandler">true</str>
>     </lst>
>   </requestHandler>
>   
>   <!-- Search Components
>        Search components are registered to SolrCore and used by 
>        instances of SearchHandler (which can access them by name)
>        
>        By default, the following components are available:
>        
>        <searchComponent name="query"     class="solr.QueryComponent" />
>        <searchComponent name="facet"     class="solr.FacetComponent" />
>        <searchComponent name="mlt"       class="solr.MoreLikeThisComponent" />
>        <searchComponent name="highlight" class="solr.HighlightComponent" />
>        <searchComponent name="stats"     class="solr.StatsComponent" />
>        <searchComponent name="debug"     class="solr.DebugComponent" />
>        
>      -->
>   <!-- Terms Component
>        http://wiki.apache.org/solr/TermsComponent
>        A component to return terms and document frequency of those
>        terms
>     -->
>   <searchComponent name="terms" class="solr.TermsComponent"/>
>   <!-- A request handler for demonstrating the terms component -->
>   <requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
>      <lst name="defaults">
>       <bool name="terms">true</bool>
>       <bool name="distrib">false</bool>
>     </lst>     
>     <arr name="components">
>       <str>terms</str>
>     </arr>
>   </requestHandler>
>   <!-- Legacy config for the admin interface -->
>   <admin>
>     <defaultQuery>*:*</defaultQuery>
>   </admin>
> </config>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Resolved] (SOLR-9416) Exception writing document

Reply via email to