Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts
OK - I figured out the logging.  Here is the logging output plus the 
console output and the stack trace:


[main] INFO org.apache.solr.core.SolrResourceLoader - new 
SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to 
classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to 
classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container 
configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 
2050551931

db
/Users/carlroberts/dev/solr-4.10.3/
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into 
CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]


[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating 
UpdateShardHandler HTTP client with params: 
socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is 
org.slf4j.impl.SimpleLoggerFactory

[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for 
directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] WARN 
org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory 
to add to classloader: ../../../contrib/extraction/lib (resolved as: 
/Users/carlroberts/dev/solr-4.10.3/db/../../../contrib/extraction/lib).
[coreLoadExecutor-5-thread-1] WARN 
org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory 
to add to classloader: ../../../dist/ (resolved as: 
/Users/carlroberts/dev/solr-4.10.3/db/../../../dist).
[coreLoadExecutor-5-thread-1] WARN 
org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory 
to add to classloader: ../../../contrib/clustering/lib/ (resolved as: 
/Users/carlroberts/dev/solr-4.10.3/db/../../../contrib/clustering/lib).
[coreLoadExecutor-5-thread-1] WARN 
org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory 
to add to classloader: ../../../dist/ (resolved as: 
/Users/carlroberts/dev/solr-4.10.3/db/../../../dist).
[coreLoadExecutor-5-thread-1] WARN 
org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory 
to add to classloader: ../../../contrib/langid/lib/ (resolved as: 
/Users/carlroberts/dev/solr-4.10.3/db/../../../contrib/langid/lib).
[coreLoadExecutor-5-thread-1] WARN 
org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory 
to add to classloader: ../../../dist/ (resolved as: 
/Users/carlroberts/dev/solr-4.10.3/db/../../../dist).
[coreLoadExecutor-5-thread-1] WARN 
org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory 
to add to classloader: ../../../contrib/velocity/lib (resolved as: 
/Users/carlroberts/dev/solr-4.10.3/db/../../../contrib/velocity/lib).
[coreLoadExecutor-5-thread-1] WARN 
org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory 
to add to classloader: ../../../dist/ (resolved as: 
/Users/carlroberts/dev/solr-4.10.3/db/../../../dist).
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.update.SolrIndexConfig - IndexWriter infoStream solr 
logging is enabled
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Using Lucene MatchVersion: 4.10.3
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.Config - Loaded 
SolrConfig: solrconfig.xml
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.schema.IndexSchema - 
Reading Solr Schema from 
/Users/carlroberts/dev/solr-4.10.3/db/conf/schema.xml

Re: Errors using the Embedded Solr Server

2015-01-21 Thread Shawn Heisey
On 1/21/2015 9:56 AM, Carl Roberts wrote:
 BTW - I don't know if this will also help, but here is a screenshot
 of my classpath in Eclipse.

The URL in the slf4j error message does describe the problem with
logging, but if you know nothing about slf4j, it probably won't help you
much.

Make sure you're including all the jars from example/lib/ext in the
download in your project's lib directory, as well as the
log4j.properties file from the resources directory.  You will probably
need to edit the log4j.properties file to change the location of the
logfile.
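
As a concrete sketch, a minimal log4j.properties along these lines should 
send everything at INFO and above to a single logfile (the path is only an 
example and needs adjusting):

# route all logging at INFO and above to one file
log4j.rootLogger=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=/var/log/solr/solr.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c - %m%n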

Thanks,
Shawn



Solr Recovery process

2015-01-21 Thread Nishanth S
Hello Everyone,

I am hitting a few issues with Solr replicas going into recovery and then
doing a full index copy. I am trying to understand the Solr recovery
process. I have read a few blogs on this and saw that when the leader
notifies a replica to recover (in my case it is due to connection resets),
it will try to do a peer sync first, and if the missed updates are more
than 100 it will do a full index copy from the leader. I am trying to
understand what peer sync is and where the tlog comes into the picture.
Are tlogs replayed only during server restart? Can someone help me with this?

Thanks,
Nishanth


Re: AW: AW: AW: transactions@Solr(J)

2015-01-21 Thread Shawn Heisey
On 1/21/2015 9:15 AM, Clemens Wyss DEV wrote:
 What I meant is:
 If I do SolrServer#rollback after 11 documents were added, will only 1 or
 all 11 documents that were added in the SolrServer transaction/context be
 rolled back?

If autoCommit is set to 10 docs and openSearcher is true, it would roll
back one document, assuming that the 11th document didn't make it into
the index before the commit actually started.  I'm not sure if the
autoCommit settings are perfectly atomic, or if there would be enough of
a time gap to allow a few more documents to make it in.  If you added
the documents one at a time, I could be sure that the rollback would be
one document.

If openSearcher is false, I'm not sure whether it would do one or 11.  I
just don't know enough about the underlying API.
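
For illustration, a minimal SolrJ sketch of the add-then-rollback sequence
under discussion (the URL and field values are made up; rollback withdraws
all uncommitted changes on the core since the last commit):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class RollbackSketch {
    public static void main(String[] args) throws Exception {
        // hypothetical core URL
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/db");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "id11");
        server.add(doc);    // pending until some commit (manual or auto) fires
        server.rollback();  // discards whatever is still uncommitted
        server.shutdown();
    }
}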

Thanks,
Shawn



Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts

I had to hardcode the path in solrconfig.xml from this:

${solr.install.dir:}

to this:

 /Users/carlroberts/dev/solr-4.10.3/


to avoid the classloader warnings, but I still get the same error. I am 
not sure where the ${solr.install.dir:} value gets pulled from, but 
apparently that is not working (see the note after this output on how 
that property is resolved).  Here is the new output:


[main] INFO org.apache.solr.core.SolrResourceLoader - new 
SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to 
classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to 
classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container 
configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 
1023143764
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into 
CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]

db
/Users/carlroberts/dev/solr-4.10.3/
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating 
UpdateShardHandler HTTP client with params: 
socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is 
org.slf4j.impl.SimpleLoggerFactory

[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for 
directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/dom4j-1.6.1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/fontbox-1.8.4.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/icu4j-53.1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/isoparser-1.0-RC-1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
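
A note on that placeholder: ${solr.install.dir:} is Solr's standard 
${property:default} substitution syntax. The value is read from a Java 
system property named solr.install.dir, and the text after the colon 
(empty here) is the default used when the property is unset. So instead of 
hardcoding the path in solrconfig.xml, the property could be supplied on 
the JVM command line when launching the embedded server, for example:

java -Dsolr.install.dir=/Users/carlroberts/dev/solr-4.10.3 solr.Test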

Re: Errors using the Embedded Solr Server

2015-01-21 Thread Alan Woodward
Ah, OK, you need to include a logging jar in your classpath - the log4j and 
slf4j-log4j jars in the solr distribution will help here.  Once you've got some 
logging set up, then you should be able to work out what's going wrong!

Alan Woodward
www.flax.co.uk


On 21 Jan 2015, at 16:53, Carl Roberts wrote:

 So far I have not been able to get the logging to work - here is what I get 
 in the console prior to the exception:
 
 SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
 SLF4J: Defaulting to no-operation (NOP) logger implementation
 SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
 details.
 db
 /Users/carlroberts/dev/solr-4.10.3/
 false
 {}
 []
 /Users/carlroberts/dev/solr-4.10.3/
 
 
 On 1/21/15, 11:50 AM, Alan Woodward wrote:
 That certainly looks like it ought to work.  Is there log output that you 
 could show us as well?
 
 Alan Woodward
 www.flax.co.uk
 
 
 On 21 Jan 2015, at 16:09, Carl Roberts wrote:
 
 Hi,
 
 I have downloaded the code and documentation for Solr version 4.10.3.
 
 I am trying to follow the SolrJ Wiki guide and I am running into errors.  The 
 latest error is this one:
 
 Exception in thread "main" org.apache.solr.common.SolrException: No such 
 core: db
	at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
	at solr.Test.main(Test.java:39)
 
 My code is this:
 
 package solr;
 
 import java.io.File;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
 
 import org.apache.solr.client.solrj.SolrServerException;
 import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
 import org.apache.solr.common.SolrInputDocument;
 import org.apache.solr.core.CoreContainer;
 import org.apache.solr.core.SolrCore;
 
 
 public class Test {
     public static void main(String[] args) {
         CoreContainer container = new CoreContainer("/Users/carlroberts/dev/solr-4.10.3");
         System.out.println(container.getDefaultCoreName());
         System.out.println(container.getSolrHome());
         container.load();
         System.out.println(container.isLoaded("db"));
         System.out.println(container.getCoreInitFailures());
         Collection<SolrCore> cores = container.getCores();
         System.out.println(cores);
         EmbeddedSolrServer server = new EmbeddedSolrServer( container, "db" );
         SolrInputDocument doc1 = new SolrInputDocument();
         doc1.addField( "id", "id1", 1.0f );
         doc1.addField( "name", "doc1", 1.0f );
         doc1.addField( "price", 10 );
         SolrInputDocument doc2 = new SolrInputDocument();
         doc2.addField( "id", "id2", 1.0f );
         doc2.addField( "name", "doc2", 1.0f );
         doc2.addField( "price", 20 );
         Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
         docs.add( doc1 );
         docs.add( doc2 );
         try {
             server.add( docs );
             server.commit();
             server.deleteByQuery( "*:*" );
         } catch (IOException e) {
             e.printStackTrace();
         } catch (SolrServerException e) {
             e.printStackTrace();
         }
     }
 }
 
 
 My solr.xml file is this:
 
 <?xml version="1.0" encoding="UTF-8" ?>
 <!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at
 
 http://www.apache.org/licenses/LICENSE-2.0
 
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 -->
 
 <!--
   This is an example of a simple solr.xml file for configuring one or
   more Solr Cores, as well as allowing Cores to be added, removed, and
   reloaded via HTTP requests.
 
   More information about options available in this configuration file,
   and Solr Core administration can be found online:
   http://wiki.apache.org/solr/CoreAdmin
 -->
 
 <solr>
   <cores adminPath="/admin/cores" defaultCoreName="db">
     <core default="true" instanceDir="db/" name="db"/>
   </cores>
 </solr>
 
 And my db/conf directory was copied from example/solr/collection/conf 
 directory and it contains the solrconfig.xml file and schema.xml file.
 
 I have noticed that the documentation that shows how to use the 
 EmbeddedSolrServer is outdated as it indicates I should use 
 

Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts

Hi,

Could there be a bug in the EmbeddedSolrServer that is causing this?

Is it still supported in version 4.10.3?

If it is, can someone please provide me assistance with this?

Regards,

Joe

On 1/21/15, 12:18 PM, Carl Roberts wrote:

I had to hardcode the path in solrconfig.xml from this:

${solr.install.dir:}

to this:

 /Users/carlroberts/dev/solr-4.10.3/


to avoid the classloader warnings, but I still get the same error. I 
am not sure where the ${solr.install.dir:} value gets pulled from, but 
apparently that is not working.  Here is the new output:


[main] INFO org.apache.solr.core.SolrResourceLoader - new 
SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' 
to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to 
classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' 
to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container 
configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 
1023143764
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into 
CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]

db
/Users/carlroberts/dev/solr-4.10.3/
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating 
UpdateShardHandler HTTP client with params: 
socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is 
org.slf4j.impl.SimpleLoggerFactory

[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for 
directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/dom4j-1.6.1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/fontbox-1.8.4.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/icu4j-53.1.jar' 
to 

Solr 4.10.3 start up issue

2015-01-21 Thread Darren Spehr
Hi everyone -

I posted a question on stackoverflow but in hindsight this would have been
a better place to start. Below is the link.

Basically I can't get the example working when using an external ZK cluster
and auto-core discovery. Solr 4.10.1 works fine, but the newest release
never gets new nodes into the active state. There are no errors or
warnings, and compared to the log output of 4.10.1, the difference is that
nodes never make it to leader election.

Here is the stackoverflow question, along with the full log output:
http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs

Any help and guidance would be appreciated. Thanks!

-- 
Darren


Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-21 Thread Carl Roberts

Hi,

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes, with several elements 
in each node that I have to index, so I was planning to parse the XML 
with StAX, extract the data from each node, and add it to Solr.  There 
will always be only one file to start with, and then a second file as 
the RSS feed supplies updates.  I want to return certain fields of each 
node when I search certain fields of the same node.  Is Solr overkill in 
this case?  Should I just use Lucene instead?
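
For concreteness, a minimal sketch of that plan with StAX plus SolrJ over 
HTTP (the URL, the "item" element name, and the field names are all made 
up for illustration):

import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class FeedIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/db");
        XMLStreamReader xml = XMLInputFactory.newInstance()
                .createXMLStreamReader(new FileInputStream("feed.xml"));
        SolrInputDocument doc = null;
        while (xml.hasNext()) {
            int event = xml.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                if ("item".equals(xml.getLocalName())) {
                    doc = new SolrInputDocument();   // one feed node = one Solr doc
                } else if (doc != null && "title".equals(xml.getLocalName())) {
                    doc.addField("title", xml.getElementText());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && "item".equals(xml.getLocalName()) && doc != null) {
                server.add(doc);                     // index the completed node
                doc = null;
            }
        }
        server.commit();
        server.shutdown();
    }
}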


Regards,

Joe


Ignore whitespace, underscore using KeywordTokenizer... EdgeNGramFilter

2015-01-21 Thread Vishal Swaroop
I am trying to implement a type-ahead suggestion for a single field which
should ignore whitespace, underscores, and special characters in autosuggest.

It works as suggested by Alex using KeywordTokenizerFactory, but how do I
ignore whitespace, underscores, ...?

Example itemName data can be:
"ABC E12": if the user types "ABCE", the suggestion should be "ABC E12"
"ABCE_12": if the user types "ABCE1", the suggestion should be "ABCE_12"

Schema.xml:

<field name="itemName" type="text_general_edge_ngram" indexed="true"
stored="true" multiValued="false" />

<fieldType name="text_general_edge_ngram" class="solr.TextField"
positionIncrementGap="100">
   <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
              maxGramSize="15" side="front"/>
   </analyzer>
   <analyzer type="query">
      <tokenizer class="solr.LowerCaseTokenizerFactory"/>
   </analyzer>
</fieldType>
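
One way to get that behavior (not from the thread, just a sketch) is to 
strip whitespace and underscores with a pattern-replace char filter before 
the tokenizer, on both the index and query analyzers, so that "ABC E12" and 
"ABCE_12" both normalize to "abce12":

<fieldType name="text_general_edge_ngram" class="solr.TextField"
positionIncrementGap="100">
   <analyzer type="index">
      <!-- remove whitespace and underscores before tokenizing -->
      <charFilter class="solr.PatternReplaceCharFilterFactory"
                  pattern="[\s_]+" replacement=""/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
              maxGramSize="15" side="front"/>
   </analyzer>
   <analyzer type="query">
      <charFilter class="solr.PatternReplaceCharFilterFactory"
                  pattern="[\s_]+" replacement=""/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
</fieldType>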


Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts
Already did.  And the logging gets me no closer to fixing the issue. 
Here is the logging.


[main] INFO org.apache.solr.core.SolrResourceLoader - new 
SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to 
classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to 
classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container 
configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 
1727098510
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into 
CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating 
UpdateShardHandler HTTP client with params: 
socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is 
org.slf4j.impl.SimpleLoggerFactory

[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for 
directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/dom4j-1.6.1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/fontbox-1.8.4.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/icu4j-53.1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/isoparser-1.0-RC-1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/jdom-1.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 

Re: How to return custom collector info

2015-01-21 Thread tedsolr
I was confused because I couldn't believe my jars might be out of sync. But
of course they were. I had to create a new eclipse project to sort it out,
but that exception has disappeared. Sorry for the confusing post.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-return-custom-collector-info-tp4180502p4180877.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How much maximum data can we hard commit in Solr?

2015-01-21 Thread Shawn Heisey
On 1/21/2015 9:13 AM, Nitin Solanki wrote:
 Thanks. Great Explanation.. One more thing I want to ask. Which is best
 doing only hard commit or both hard and soft commit? I want to index 21 GB
 of data.

My recommendations for the autoCommit settings are on that URL that I
linked - maxTime set to five minutes with openSearcher set to false.  It
also has a maxDocs setting ... you would need to come up with a
reasonable setting for that, or just leave it out.  That takes care of
all hard commit requirements as they relate to the transaction log.

Aside from that, I would recommend using soft commits (either explicit
or autoSoftCommit) for document visibility.  Hard commits with
opensearcher=true work fine, but soft commits have a *little* bit less
impact.  It would be up to you to decide when and how often to do that,
but I wouldn't do it more frequently than once a minute unless you can
take steps to make those soft commits happen REALLY fast.  Making
commits happen faster is a separate discussion.
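
Expressed as solrconfig.xml, those recommendations look roughly like this 
(the soft commit interval is the once-a-minute figure mentioned above):

<autoCommit>
  <maxTime>300000</maxTime>          <!-- five minutes -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>60000</maxTime>           <!-- document visibility -->
</autoSoftCommit>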

Further reading about commits and the transaction log:

http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Thanks,
Shawn



Re: How to index data from multiple data source

2015-01-21 Thread Diego Pino
Hi Yusniel,

Solr manages documents as a whole. This means updating an existing document 
means replacing it. So you should/could index metadata and full text in one 
step, as one Solr document under one unique ID - that would be the simplest 
case. You could also use nested child documents with block joins (depending 
on what version of Solr you are using; more info here: 
http://blog.griddynamics.com/2013/09/solr-block-join-support.html), but in my 
opinion that would be overkill. We also manage a kind of semantic/linked-data 
mimic using additional fields (named by real ontology predicate/property 
names) to join documents that are related; see 
https://wiki.apache.org/solr/Join. So you could add the full text as an 
additional document with its own ID and fill a field of that document with 
the ID of the parent metadata document. Then at query time you can join them. 
Joins in Solr always return the joined (TO) documents, not both - it's not 
like a SQL join, more like an inner query - so we experimented with self 
joins (the field holding the parent document ID also holds the document's 
own ID), but as you can understand this is in no way optimal.
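
As a concrete (invented) example of that join: if each full-text document 
carries a parent_id field holding the id of its metadata document, then

q={!join from=parent_id to=id}fulltext:solr

returns the metadata documents whose attached full text matches.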

Related: We are using a Digital Objects Repository (Fedora Commons + Islandora) 
to achieve exactly what you want to do. Our PDF files, and also many other 
types of data and metadata, are ingested as objects inside the repository, 
including technical metadata, MODS, DC, binary stream and full text. Then this 
whole object (as FOXML) goes through an XSLT transformation and into Solr. If 
you are interested you can browse Islandora's Google group, 
https://groups.google.com/forum/#!forum/islandora, and visit Islandora's wiki, 
https://wiki.duraspace.org/display/ISLANDORA714/Islandora. There is much 
documentation under the fedoragsearch module, which does the real indexing. 
You can see our schemas and Solr config there.

Feel free to write me if you need/want more data.

Cheers

Diego Pino Navarro
Krayon Media
Pedro de Valdivia 575
Pucón - Chile
F:+56-45-2442469




On Jan 21, 2015, at 2:43 AM, Yusniel Hidalgo Delgado yhdelg...@uci.cu wrote:

 
 
 Dear Solr community,

 I am diving into Solr recently and I need help with the following usage 
 scenario. I am working on a project to extract and search bibliographic 
 metadata from PDF files. First, my PDF files are processed to extract 
 bibliographic metadata such as title, authors, affiliations, keywords and 
 abstract. These metadata are stored in a relational database and then 
 indexed in Solr via DIH; however, I also need to index the full text of 
 each PDF and keep the same ID between the indexed metadata and the indexed 
 full text in Solr. How do I do that? How do I configure solrconfig.xml and 
 schema.xml to do it?

 Thanks in advance.

 Best regards

 Yusniel Hidalgo Delgado 
 Semantic Web Research Group 
 University of Informatics Sciences 
 http://gws-uci.blogspot.com/ 
 Havana, Cuba

 ---
 XII Anniversary of the founding of the University of Informatics 
 Sciences. 12 years of history alongside Fidel. December 12, 2014.



Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts

Hi,

I have downloaded the code and documentation for Solr version 4.10.3.

I am trying to follow the SolrJ Wiki guide and I am running into errors.  
The latest error is this one:


Exception in thread "main" org.apache.solr.common.SolrException: No such 
core: db
	at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
	at solr.Test.main(Test.java:39)

My code is this:

package solr;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.SolrCore;


public class Test {
    public static void main(String[] args) {
        CoreContainer container = new CoreContainer("/Users/carlroberts/dev/solr-4.10.3");
        System.out.println(container.getDefaultCoreName());
        System.out.println(container.getSolrHome());
        container.load();
        System.out.println(container.isLoaded("db"));
        System.out.println(container.getCoreInitFailures());
        Collection<SolrCore> cores = container.getCores();
        System.out.println(cores);
        EmbeddedSolrServer server = new EmbeddedSolrServer( container, "db" );
        SolrInputDocument doc1 = new SolrInputDocument();
        doc1.addField( "id", "id1", 1.0f );
        doc1.addField( "name", "doc1", 1.0f );
        doc1.addField( "price", 10 );
        SolrInputDocument doc2 = new SolrInputDocument();
        doc2.addField( "id", "id2", 1.0f );
        doc2.addField( "name", "doc2", 1.0f );
        doc2.addField( "price", 20 );
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        docs.add( doc1 );
        docs.add( doc2 );
        try {
            server.add( docs );
            server.commit();
            server.deleteByQuery( "*:*" );
        } catch (IOException e) {
            e.printStackTrace();
        } catch (SolrServerException e) {
            e.printStackTrace();
        }
    }
}


My solr.xml file is this:

<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<!--
   This is an example of a simple solr.xml file for configuring one or
   more Solr Cores, as well as allowing Cores to be added, removed, and
   reloaded via HTTP requests.

   More information about options available in this configuration file,
   and Solr Core administration can be found online:
   http://wiki.apache.org/solr/CoreAdmin
-->

<solr>
  <cores adminPath="/admin/cores" defaultCoreName="db">
    <core default="true" instanceDir="db/" name="db"/>
  </cores>
</solr>

And my db/conf directory was copied from example/solr/collection/conf 
directory and it contains the solrconfig.xml file and schema.xml file.


I have noticed that the documentation that shows how to use the 
EmbeddedSolrServer is outdated, as it indicates I should use the 
CoreContainer.Initializer class, which doesn't exist, and 
container.load(path, file), which also doesn't exist.


At this point I have no idea why I am getting the "No such core" error. I 
have googled it and there seem to be tons of threads showing this error, 
but for different reasons, and I have tried all the suggested resolutions 
and get nowhere with this.


Can you please help?

Regards,

Joe


Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts
So far I have not been able to get the logging to work - here is what I 
get in the console prior to the exception:


SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for 
further details.

db
/Users/carlroberts/dev/solr-4.10.3/
false
{}
[]
/Users/carlroberts/dev/solr-4.10.3/


On 1/21/15, 11:50 AM, Alan Woodward wrote:

That certainly looks like it ought to work.  Is there log output that you could 
show us as well?

Alan Woodward
www.flax.co.uk


On 21 Jan 2015, at 16:09, Carl Roberts wrote:


Hi,

I have downloaded the code and documentation for Solr version 4.10.3.

I am trying to follow the SolrJ Wiki guide and I am running into errors.  The 
latest error is this one:

Exception in thread "main" org.apache.solr.common.SolrException: No such core: 
db
	at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
	at solr.Test.main(Test.java:39)

My code is this:

package solr;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.SolrCore;


public class Test {
    public static void main(String[] args) {
        CoreContainer container = new CoreContainer("/Users/carlroberts/dev/solr-4.10.3");
        System.out.println(container.getDefaultCoreName());
        System.out.println(container.getSolrHome());
        container.load();
        System.out.println(container.isLoaded("db"));
        System.out.println(container.getCoreInitFailures());
        Collection<SolrCore> cores = container.getCores();
        System.out.println(cores);
        EmbeddedSolrServer server = new EmbeddedSolrServer( container, "db" );
        SolrInputDocument doc1 = new SolrInputDocument();
        doc1.addField( "id", "id1", 1.0f );
        doc1.addField( "name", "doc1", 1.0f );
        doc1.addField( "price", 10 );
        SolrInputDocument doc2 = new SolrInputDocument();
        doc2.addField( "id", "id2", 1.0f );
        doc2.addField( "name", "doc2", 1.0f );
        doc2.addField( "price", 20 );
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        docs.add( doc1 );
        docs.add( doc2 );
        try {
            server.add( docs );
            server.commit();
            server.deleteByQuery( "*:*" );
        } catch (IOException e) {
            e.printStackTrace();
        } catch (SolrServerException e) {
            e.printStackTrace();
        }
    }
}


My solr.xml file is this:

<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<!--
   This is an example of a simple solr.xml file for configuring one or
   more Solr Cores, as well as allowing Cores to be added, removed, and
   reloaded via HTTP requests.

   More information about options available in this configuration file,
   and Solr Core administration can be found online:
   http://wiki.apache.org/solr/CoreAdmin
-->

<solr>
  <cores adminPath="/admin/cores" defaultCoreName="db">
    <core default="true" instanceDir="db/" name="db"/>
  </cores>
</solr>

And my db/conf directory was copied from example/solr/collection/conf directory 
and it contains the solrconfig.xml file and schema.xml file.

I have noticed that the documentation that shows how to use the 
EmbeddedSolrServer is outdated, as it indicates I should use the 
CoreContainer.Initializer class, which doesn't exist, and 
container.load(path, file), which also doesn't exist.

At this point I have no idea why I am getting the "No such core" error. I have 
googled it and there seem to be tons of threads showing this error, but for 
different reasons, and I have tried all the suggested resolutions and get 
nowhere with this.

Can you please help?

Regards,

Joe






Re: How much maximum data can we hard commit in Solr?

2015-01-21 Thread Nitin Solanki
Thanks, great explanation. One more thing I want to ask: which is best, 
doing only hard commits or both hard and soft commits? I want to index 21 GB 
of data.

On Wed, Jan 21, 2015 at 7:48 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 1/21/2015 6:01 AM, Nitin Solanki wrote:
  How much data can we hard commit in Solr without using soft commits?
  maxTime is 1000 in autoCommit
 
  Details explanation is on Stackoverflow
  
 http://stackoverflow.com/questions/28067853/how-much-maximum-data-can-we-hard-commit-in-solr
 

 The answer to the question you asked: All of it.

 I suspect you are actually trying to ask a different question.

 Some additional info, hopefully you can use it to answer what you'd
 really like to know:

 You could build your entire index with no commits and then issue a
 single hard commit and everything would work.  The problem with that
 approach is that if you have the updateLog turned on, then every single
 one of those documents will be reindexed from the transaction log at
 Solr startup - it could take a REALLY long time.

 http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup

 Hard commits are the only way to close a transaction log and open a new
 one.  Solr keeps enough transaction logs around so that it can re-index
 a minimum of 100 documents ... but it can't break the transaction logs
 into parts, so if everything is in one log, then that giant log will be
 replayed on startup.

 A maxTime of 1000 on autoCommit or autoSoftCommit is usually way too
 low.  We find that this setting is normally driven by unrealistic
 requirements from sales or marketing, who say that data must be
 available within one second of indexing.  It is extremely rare for this
 to be truly required.

 The autoCommit settings control automatic hard commits, and
 autoSoftCommit naturally controls automatic soft commits.  With a
 maxTime of 1000, you will be issuing a commit every single second while
 you index.  Commits are very resource-intensive operations, doing them
 once a second will keep your hardware VERY busy.  Normally a commit
 operation will take a lot longer than one second to complete, so if you
 are starting another one a second later, they will overlap, and that can
 cause a lot of problems.

 Thanks,
 Shawn




AW: AW: AW: transactions@Solr(J)

2015-01-21 Thread Clemens Wyss DEV
What I meant is:
If I do SolrServer#rollback after 11 documents were added, will only 1 or all 
11 documents that were added in the SolrServer transaction/context be rolled 
back?

-Ursprüngliche Nachricht-
Von: Shawn Heisey [mailto:apa...@elyograg.org] 
Gesendet: Mittwoch, 21. Januar 2015 15:24
An: solr-user@lucene.apache.org
Betreff: Re: AW: AW: transactions@Solr(J)

On 1/20/2015 11:42 PM, Clemens Wyss DEV wrote:
 But then what happens if:
 Autocommit is set to 10 docs
 and
 I add 11 docs and then decide (due to an exception?) to rollback.
 
 Will only one (i.e. the last added) document be rolled back?

The way I understand the low-level architecture, yes -- assuming that all 11 
documents actually got indexed.  If the exception happened because document 5 
was badly formed, only documents 1-4 will have been indexed, and in that 
case, all four of them would get rolled back.

Thanks,
Shawn



Re: MultiPhraseQuery:Rewrite to BooleanQuery

2015-01-21 Thread ku3ia
Tomoko Uchida wrote
 Hi,
 
 Strictly speaking, MultiPhraseQuery and BooleanQuery wrapping PhraseQuerys
 are not equal.
 
 For each query, Query.rewrite() returns a different object (with Lucene
 4.10.3).
 q1.rewrite(reader).toString() returns:
 body:"blueberry chocolate (pie tart)", where q1 is your first multi
 phrase query.
 q2.rewrite(reader).toString() returns:
 body:"blueberry chocolate pie" body:"blueberry chocolate tart", where
 q2 is your second boolean query.
 
 In practice... I *think* two queries may return same set of documents, but
 I'm not sure about scoring/ranking.
 
 I suggest you ask to java-user@lucene mailing list as for Lucene API.
 
 Regards,
 Tomoko
 
 
 
 2015-01-21 19:12 GMT+09:00 ku3ia <demesg@...>:
 
 Any ideas?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/MultiPhraseQuery-Rewrite-to-BooleanQuery-tp4180638p4180820.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Thanks, I'll try it.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/MultiPhraseQuery-Rewrite-to-BooleanQuery-tp4180638p4180887.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Errors using the Embedded Solr Server

2015-01-21 Thread Alan Woodward
That certainly looks like it ought to work.  Is there log output that you could 
show us as well?

Alan Woodward
www.flax.co.uk


On 21 Jan 2015, at 16:09, Carl Roberts wrote:

 Hi,
 
 I have downloaded the code and documentation for Solr version 4.10.3.
 
 I am trying to follow the SolrJ Wiki guide and I am running into errors.  The 
 latest error is this one:
 
 Exception in thread "main" org.apache.solr.common.SolrException: No such 
 core: db
	at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
	at solr.Test.main(Test.java:39)
 
 My code is this:
 
 package solr;
 
 import java.io.File;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
 
 import org.apache.solr.client.solrj.SolrServerException;
 import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
 import org.apache.solr.common.SolrInputDocument;
 import org.apache.solr.core.CoreContainer;
 import org.apache.solr.core.SolrCore;
 
 
 public class Test {
     public static void main(String[] args) {
         CoreContainer container = new CoreContainer("/Users/carlroberts/dev/solr-4.10.3");
         System.out.println(container.getDefaultCoreName());
         System.out.println(container.getSolrHome());
         container.load();
         System.out.println(container.isLoaded("db"));
         System.out.println(container.getCoreInitFailures());
         Collection<SolrCore> cores = container.getCores();
         System.out.println(cores);
         EmbeddedSolrServer server = new EmbeddedSolrServer( container, "db" );
         SolrInputDocument doc1 = new SolrInputDocument();
         doc1.addField( "id", "id1", 1.0f );
         doc1.addField( "name", "doc1", 1.0f );
         doc1.addField( "price", 10 );
         SolrInputDocument doc2 = new SolrInputDocument();
         doc2.addField( "id", "id2", 1.0f );
         doc2.addField( "name", "doc2", 1.0f );
         doc2.addField( "price", 20 );
         Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
         docs.add( doc1 );
         docs.add( doc2 );
         try {
             server.add( docs );
             server.commit();
             server.deleteByQuery( "*:*" );
         } catch (IOException e) {
             e.printStackTrace();
         } catch (SolrServerException e) {
             e.printStackTrace();
         }
     }
 }
 
 
 My solr.xml file is this:
 
 <?xml version="1.0" encoding="UTF-8" ?>
 <!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at
 
 http://www.apache.org/licenses/LICENSE-2.0
 
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 -->
 
 <!--
   This is an example of a simple solr.xml file for configuring one or
   more Solr Cores, as well as allowing Cores to be added, removed, and
   reloaded via HTTP requests.
 
   More information about options available in this configuration file,
   and Solr Core administration can be found online:
   http://wiki.apache.org/solr/CoreAdmin
 -->
 
 <solr>
   <cores adminPath="/admin/cores" defaultCoreName="db">
     <core default="true" instanceDir="db/" name="db"/>
   </cores>
 </solr>
 
 And my db/conf directory was copied from example/solr/collection/conf 
 directory and it contains the solrconfig.xml file and schema.xml file.
 
 I have noticed that the documentation that shows how to use the 
 EmbeddedSolrServer is outdated, as it indicates I should use the 
 CoreContainer.Initializer class, which doesn't exist, and 
 container.load(path, file), which also doesn't exist.
 
 At this point I have no idea why I am getting the "No such core" error. I 
 have googled it and there seem to be tons of threads showing this error, 
 but for different reasons, and I have tried all the suggested resolutions 
 and get nowhere with this.
 
 Can you please help?
 
 Regards,
 
 Joe



Re: Errors using the Embedded Solr Server

2015-01-21 Thread Alan Woodward
Aha, I think you're being stung by 
https://issues.apache.org/jira/browse/SOLR-6643, which will be fixed in the 
upcoming 5.0 release; or you can patch your system with the patch attached to 
that issue.

Alan Woodward
www.flax.co.uk


On 21 Jan 2015, at 19:44, Carl Roberts wrote:

 Already did.  And the logging gets me no closer to fixing the issue. Here is 
 the logging.
 
 [main] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader 
 for directory: '/Users/carlroberts/dev/solr-4.10.3/'
 [main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to 
 classloader
 [main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
 [main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to 
 classloader
 [main] INFO org.apache.solr.core.ConfigSolr - Loading container configuration 
 from /Users/carlroberts/dev/solr-4.10.3/solr.xml
 [main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 1727098510
 [main] INFO org.apache.solr.core.CoreContainer - Loading cores into 
 CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
 Setting socketTimeout to: 0
 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
 Setting urlScheme to: null
 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
 Setting connTimeout to: 0
 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
 Setting maxConnectionsPerHost to: 20
 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
 Setting corePoolSize to: 0
 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
 Setting maximumPoolSize to: 2147483647
 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
 Setting maxThreadIdleTime to: 5
 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
 Setting sizeOfQueue to: -1
 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
 Setting fairnessPolicy to: false
 [main] INFO org.apache.solr.update.UpdateShardHandler - Creating 
 UpdateShardHandler HTTP client with params: 
 socketTimeout=0&connTimeout=0&retry=false
 [main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is 
 org.slf4j.impl.SimpleLoggerFactory
 [main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
 [main] INFO org.apache.solr.core.CoreContainer - Host Name: null
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - Adding 
 specified lib dirs to ClassLoader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/dom4j-1.6.1.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/fontbox-1.8.4.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/icu4j-53.1.jar'
  to classloader
 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - 
 Adding 
 

permanently reducing logging levels for Solr

2015-01-21 Thread Nemani, Raj
All,

How can I reduce the logging level in Solr to SEVERE in a way that survives a 
Tomcat restart or a machine reboot?  As you may know, I can change the logging 
levels from the logging page in the admin console, but those changes are not 
persistent across a Tomcat server restart or machine reboot.
Following is the information about the Solr version from the Info page in the 
admin console.

Solr Specification Version: 3.2.0
Solr Implementation Version: 3.2.0 1129474 - rmuir - 2011-05-30 23:07:15
Lucene Specification Version: 3.2.0
Lucene Implementation Version: 3.2.0 1129474 - 2011-05-30 23:08:57

Please let me know if there is any other information that you may need.

Thank you in advance for your help

Raj



Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-21 Thread Shawn Heisey
On 1/21/2015 12:53 PM, Carl Roberts wrote:
 Is Solr a good candidate to index 100s of nodes in one XML file?

 I have an RSS feed XML file that has 100s of nodes with several
 elements in each node that I have to index, so I was planning to parse
 the XML with Stax and extract the data from each node and add it to
 Solr.  There will always be only one file to start with, and then a
 second file as the RSS feed supplies updates.  I want to return
 certain fields of each node when I search certain fields of the same
 node.  Is Solr overkill in this case?  Should I just use Lucene instead?

Effectively, Solr *is* Lucene.  You edit configuration files instead of
writing Lucene code, because Solr is a fully customizable search server,
not a programming API.  That also means that it's not as flexible as
Lucene ... but it's a lot easier.

If you're capable of writing Lucene code, chances are that you'll be
able to write an application that is highly tailored to your situation
that will have better performance than Solr ... but you'll be writing
the entire program yourself.  Solr lets you install an existing program
and just change the configuration.

Thanks,
Shawn



boosting by geodist - GC Overhead Limit exceeded

2015-01-21 Thread Mihran Shahinian
I am running Solr 4.10.2 with geofilt (~20% of docs have 30+ lat/lon
points) and everything works hunky-dory. Then I added a bf with geodist
along the lines of recip(geodist(),5,20,5), and after a few hours of
running I end up with an OOM: GC overhead limit exceeded. I've seen
https://issues.apache.org/jira/browse/LUCENE-4698 and a few other relevant
tickets. Wanted to check if anyone has any successful remedies.
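
For context, the request looks roughly like this (the query and location
parameters here are made up, not our real ones):

http://localhost:8983/solr/select?q=pizza&defType=edismax&fq={!geofilt}&sfield=store&pt=45.15,-93.85&d=50&bf=recip(geodist(),5,20,5)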

Many thanks,
Mihran

My gc params on amazon xl instance:
-server -Xmx8g -Xms8g
-XX:+HeapDumpOnOutOfMemoryError \
-XX:NewRatio=3 \
-XX:SurvivorRatio=4 \
-XX:TargetSurvivorRatio=90 \
-XX:MaxTenuringThreshold=8 \
-XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
-XX:+CMSScavengeBeforeRemark \
-XX:PretenureSizeThreshold=64m \
-XX:+UseCMSInitiatingOccupancyOnly \
-XX:CMSInitiatingOccupancyFraction=50 \
-XX:CMSMaxAbortablePrecleanTime=6000 \
-XX:+CMSParallelRemarkEnabled \
-XX:+ParallelRefProcEnabled

Screenshot from Eclipse Mat
[image: Inline image 1]


Re: Ignore whitespace, underscore using KeywordTokenizer... EdgeNGramFilter

2015-01-21 Thread Alvaro Cabrerizo
Hi,

Not sure, but I think that the PatternReplaceFilterFactory or
the PatternReplaceCharFilterFactory could help you delete those
characters.

Regards.
On Jan 21, 2015 7:59 PM, Vishal Swaroop vishal@gmail.com wrote:

 I am trying to implement type-ahead suggestion for a single field which
 should ignore whitespace, underscores or special characters in autosuggest.

 It works as suggested by Alex using KeywordTokenizerFactory but how do I
 ignore whitespace, underscores...

 Example itemName data can be :
 ABC E12 : if user types ABCE suggestion should be ABC E12
 ABCE_12 : if user types ABCE1 suggestion should be ABCE_12

 Schema.xml
 <field name="itemName" type="text_general_edge_ngram" indexed="true"
 stored="true" multiValued="false" />

 <fieldType name="text_general_edge_ngram" class="solr.TextField"
 positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
             maxGramSize="15" side="front"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
   </analyzer>
 </fieldType>



Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-21 Thread Alexandre Rafalovitch
Solr is just fine for this.

It even ships with an example of how to read an RSS file under the DIH
directory. DIH is also most likely what you will use for the first
implementation. You don't need to worry about Stax or anything, unless
your file format is very weird or has overlapping namespaces (the DIH XML
parser does not care about namespaces).
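
As a rough sketch (the URL and field names here are placeholders; the
rss-data-config.xml that ships with the example is the authoritative
version), a DIH data-config.xml for an RSS feed looks something like:

<dataConfig>
  <dataSource type="URLDataSource"/>
  <document>
    <entity name="item"
            processor="XPathEntityProcessor"
            url="http://example.com/feed.xml"
            forEach="/rss/channel/item">
      <field column="title" xpath="/rss/channel/item/title"/>
      <field column="link" xpath="/rss/channel/item/link"/>
      <field column="description" xpath="/rss/channel/item/description"/>
    </entity>
  </document>
</dataConfig>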

Regards,
  Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 21 January 2015 at 14:53, Carl Roberts carl.roberts.zap...@gmail.com wrote:
 Hi,

 Is Solr a good candidate to index 100s of nodes in one XML file?

 I have an RSS feed XML file that has 100s of nodes with several elements in
 each node that I have to index, so I was planning to parse the XML with Stax
 and extract the data from each node and add it to Solr.  There will always
 be only one file to start with, and then a second file as the RSS feed
 supplies updates.  I want to return certain fields of each node when I
 search certain fields of the same node.  Is Solr overkill in this case?
 Should I just use Lucene instead?

 Regards,

 Joe


RE: Ignore whitespace, underscore using KeywordTokenizer... EdgeNGramFilter

2015-01-21 Thread David M Giannone
This is what we use for our autosuggest field in Solr 3.4.  It works for us as 
you describe below.


<fieldType name="autocomplete_edge" class="solr.TextField">
    <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <charFilter class="solr.MappingCharFilterFactory"
                    mapping="mapping-ISOLatin1Accent.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])"
                replacement=" " replace="all"/>
        <filter class="solr.EdgeNGramFilterFactory" maxGramSize="30"
                minGramSize="1"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d])"
                replacement="" replace="all"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <charFilter class="solr.MappingCharFilterFactory"
                    mapping="mapping-ISOLatin1Accent.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])"
                replacement=" " replace="all"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d])"
                replacement="" replace="all"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{30})(.*)?"
                replacement="$1" replace="all"/>
    </analyzer>
</fieldType>



-Original Message-
From: Vishal Swaroop [mailto:vishal@gmail.com]
Sent: Wednesday, January 21, 2015 4:40 PM
To: solr-user@lucene.apache.org
Subject: Re: Ignore whitespace, underscore using KeywordTokenizer... 
EdgeNGramFilter

I tried adding PatternReplaceFilterFactory in the index section but it is not 
working

Example itemName data can be :
- ABC E12 : if user types ABCE suggestion should be ABC E12
- ABCE_12 : if user types ABCE1 suggestion should be ABCE_12

<field name="itemName" type="text_general_edge_ngram" indexed="true"
stored="true" multiValued="false" />

<fieldType name="text_general_edge_ngram" class="solr.TextField"
positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.PatternReplaceFilterFactory" pattern="(\s+)"
             replacement="" replace="all" />
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
             maxGramSize="15" side="front"/>
   </analyzer>

   <analyzer type="query">
     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
   </analyzer>
</fieldType>

On Wed, Jan 21, 2015 at 3:31 PM, Alvaro Cabrerizo topor...@gmail.com
wrote:

 Hi,

 Not sure, but I think that the PatternReplaceFilterFactory or the
 PatternReplaceCharFilterFactory could help you delete those
 characters.

 Regards.
 On Jan 21, 2015 7:59 PM, Vishal Swaroop vishal@gmail.com wrote:

  I am trying to implement type-ahead suggestion for a single field
  which should ignore whitespace, underscores or special characters in
 autosuggest.

  It works as suggested by Alex using KeywordTokenizerFactory but how
  do I ignore whitespace, underscores...
 
  Example itemName data can be :
  ABC E12 : if user types ABCE suggestion should be ABC E12
  ABCE_12 : if user types ABCE1 suggestion should be ABCE_12
 
  Schema.xml
  <field name="itemName" type="text_general_edge_ngram" indexed="true"
  stored="true" multiValued="false" />

  <fieldType name="text_general_edge_ngram" class="solr.TextField"
  positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
              maxGramSize="15" side="front"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    </analyzer>
  </fieldType>
 





Re: permanently reducing logging levels for Solr

2015-01-21 Thread Rajesh Hazari
Hi,

Just add log4j.logger.org.apache.solr=SEVERE to your log4j properties.

Thanks,
Rajesh
(mobile): 8328789519

On Wed, Jan 21, 2015 at 3:14 PM, Nemani, Raj raj.nem...@turner.com wrote:

 All,

 How can I reduce the logging level in Solr to SEVERE in a way that survives
 a Tomcat restart or a machine reboot?  As you may know, I can change the
 logging levels from the logging page in the admin console, but those changes
 are not persistent across a Tomcat server restart or machine reboot.
 Following is the information about the Solr version from the Info page in
 the admin console.

 Solr Specification Version: 3.2.0
 Solr Implementation Version: 3.2.0 1129474 - rmuir - 2011-05-30 23:07:15
 Lucene Specification Version: 3.2.0
 Lucene Implementation Version: 3.2.0 1129474 - 2011-05-30 23:08:57

 Please let me know if there is any other information that you may need.

 Thank you in advance for your help

 Raj




Re: boosting by geodist - GC Overhead Limit exceeded

2015-01-21 Thread Chris Hostetter
On Wed, 21 Jan 2015, Mihran Shahinian wrote:

: Date: Wed, 21 Jan 2015 16:06:18 -0600
: From: Mihran Shahinian slowmih...@gmail.com
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: boosting by geodist - GC Overhead Limit exceeded
: 
: I am running Solr 4.10.2 with geofilt (~20% of docs have 30+ lat/lon
: points) and everything works hunky-dory. Then I added a bf with geodist
: along the lines of recip(geodist(),5,20,5), and after a few hours of
: running I end up with an OOM: GC overhead limit exceeded. I've seen
: https://issues.apache.org/jira/browse/LUCENE-4698 and a few other relevant
: tickets. Wanted to check if anyone has any successful remedies.
: 
: Many thanks,
: Mihran
: 
: My gc params on amazon xl instance:
: -server -Xmx8g -Xms8g
: -XX:+HeapDumpOnOutOfMemoryError \
: -XX:NewRatio=3 \
: -XX:SurvivorRatio=4 \
: -XX:TargetSurvivorRatio=90 \
: -XX:MaxTenuringThreshold=8 \
: -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
: -XX:+CMSScavengeBeforeRemark \
: -XX:PretenureSizeThreshold=64m \
: -XX:+UseCMSInitiatingOccupancyOnly \
: -XX:CMSInitiatingOccupancyFraction=50 \
: -XX:CMSMaxAbortablePrecleanTime=6000 \
: -XX:+CMSParallelRemarkEnabled \
: -XX:+ParallelRefProcEnabled
: 
: Screenshot from Eclipse Mat
: [image: Inline image 1]
: 

-Hoss
http://www.lucidworks.com/


Re: Solr 4.10.3 start up issue

2015-01-21 Thread Shalin Shekhar Mangar
Hi Darren,

Can you please show the contents of the clusterstate.json from ZooKeeper?
Please use a github gist or a pastebin-like service. The Admin UI has a
dump screen which shows the entire content of ZooKeeper as JSON.

On Wed, Jan 21, 2015 at 6:15 PM, Darren Spehr darre...@gmail.com wrote:

 Hi everyone -

 I posted a question on stackoverflow but in hindsight this would have been
 a better place to start. Below is the link.

 Basically I can't get the example working when using an external ZK cluster
 and auto-core discovery. Solr 4.10.1 works fine, but the newest release
 never gets new nodes into the active state. There are no errors or
 warnings, and compared to the log output of 4.10.1, the difference is that
 nodes never make it to leader election.

 Here is the stackoverflow question, along with the full log output:

 http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs

 Any help and guidance would be appreciated. Thanks!

 --
 Darren




-- 
Regards,
Shalin Shekhar Mangar.


Re: Errors using the Embedded Solar Server

2015-01-21 Thread Shawn Heisey
On 1/21/2015 5:16 PM, Carl Roberts wrote:
 BTW - it seems that it is very hard to get started with the Embedded
 server.  The doc is out of date.  The code seems to be untested and buggy.

 On 1/21/15, 7:15 PM, Carl Roberts wrote:
 Hmmm... It looks like FutureTask is calling setException(Throwable t)
 with this exception which is not making it to the console.

 What I don't understand is why it is throwing that exception.  I made
 sure that I added lucene-queries-4.10.3.jar file to the classpath by
 adding it to the solr home directory.  See the new tracing:

I'm pretty sure that all the lucene jars need to be available *before*
Solr reaches the point in the log that you have quoted, where it adds
jars from ${solr.solr.home}/lib.  This would be the same location where
the solrj and solr-core jars live.  The only kind of jars that should be
in the solr home lib directory are extra jars for extra features that
you might specify in schema.xml (or some places in solrconfig.xml), like
the ICU analysis jars, tika, mysql, etc.
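
For reference, once solr-core, solrj, and the lucene jars are all on the
application classpath, a minimal embedded setup looks something like this
(paths and core name are illustrative, not your exact setup):

    // assumes ${solr.solr.home} contains solr.xml and a core named "db"
    CoreContainer container = new CoreContainer("/path/to/solr/home");
    container.load();
    EmbeddedSolrServer server = new EmbeddedSolrServer(container, "db");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");
    server.add(doc);     // index one document
    server.commit();     // make it visible
    server.shutdown();   // release the container on exit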

Thanks,
Shawn



Re: Issue with Solr multiple sort

2015-01-21 Thread Chris Hostetter
:   I'm  facing a problem with multiple field sort in Solr. I'm using the
: following fields in sort :
: 
: PublishDate asc,DocumentType asc

correction: you are using: PublishDate desc,DocumentType desc

: The sort is only happening on PublishDate, DocumentType seems to be completely
: ignored. Here's my field type definition.

the results you posted are perfectly sorted according to the criteria in 
your URL... 

2015-01-17, 2014-11-17, 2013-01-17, 2012-10-17, 2012-01-17, 2011-01-17, 
then 2 docs from 2006-01-17 correctly ordered by secondary sort: O 
before H.

...did you not post the query/results you meant to post?  what exactly 
about the result ordering you are getting do you think is 
incorrect?

: <result name="response" numFound="8" start="0">
: <doc>
: <date name="PublishDate">2015-01-17T00:00:00Z</date>
: <str name="DocumentType">Hotfixes</str>
: </doc>
: <doc>
: <date name="PublishDate">2014-11-17T00:00:00Z</date>
: <str name="DocumentType">Hotfixes</str>
: </doc>
: <doc>
: <date name="PublishDate">2013-01-17T00:00:00Z</date>
: <str name="DocumentType">Tutorials</str>
: </doc>
: <doc>
: <date name="PublishDate">2012-10-17T00:00:00Z</date>
: <str name="DocumentType">Service Packs</str>
: </doc>
: <doc>
: <date name="PublishDate">2012-01-17T00:00:00Z</date>
: <str name="DocumentType">Tutorials</str>
: </doc>
: <doc>
: <date name="PublishDate">2011-01-17T00:00:00Z</date>
: <str name="DocumentType">Tutorials </str>
: </doc>
: <doc>
: <date name="PublishDate">2006-01-17T00:00:00Z</date>
: <str name="DocumentType">Object Enablers</str>
: </doc>
: <doc>
: <date name="PublishDate">2006-01-17T00:00:00Z</date>
: <str name="DocumentType">Hotfixes</str>
: </doc>
: </result>
: 
: As you can see, the sorting happened only on PublishDate. I'm using Solr
: 4.7.
: 
: Not sure what I'm missing here, any pointers will be appreciated.
: 
: Thanks,
: Shamik
: 

-Hoss
http://www.lucidworks.com/


Re: Solr Recovery process

2015-01-21 Thread Shalin Shekhar Mangar
Hi Nishanth,

The recovery happens as follows:

1. PeerSync is attempted first. If the number of new updates on leader is
less than 100 then the missing documents are fetched directly and indexed
locally. The tlog tells us the last 100 updates very quickly. Other uses of
the tlog are for durability of updates and of course, startup recovery.
2. If the above step fails then replication recovery is attempted. A hard
commit is called on the leader and then the leader is polled for the latest
index version and generation. If the leader's version and generation are
greater than local index's version/generation then the difference of the
index files between leader and replica are fetched and installed.
3. If the above fails (because leader's version/generation is somehow equal
or more than local) then a full index recovery happens and the entire index
from the leader is fetched and installed locally.

There are some other details involved in this process too but probably not
worth going into here.

On Wed, Jan 21, 2015 at 5:13 PM, Nishanth S nishanth.2...@gmail.com wrote:

 Hello Everyone,

 I am hitting a few issues with solr replicas going into recovery and then
 doing a full index copy. I am trying to understand the solr recovery
 process. I have read a few blogs on this and saw that when the leader
 notifies a replica to recover (in my case it is due to connection resets)
 it will try to do a peer sync first, and if the missed updates are more
 than 100 it will do a full index copy from the leader. I am trying to
 understand what peer sync is and where the tlog comes into the picture.
 Are tlogs replayed only during server restart? Can someone help me with this?

 Thanks,
 Nishanth




-- 
Regards,
Shalin Shekhar Mangar.


Re: Ignore whitesapce, underscore using KeywordTokenizer... EdgeNGramFilter

2015-01-21 Thread Alvaro Cabrerizo
Hi Vishal,

Maybe this pattern can help you (the conf attached by David is really
nice):

... pattern="(\s)+" replacement="" replace="all"/

Hope it helps.

On Wed, Jan 21, 2015 at 10:57 PM, David M Giannone david.giann...@gm.com
wrote:

 This is what we use for our autosuggest field in Solr 3.4.  It works for
 us as you describe below.


 <fieldType name="autocomplete_edge" class="solr.TextField">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <charFilter class="solr.MappingCharFilterFactory"
                 mapping="mapping-ISOLatin1Accent.txt"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])"
             replacement=" " replace="all"/>
     <filter class="solr.EdgeNGramFilterFactory" maxGramSize="30"
             minGramSize="1"/>
     <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d])"
             replacement="" replace="all"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <charFilter class="solr.MappingCharFilterFactory"
                 mapping="mapping-ISOLatin1Accent.txt"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])"
             replacement=" " replace="all"/>
     <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d])"
             replacement="" replace="all"/>
     <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{30})(.*)?"
             replacement="$1" replace="all"/>
   </analyzer>
 </fieldType>



 -Original Message-
 From: Vishal Swaroop [mailto:vishal@gmail.com]
 Sent: Wednesday, January 21, 2015 4:40 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Ignore whitespace, underscore using KeywordTokenizer...
 EdgeNGramFilter

 I tried adding PatternReplaceFilterFactory in the index section but it is
 not working

 Example itemName data can be :
 - ABC E12 : if user types ABCE suggestion should be ABC E12
 - ABCE_12 : if user types ABCE1 suggestion should be ABCE_12

 <field name="itemName" type="text_general_edge_ngram" indexed="true"
 stored="true" multiValued="false" />

 <fieldType name="text_general_edge_ngram" class="solr.TextField"
 positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.PatternReplaceFilterFactory" pattern="(\s+)"
             replacement="" replace="all" />
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
             maxGramSize="15" side="front"/>
   </analyzer>

   <analyzer type="query">
     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
   </analyzer>
 </fieldType>

 On Wed, Jan 21, 2015 at 3:31 PM, Alvaro Cabrerizo topor...@gmail.com
 wrote:

  Hi,
 
  Not sure, but I think that the PatternReplaceFilterFactory or the
  PatternReplaceCharFilterFactory could help you delete those
  characters.
 
  Regards.
  On Jan 21, 2015 7:59 PM, Vishal Swaroop vishal@gmail.com wrote:
 
   I am trying to implement type-ahead suggestion for a single field
   which should ignore whitespace, underscores or special characters in
  autosuggest.

   It works as suggested by Alex using KeywordTokenizerFactory but how
   do I ignore whitespace, underscores...
  
   Example itemName data can be :
   ABC E12 : if user types ABCE suggestion should be ABC E12
   ABCE_12 : if user types ABCE1 suggestion should be ABCE_12
  
   Schema.xml
   <field name="itemName" type="text_general_edge_ngram" indexed="true"
   stored="true" multiValued="false" />

   <fieldType name="text_general_edge_ngram" class="solr.TextField"
   positionIncrementGap="100">
     <analyzer type="index">
       <tokenizer class="solr.KeywordTokenizerFactory"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
               maxGramSize="15" side="front"/>
     </analyzer>
     <analyzer type="query">
       <tokenizer class="solr.LowerCaseTokenizerFactory"/>
     </analyzer>
   </fieldType>
  
 





Issue with Solr multiple sort

2015-01-21 Thread Shamik Bandopadhyay
Hi,

  I'm facing a problem with multiple field sort in Solr. I'm using the
following fields in sort:

PublishDate asc,DocumentType asc

The sort is only happening on PublishDate, DocumentType seems to be completely
ignored. Here's my field type definition.

<field name="PublishDate" type="tdate" indexed="true" stored="true"
default="NOW"/>
<field name="DocumentType" type="string" indexed="true" stored="true"
multiValued="false" required="false" omitNorms="true"/>

Here's the sample query:

http://localhost:8983/solr/select?sort=PublishDate+desc%2CDocumentType+desc&q=cat:search&fl=PublishDate,DocumentType&debugQuery=true

Here's the output :

<result name="response" numFound="8" start="0">
<doc>
<date name="PublishDate">2015-01-17T00:00:00Z</date>
<str name="DocumentType">Hotfixes</str>
</doc>
<doc>
<date name="PublishDate">2014-11-17T00:00:00Z</date>
<str name="DocumentType">Hotfixes</str>
</doc>
<doc>
<date name="PublishDate">2013-01-17T00:00:00Z</date>
<str name="DocumentType">Tutorials</str>
</doc>
<doc>
<date name="PublishDate">2012-10-17T00:00:00Z</date>
<str name="DocumentType">Service Packs</str>
</doc>
<doc>
<date name="PublishDate">2012-01-17T00:00:00Z</date>
<str name="DocumentType">Tutorials</str>
</doc>
<doc>
<date name="PublishDate">2011-01-17T00:00:00Z</date>
<str name="DocumentType">Tutorials </str>
</doc>
<doc>
<date name="PublishDate">2006-01-17T00:00:00Z</date>
<str name="DocumentType">Object Enablers</str>
</doc>
<doc>
<date name="PublishDate">2006-01-17T00:00:00Z</date>
<str name="DocumentType">Hotfixes</str>
</doc>
</result>

As you can see, the sorting happened only on PublishDate. I'm using Solr
4.7.

Not sure what I'm missing here, any pointers will be appreciated.

Thanks,
Shamik


Re: Solr 4.10.3 start up issue

2015-01-21 Thread Chris Hostetter

: I posted a question on stackoverflow but in hindsight this would have been
: a better place to start. Below is the link.
: 
: Basically I can't get the example working when using an external ZK cluster
: and auto-core discovery. Solr 4.10.1 works fine, but the newest release

your SO URL shows the output of using your custom configs, but not what 
you got with the example configs -- so it's not clear to me if there is 
really just one problem, or perhaps 2?

you also mentioned a lot of details about how you are using solr with zk, 
and what doesn't work, but it's not clear if you tried other simpler steps 
using your configs -- or the example configs -- and if those simpler *did* 
work (ie: single node solr startup?)

my best guess, based on the logs you did post and the mention of 
lib/mq/solr-search-ahead-2.0.0.jar in those logs, is that the entire 
question of zk and cluster state and leaders is a red herring, and what 
you are running into is: SOLR-6643...

https://issues.apache.org/jira/browse/SOLR-6643

...if I'm right, then simple core discovery with your configs on a single 
node solr instance w/o any knowledge of ZK will also fail to init the core 
-- and if you try to use the CoreAdmin API to CREATE a core, you'll get 
some kind of LinkageError.




: Here is the stackoverflow question, along with the full log output:
: 
http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs


-Hoss
http://www.lucidworks.com/


Re: permanently reducing logging levels for Solr

2015-01-21 Thread Shawn Heisey
On 1/21/2015 1:14 PM, Nemani, Raj wrote:
 How can I reduce the logging level in Solr to SEVERE in a way that survives
 a Tomcat restart or a machine reboot?  As you may know, I can change the
 logging levels from the logging page in the admin console, but those changes
 are not persistent across a Tomcat server restart or machine reboot.
 Following is the information about the Solr version from the Info page in
 the admin console.

 Solr Specification Version: 3.2.0
 Solr Implementation Version: 3.2.0 1129474 - rmuir - 2011-05-30 23:07:15
 Lucene Specification Version: 3.2.0
 Lucene Implementation Version: 3.2.0 1129474 - 2011-05-30 23:08:57

 Please let me know if there is any other information that you may need.

 Thank you in advance for your help

The Solr 3.x example uses java.util.logging, not the log4j that was
introduced in the example for 4.3.0.  Your other reply talks about
log4j, which may not be the right framework for your install.

I have no way to know what container or logging framework you're using. 
You will need to create a configuration file for whatever slf4j binding
is in use on your install and most likely add a system property to your
java commandline for startup so that your logging config gets used.  If
you're using java.util.logging, look for help with the
java.util.logging.config.file system property.
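
For example (a rough sketch, untested -- adjust the path and handlers to
your setup), a logging.properties for java.util.logging could look like:

.level = WARNING
org.apache.solr.level = WARNING
handlers = java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level = WARNING

and Tomcat would then be started with something like:

-Djava.util.logging.config.file=/path/to/logging.properties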

FYI -- if you reduce the logging level to WARN, a normally functioning
Solr will log almost nothing, and you'll be able to see ERROR and WARN
messages, which is extremely important for troubleshooting.  Dropping
the level to SEVERE is not necessary, and will make it impossible to
tell what happened when something goes wrong.

Thanks,
Shawn



Re: permanently reducing logging levels for Solr

2015-01-21 Thread Shawn Heisey
On 1/21/2015 7:24 PM, Shawn Heisey wrote:
 I have no way to know what container or logging framework you're using. 

Followup on this:

Unless you have modified the solr war for version 3.2.0 to change the
logging jars, you will definitely be using java.util.logging.  Here are
some URLs that may offer insight on the config file you'll need:

http://www.javapractices.com/topic/TopicAction.do?Id=143

http://tutorials.jenkov.com/java-logging/configuration.html

http://www.java2s.com/Code/Java/Language-Basics/ConfiguringLoggerDefaultValueswithaPropertiesFile.htm

Thanks,
Shawn



Re: Solr 4.10.3 start up issue

2015-01-21 Thread Darren Spehr
Thanks Hoss, this is exactly what I needed. I had previously run the
example using nothing more than an external ZK hosting my own
configuration. This of course means one of two things - my conf was bad, or
Solr was at fault. The conf has been working for ages so I didn't test a
replacement (it's amazing how a little frustration can fuel such hubris). I
had thought to do this before - and should have; I uploaded the full
example collection configuration to ZK just now and tried again. Magic, it
worked, which left me feeling a bit glum. Well, happy that it wasn't Solr.
Now if you'll excuse me, I have a conf review to perform.

Darren

On Wed, Jan 21, 2015 at 6:48 PM, Chris Hostetter hossman_luc...@fucit.org
wrote:


 : I posted a question on stackoverflow but in hindsight this would have
 been
 : a better place to start. Below is the link.
 :
 : Basically I can't get the example working when using an external ZK
 cluster
 : and auto-core discovery. Solr 4.10.1 works fine, but the newest release

 your SO URL shows the output of using your custom configs, but not what
 you got with the example configs -- so it's not clear to me if there is
 really just one problem, or perhaps 2?

 you also mentioned a lot of details about how you are using solr with zk,
 and what doesn't work, but it's not clear if you tried other simpler steps
 using your configs -- or the example configs -- and if those simpler *did*
 work (ie: single node solr startup?)

 my best guess, based on the logs you did post and the mention of
 lib/mq/solr-search-ahead-2.0.0.jar in those logs, is that the entire
 question of zk and cluster state and leaders is a red herring, and what
 you are running into is: SOLR-6643...

 https://issues.apache.org/jira/browse/SOLR-6643

 ...if I'm right, then simple core discovery with your configs on a single
 node solr instance w/o any knowledge of ZK will also fail to init the core
 -- and if you try to use the CoreAdmin API to CREATE a core, you'll get
 some kind of LinkageError.




 : Here is the stackoverflow question, along with the full log output:
 :
 http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs


 -Hoss
 http://www.lucidworks.com/




-- 
Darren


If I change schema.xml then reIndex is neccessary in Solr or not?

2015-01-21 Thread Nitin Solanki
I indexed 2GB of data. Now I want to change the type of a field
from textSpell to string in schema.xml.
A detailed explanation is on Stackoverflow. Below is the link:

http://stackoverflow.com/questions/28072109/if-i-change-schema-xml-then-reindex-is-neccessary-in-solr-or-not/28073815#28073815


Re: If I change schema.xml then reIndex is neccessary in Solr or not?

2015-01-21 Thread Gora Mohanty
On 22 January 2015 at 11:23, Nitin Solanki nitinml...@gmail.com wrote:
 I indexed 2GB of data. Now I want to change the type of a field
 from textSpell to string in schema.xml.

Yes, one would need to reindex.

Regards,
Gora


Re: Errors using the Embedded Solar Server

2015-01-21 Thread Carl Roberts

Hi Shawn,

Many thanks for all your help.  Moving the lucene JARs from 
solr.solr.home/lib to the same classpath directory as the solr JARs, plus 
adding a bunch more dependency JAR files and most of the files from the 
collection1/conf directory (these ones, to be exact), has me a lot closer 
to my goal:


-rw-r--r--   1 carlroberts  staff 38 Jan 21 20:41 _rest_managed.json
-rw-r--r--   1 carlroberts  staff 56 Jan 21 20:41 
_schema_analysis_stopwords_english.json

-rw-r--r--   1 carlroberts  staff   4041 Dec 10 00:37 currency.xml
-rw-r--r--   1 carlroberts  staff   1386 Dec 10 00:37 elevate.xml
drwxr-xr-x  41 carlroberts  staff   1394 Dec 10 00:37 lang
-rw-r--r--   1 carlroberts  staff894 Dec 10 00:37 protwords.txt
-rw-r--r--@  1 carlroberts  staff  62063 Jan 21 13:02 schema.xml
-rw-r--r--@  1 carlroberts  staff  76821 Jan 21 13:03 solrconfig.xml
-rw-r--r--   1 carlroberts  staff 16 Dec 10 00:37 spellings.txt
-rw-r--r--   1 carlroberts  staff795 Dec 10 00:37 stopwords.txt
-rw-r--r--   1 carlroberts  staff   1148 Dec 10 00:37 synonyms.txt


I am now getting this:

[main] INFO org.apache.solr.core.SolrResourceLoader - new 
SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to 
classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to 
classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container 
configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml

[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 139145087
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into 
CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - 
Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating 
UpdateShardHandler HTTP client with params: 
socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is 
org.slf4j.impl.SimpleLoggerFactory

[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for 
directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 

Re: Errors using the Embedded Solar Server

2015-01-21 Thread Shawn Heisey
On 1/21/2015 7:02 PM, Carl Roberts wrote:
 Got it all working...:)
 
 I just replaced the solrconfig.xml and schema.xml files that I was using
 with the ones from collection1 in one of the examples.  I had modified
 those files to remove certain sections which I thought were not needed
 and apparently I don't understand those files very well yet...:)

Glad you got it working.  Here's the problem.  In that log you included,
the error was:

ERROR org.apache.solr.core.SolrCore -
org.apache.solr.common.SolrException: undefined field text

Your solrconfig.xml file referenced a field named "text" (probably in
the df parameter of a request handler) ... but your schema.xml did not
have that field defined.
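
In other words, a request handler default like this in solrconfig.xml (a
sketch, not your exact config):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">text</str>
  </lst>
</requestHandler>

requires a matching field in schema.xml, for example:

<field name="text" type="text_general" indexed="true" stored="false"
multiValued="true"/>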

Thanks,
Shawn



Re: Errors using the Embedded Solar Server

2015-01-21 Thread Carl Roberts
Ah - OK - let me try that.   BTW - I applied the fix from the bug link 
you gave me to log the errors and I am now at least getting the actual 
errors:


default core name=db
solr home=/Users/carlroberts/dev/solr-4.10.3/
db is loaded=false
core init 
failures={db=org.apache.solr.core.CoreContainer$CoreLoadFailure@4d351f9b}

cores=[]
Exception in thread main org.apache.solr.common.SolrException: 
SolrCore 'db' is not available due to init failure: JVM Error creating 
core [db]: org/apache/lucene/queries/function/ValueSource

at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:749)
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:110)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)

at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at solr.Test.main(Test.java:38)
Caused by: org.apache.solr.common.SolrException: JVM Error creating core 
[db]: org/apache/lucene/queries/function/ValueSource

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:508)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: 
org/apache/lucene/queries/function/ValueSource

at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:484)
at 
org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:521)
at 
org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:517)
at 
org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:81)
at 
org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
at 
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)

at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:486)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:166)
at 
org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
at 
org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
at 
org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:90)
at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:62)

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:489)
... 6 more
Caused by: java.lang.ClassNotFoundException: 
org.apache.lucene.queries.function.ValueSource

at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 21 more
On 1/21/15, 7:32 PM, Shawn Heisey wrote:

On 1/21/2015 5:16 PM, Carl Roberts wrote:

BTW - it seems that it is very hard to get started with the Embedded
server.  The doc is out of date.  The code seems to be untested and buggy.

On 1/21/15, 7:15 PM, Carl Roberts wrote:

Hmmm... It looks like FutureTask is calling setException(Throwable t)
with this exception which is not making it to the console.

What I don't understand is why it is throwing that exception.  I made
sure that I added lucene-queries-4.10.3.jar file to the classpath by
adding it to the solr home directory.  See the new tracing:

I'm pretty sure that all the lucene jars need to be available *before*
Solr reaches the point in the log that you have quoted, where it adds
jars from ${solr.solr.home}/lib.  This would be the same location where
the solrj and solr-core jars live.  The only kind of jars that should be
in the solr home lib directory are extra jars for extra features that
you might specify in schema.xml (or some places in solrconfig.xml), like
the ICU analysis jars, tika, mysql, etc.

Thanks,
Shawn





Re: Issue with Solr multiple sort

2015-01-21 Thread shamik
Thanks Hoss for clearing up my doubt. I was confused with the ordering. So I
guess the first field is always the primary sort field, followed by the
secondary.
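
(For example, with sort=PublishDate desc,DocumentType desc, DocumentType
only comes into play to break ties between documents that share the same
PublishDate.)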

Thanks again.





Re: Errors using the Embedded Solar Server

2015-01-21 Thread Carl Roberts

Got it all working...:)

I just replaced the solrconfig.xml and schema.xml files that I was using 
with the ones from collection1 in one of the examples.  I had modified 
those files to remove certain sections which I thought were not needed 
and apparently I don't understand those files very well yet...:)


Many thanks,

Joe

On 1/21/15, 8:47 PM, Carl Roberts wrote:

Hi Shawn,

Many thanks for all your help.  Moving the lucene JARs from 
solr.solr.home/lib to the same classpath directory as the solr JARs 
plus adding a bunch more dependency JAR files and most of the files 
from the collection1/conf directory - these ones to be exact, has me a 
lot closer to my goal:


-rw-r--r--   1 carlroberts  staff 38 Jan 21 20:41 _rest_managed.json
-rw-r--r--   1 carlroberts  staff 56 Jan 21 20:41 
_schema_analysis_stopwords_english.json

-rw-r--r--   1 carlroberts  staff   4041 Dec 10 00:37 currency.xml
-rw-r--r--   1 carlroberts  staff   1386 Dec 10 00:37 elevate.xml
drwxr-xr-x  41 carlroberts  staff   1394 Dec 10 00:37 lang
-rw-r--r--   1 carlroberts  staff894 Dec 10 00:37 protwords.txt
-rw-r--r--@  1 carlroberts  staff  62063 Jan 21 13:02 schema.xml
-rw-r--r--@  1 carlroberts  staff  76821 Jan 21 13:03 solrconfig.xml
-rw-r--r--   1 carlroberts  staff 16 Dec 10 00:37 spellings.txt
-rw-r--r--   1 carlroberts  staff795 Dec 10 00:37 stopwords.txt
-rw-r--r--   1 carlroberts  staff   1148 Dec 10 00:37 synonyms.txt



Re: Solr Recovery process

2015-01-21 Thread Nishanth S
Thank you Shalin.  So in a system where the indexing rate is more than 5K TPS
or so, the replica will never be able to recover through the peer sync
process.  In my case I have mostly seen step 3, where a full copy happens,
and if the index size is huge it takes a very long time for replicas to
recover.  Is there a way we can configure the number of missed updates for
peer sync?
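
(From poking around solrconfig.xml, I wonder whether the updateLog section
is the right knob here -- something like the sketch below, assuming the
release in use actually supports the numRecordsToKeep/maxNumLogsToKeep
settings:

<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">500</int>
  <int name="maxNumLogsToKeep">20</int>
</updateLog>
)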

Thanks,
Nishanth

On Wed, Jan 21, 2015 at 4:47 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 Hi Nishanth,

 The recovery happens as follows:

 1. PeerSync is attempted first. If the number of new updates on leader is
 less than 100 then the missing documents are fetched directly and indexed
 locally. The tlog tells us the last 100 updates very quickly. Other uses of
 the tlog are for durability of updates and of course, startup recovery.
 2. If the above step fails then replication recovery is attempted. A hard
 commit is called on the leader and then the leader is polled for the latest
 index version and generation. If the leader's version and generation are
 greater than local index's version/generation then the difference of the
 index files between leader and replica are fetched and installed.
 3. If the above fails (because leader's version/generation is somehow equal
 or more than local) then a full index recovery happens and the entire index
 from the leader is fetched and installed locally.

 There are some other details involved in this process too but probably not
 worth going into here.

 On Wed, Jan 21, 2015 at 5:13 PM, Nishanth S nishanth.2...@gmail.com
 wrote:

  Hello Everyone,
 
  I am hitting a few issues with solr replicas going into recovery and then
  doing a full index copy.I am trying to understand the solr recovery
  process.I have read a few blogs  on this and saw  that when leader
 notifies
  a replica to  recover(in my case it is due to connection resets) it will
  try to do a peer sync first and  if the missed updates are more than 100
 it
  will do a full index copy from the leader.I am trying to understand what
  peer sync is and where does tlog come into picture.Are tlogs replayed
 only
  during server restart?.Can some one  help me with this?
 
  Thanks,
  Nishanth
 



 --
 Regards,
 Shalin Shekhar Mangar.



RE: Field collapsing memory usage

2015-01-21 Thread Toke Eskildsen
Norgorn [lsunnyd...@mail.ru] wrote:
 So, as we see, the memory used by the first shard for grouping wasn't released.
 Caches are already nearly zero.

It should be one or the other: Either the memory is released or there is 
something in the caches. Anyway, DocValues is the way to go, so ensure that it 
is turned on for your group field: We do grouping on indexes with 250M documents 
(and 200M+ unique values in the group field) without any significant memory 
overhead, using DocValues.
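
For example (the field name is illustrative, and adding docValues requires a
reindex):

<field name="mygroupfield" type="string" indexed="true" stored="false"
docValues="true"/>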

Caveat: If you ask for very large result sets, the memory usage will be high. 
But only temporarily.

- Toke Eskildsen


Field collapsing memory usage

2015-01-21 Thread Norgorn
We are trying to run SOLR with a big index, using as little RAM as possible.
Simple search works nicely for our cases, but field collapsing (group=true)
queries fail with OOM.

Our setup is several shards per SOLR instance, each shard on its own HDD.
We've tried the same queries against one specific shard, and those queries
worked well (no OOMs).

Then we changed the shard being queried and measured RAM usage. We saw that
while only one shard was being queried, used RAM increased
significantly.

So, as we see, the memory used by the first shard for grouping wasn't released.
Caches are already nearly zero.

By changing shards, we've managed to make SOLR fall over.

My question is: why is it so? What do we need to do to release memory, so
that, in the end, we can query shards alternately (because a parallel group
query fails nearly always)?
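
(For reference, the failing requests are plain field collapsing queries,
roughly of the shape q=...&group=true&group.field=ourfield&rows=10 -- the
field name here is just a stand-in.)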





Re: If I change schema.xml then reIndex is neccessary in Solr or not?

2015-01-21 Thread Nitin Solanki
Ok. Thanx

On Thu, Jan 22, 2015 at 11:38 AM, Gora Mohanty g...@mimirtech.com wrote:

 On 22 January 2015 at 11:23, Nitin Solanki nitinml...@gmail.com wrote:
  I indexed 2GB of data. Now I want to change the type of a field
  from textSpell to string in schema.xml.

 Yes, one would need to reindex.

 Regards,
 Gora



Re: MultiPhraseQuery:Rewrite to BooleanQuery

2015-01-21 Thread ku3ia
Any ideas?





Re: Using SolrCloud to implement a kind of federated search

2015-01-21 Thread Toke Eskildsen
On Tue, 2015-01-20 at 15:41 +0100, Jürgen Wagner (DVT) wrote:

[Snip: Valid concerns]

 3. Cardinality: there may be rather large collections and some smaller
 collections in the federation. If you use SolrCloud to obtain results,
 the ones from smaller collections will get more significance in the
 result mixing than the ones from the larger collections, as relevance
 will be relative to each federated source.

The math might be solvable or at least fuzzy solvable: SOLR-1632 takes
care of unifying term stats and site-specific boosts, defined in the
merger, can compensate somewhat for overall score-adjustments from the
different sites.

 4. Uniqueness: different systems may index the same documents. The
 idea of having a globally unique identifier should take this into
 account, i.e., it won't suffice to simply prefix each (locally unique)
 document id with a source identifier. The federated sources must be
 aware of being federated and possibly having overlaps. Otherwise, you
 will get multiple occurrences of very popular documents.

Different sources might have different meta-data on the same entity.
Some sort of nearly-duplicate-document-merge might be preferable.
 
 6. Orchestration: there will be some issues with the orchestration of
 these services. Zookeeper won't scale to the multiple datacenter
 topology, effectively leaving node discovery to some other mechanism
 yet to be defined.

If the nodes are locally run proxies exposed as a Solr shard, the
connection details will be de-coupled from ZooKeeper. That would also
allow for mapping of field names & values and similar site-specific
adjustments of requests & queries.

 In my experience, there is a clear distinction between technical 
 federated search (possibly something like the tribe nodes) and 
 semantic federated search (requiring special processing of results 
 obtained from different sources, ready to be consolidated).

We have spent a fair amount of time getting semantic federated search
(we call it integrated search) to work across our sources. The raw
requesting & merging is not too hard: Most of the development time has
been spent mapping values and adjusting how the merger should order the
documents.

- Toke Eskildsen, State and University Library, Denmark





Re: shards per disk

2015-01-21 Thread Toke Eskildsen
On Wed, 2015-01-21 at 09:46 +0100, Toke Eskildsen wrote:
 Anyway, RAID 0 does really help for random access, [...]

Should have been ...does not really help

- Toke Eskildsen




Re: shards per disk

2015-01-21 Thread Toke Eskildsen
On Wed, 2015-01-21 at 07:56 +0100, Nimrod Cohen wrote:
 RAID [0] configuration
 
 each shard has data on each one of the 8 disks in the RAID; on each
 query to get 1K docs, each shard requests data from the one RAID
 disk, so we get 8 requests to get data from all of the disks and we get
 a queue.

Your RAID-setup (whether it is hardware or software) should use a
parallel queue, so that requests to different physical drives are issued
in parallel under the hood. But RAID is not that well-defined, so maybe
your controller or your software uses a single sequential queue. In that
case, the pattern will be as you describe.

Anyway, RAID 0 does really help for random access, when your access
pattern is homogeneous across shards. Even if you fix the problem with
your current RAID 0 setup, it is unlikely that you would get a
noticeable performance advantage over separate drives. It would make it
easier to add shards though, as you would not have to purchase a new
drive or unbalance your setup by running multiple shards on some drives.

 Regarding the response time, 2-3 seconds is good for our usage, though
 getting better is always better; if we get better performance we might run
 the analysis on more than 1K.

Limit the number of fields you request and try experimenting with SolrJ
and the binary protocol: I have found that the time for serializing the
result to XML can be quite high for large responses.
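
A rough SolrJ sketch (URL, query, and field names are placeholders; javabin
is SolrJ's default wire format, set explicitly here just to be clear):

    HttpSolrServer server =
        new HttpSolrServer("http://localhost:8983/solr/collection1");
    server.setParser(new BinaryResponseParser()); // javabin instead of XML
    SolrQuery query = new SolrQuery("your query");
    query.setFields("id");  // request as few fields as possible
    query.setRows(1000);
    QueryResponse response = server.query(query);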

If the number of fields needed is very low and the content of those
fields is not large, you could try using faceting with DocValues to get
the content.


- Toke Eskildsen, State and University Library, Denmark





Add user-defined field into suggestions block.

2015-01-21 Thread Nitin Solanki
I am working on solr spell checker along with suggester. I am saving
document like this :

{ngram:the,count:10}
{ngram:the age,count:5}
{ngram:the age of,count:3}

where *ngram* is unique key and applied *StandardTokenizer* and
*ShingleFactoryFilter*(1 to 5 size).

So, when I search word *the* it gives results along with suggetions like
:

"response":{"numFound":63,"start":0,"maxScore":15.783233,"docs":[
  {
    "count":10,
    "gram":"the",
    "_version_":1489726792958738435}
  },

"suggestion":[{
    "word":"that",
    "freq":1169},
  {
    "word":"they",
    "freq":712}]
   ]

So, the suggestion block gives *word* and *freq* fields. I want to
*add one more field* - *count* - into the suggestion block, where
*count* should have the same value that is stored in the documents. I
don't want to use the *freq* field in the suggestion block; instead I
want the *count* field. How can I do that?


How much maximum data can we hard commit in Solr?

2015-01-21 Thread Nitin Solanki
How much data can we hard commit to Solr at most, without using soft
commits?
maxTime is 1000 in autoCommit

Details explanation is on Stackoverflow
http://stackoverflow.com/questions/28067853/how-much-maximum-data-can-we-hard-commit-in-solr
.


Re: How to make edge_ngram work with number, underscores, dashes and space

2015-01-21 Thread Vishal Swaroop
Thanks a lot Alex...

It looks like it works as expected... I removed EdgeNGramFilterFactory
from the query section and used KeywordTokenizerFactory in the index
analyzer... this is the final version:

<fieldType name="text_general_edge_ngram" class="solr.TextField"
positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
             maxGramSize="15" side="front"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
   </analyzer>
</fieldType>
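
If I read that chain correctly, indexing ABC_12DE now produces the edge
grams a, ab, abc, abc_, abc_1, ... abc_12de, so typed prefixes can
match directly. One thing to watch: LowerCaseTokenizerFactory on the
query side divides text at non-letters, so a typed ABC_1 becomes the
tokens abc and 1 rather than the single token abc_1.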

So... when is it right to use <tokenizer
class="solr.EdgeNGramTokenizerFactory"/>?



On Tue, Jan 20, 2015 at 11:46 PM, Alexandre Rafalovitch arafa...@gmail.com
wrote:

 So, try the suggested tokenizers and dump the ngrams from query. See
 what happens. Ask a separate question with corrected config/output if
 you still have issues.

 Regards,
Alex.
 
 Sign up for my Solr resources newsletter at http://www.solr-start.com/


 On 20 January 2015 at 23:08, Vishal Swaroop vishal@gmail.com wrote:
  Thanks for the response..
  a) I am trying to make it non-case-sensitive... itemName data is
  indexed in upper case
 
  b) I am looking to display the result as a type-ahead suggestion,
  which might include spaces, underscores, numbers...
 
  - ABC12DE : It does not work as soon as I type 1, i.e. ABC1
  Output expected: A, AB, ABC, ABC1... and so on
  Data can also have underscores, dashes
  - ABC_12DE : Output expected: A, AB, ABC, ABC_, ABC_1... and so on
 
  Field name & type defined in schema:
  <field name="itemName" type="text_general_edge_ngram" indexed="true"
  stored="true" multiValued="false" />
 
  <fieldType name="text_general_edge_ngram" class="solr.TextField"
  positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.LowerCaseTokenizerFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
      maxGramSize="15" side="front"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.LowerCaseTokenizerFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
      maxGramSize="15" side="front"/>
    </analyzer>
  </fieldType>
 
  On Tue, Jan 20, 2015 at 9:53 PM, Alexandre Rafalovitch 
 arafa...@gmail.com
  wrote:
 
  Were you actually trying to ...divides text at non-letters and
  converts them to lower case? Or were you trying to make it
  non-case-sensitive, which would be KeywordTokenizer and
  LowerCaseFilter?
 
  Also, normally we do not use the NGram filter on both Index and Query.
  That just makes things match on common prefixes instead of matching
  what you are searching for to a prefix of the original word.
 
  Regards,
  Alex.
  
  Sign up for my Solr resources newsletter at http://www.solr-start.com/
 
 
  On 20 January 2015 at 21:47, Vishal Swaroop vishal@gmail.com
 wrote:
   Hi,
  
   Maybe this is basic, but I am trying to understand which Tokenizer and
   Filter to use. I followed some examples as mentioned in the Solr wiki
   but type-ahead does not show the expected suggestions.
  
   Example itemName data can be :
   - ABC12DE : It does not work as soon as I type 1, i.e. ABC1
   - ABC_12DE, ABC 12DE
   - Data can also have underscores, dashes
   - I am trying ignore-case auto suggest
  
   Field name & type defined in schema:
   <field name="itemName" type="text_general_edge_ngram" indexed="true"
   stored="true" multiValued="false" />
  
   <fieldType name="text_general_edge_ngram" class="solr.TextField"
   positionIncrementGap="100">
     <analyzer type="index">
       <tokenizer class="solr.LowerCaseTokenizerFactory"/>
       <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
       maxGramSize="15" side="front"/>
     </analyzer>
     <analyzer type="query">
       <tokenizer class="solr.LowerCaseTokenizerFactory"/>
       <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
       maxGramSize="15" side="front"/>
     </analyzer>
   </fieldType>
 



Re: MultiPhraseQuery:Rewrite to BooleanQuery

2015-01-21 Thread Tomoko Uchida
Hi,

Strictly speaking, MultiPhraseQuery and BooleanQuery wrapping PhraseQuerys
are not equal.

For each query, Query.rewrite() returns a different object (with Lucene
4.10.3).
q1.rewrite(reader).toString() returns:
body:"blueberry chocolate (pie tart)", where q1 is your first multi
phrase query.
q2.rewrite(reader).toString() returns:
body:"blueberry chocolate pie" body:"blueberry chocolate tart", where
q2 is your second boolean query.
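
For reference, the two queries can be built like this with the Lucene
4.10 API (a sketch reconstructed from the rewrite output above, not
your exact code; reader is assumed to be an open IndexReader):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.MultiPhraseQuery;
import org.apache.lucene.search.PhraseQuery;

// q1: one phrase with two alternatives in the last position
MultiPhraseQuery q1 = new MultiPhraseQuery();
q1.add(new Term("body", "blueberry"));
q1.add(new Term("body", "chocolate"));
q1.add(new Term[] { new Term("body", "pie"), new Term("body", "tart") });

// q2: two plain phrases wrapped in a BooleanQuery
PhraseQuery pie = new PhraseQuery();
pie.add(new Term("body", "blueberry"));
pie.add(new Term("body", "chocolate"));
pie.add(new Term("body", "pie"));

PhraseQuery tart = new PhraseQuery();
tart.add(new Term("body", "blueberry"));
tart.add(new Term("body", "chocolate"));
tart.add(new Term("body", "tart"));

BooleanQuery q2 = new BooleanQuery();
q2.add(pie, BooleanClause.Occur.SHOULD);
q2.add(tart, BooleanClause.Occur.SHOULD);

System.out.println(q1.rewrite(reader)); // body:"blueberry chocolate (pie tart)"
System.out.println(q2.rewrite(reader)); // body:"blueberry chocolate pie" body:"blueberry chocolate tart"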

In practice... I *think* the two queries may return the same set of
documents, but I'm not sure about scoring/ranking.

I suggest you ask on the java-user@lucene mailing list about the Lucene API.

Regards,
Tomoko



2015-01-21 19:12 GMT+09:00 ku3ia dem...@gmail.com:

 Any ideas?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/MultiPhraseQuery-Rewrite-to-BooleanQuery-tp4180638p4180820.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to index data from multiple data source

2015-01-21 Thread Shawn Heisey
On 1/20/2015 10:43 PM, Yusniel Hidalgo Delgado wrote:
 I have recently been diving into Solr and I need help with the following 
 usage scenario. I am working on a project to extract and search 
 bibliographic metadata from PDF files. First, my PDF files are processed 
 to extract bibliographic metadata such as title, authors, affiliations, 
 keywords and abstract. These metadata are stored in a relational database 
 and then indexed in Solr via DIH; however, I also need to index the full 
 text of the PDFs and maintain the same ID between the indexed metadata and 
 the indexed PDF full text. How to do that? How to configure solrconfig.xml 
 and schema.xml to do it? 

How are you doing the indexing?  If it's in a program you wrote
yourself, simply extend that program to obtain the information you need
and add it to the document that you index.  The Apache Tika project is
one way to parse rich text documents.

If you are using the dataimport handler, you are likely to need a nested
entity to gather the additional information and include it in the
document that is being indexed in the parent entity. The reply from
Alvaro shows one way to integrate Tika into DIH.  It looks like those
instructions are geared to an extremely old Solr version (3.6.2) and
probably won't work as-is on a newer version.  Solr 4.x was already
available when that blog post was written two years ago, so I don't know
why they went with 3.6.2.
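
On a recent 4.x the nested entity could look roughly like the
following (a sketch only; the table and column names are made up, and
it needs the dataimporthandler-extras and Tika jars on the classpath):

<dataConfig>
  <dataSource name="db" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/metadata" user="..." password="..."/>
  <dataSource name="bin" type="BinFileDataSource"/>
  <document>
    <entity name="meta" dataSource="db"
            query="SELECT id, title, pdf_path FROM documents">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <!-- nested entity extracts the PDF body with Tika into the same document -->
      <entity name="pdf" processor="TikaEntityProcessor" dataSource="bin"
              url="${meta.pdf_path}" format="text">
        <field column="text" name="fulltext"/>
      </entity>
    </entity>
  </document>
</dataConfig>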

Thanks,
Shawn



Re: How much maximum data can we hard commit in Solr?

2015-01-21 Thread Shawn Heisey
On 1/21/2015 6:01 AM, Nitin Solanki wrote:
 How much data can we hard commit to Solr at most, without using soft
 commits?
 maxTime is 1000 in autoCommit
 
 Details explanation is on Stackoverflow
 http://stackoverflow.com/questions/28067853/how-much-maximum-data-can-we-hard-commit-in-solr

The answer to the question you asked: All of it.

I suspect you are actually trying to ask a different question.

Some additional info, hopefully you can use it to answer what you'd
really like to know:

You could build your entire index with no commits and then issue a
single hard commit and everything would work.  The problem with that
approach is that if you have the updateLog turned on, then every single
one of those documents will be reindexed from the transaction log at
Solr startup - it could take a REALLY long time.

http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup

Hard commits are the only way to close a transaction log and open a new
one.  Solr keeps enough transaction logs around so that it can re-index
a minimum of 100 documents ... but it can't break the transaction logs
into parts, so if everything is in one log, then that giant log will be
replayed on startup.

A maxTime of 1000 on autoCommit or autoSoftCommit is usually way too
low.  We find that this setting is normally driven by unrealistic
requirements from sales or marketing, who say that data must be
available within one second of indexing.  It is extremely rare for this
to be truly required.

The autoCommit settings control automatic hard commits, and
autoSoftCommit naturally controls automatic soft commits.  With a
maxTime of 1000, you will be issuing a commit every single second while
you index.  Commits are very resource-intensive operations, doing them
once a second will keep your hardware VERY busy.  Normally a commit
operation will take a lot longer than one second to complete, so if you
are starting another one a second later, they will overlap, and that can
cause a lot of problems.
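
Something like the following inside updateHandler in solrconfig.xml is
a more typical starting point (the intervals are examples to tune, not
recommendations for your exact setup):

<autoCommit>
  <maxTime>300000</maxTime>          <!-- hard commit every 5 minutes -->
  <openSearcher>false</openSearcher> <!-- truncate tlogs without changing visibility -->
</autoCommit>
<autoSoftCommit>
  <maxTime>60000</maxTime>           <!-- new documents visible within a minute -->
</autoSoftCommit>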

Thanks,
Shawn



Re: AW: AW: transactions@Solr(J)

2015-01-21 Thread Shawn Heisey
On 1/20/2015 11:42 PM, Clemens Wyss DEV wrote:
 But then what happens if:
 Autocommit is set to 10 docs
 and
 I add 11 docs and then decide (due to an exception?) to rollback.
 
 Will only one (i.e. the last added) document be rolled back?

The way I understand the low-level architecture, yes -- assuming that
all 11 documents actually got indexed.  If the exception happened
because document 5 was badly formed, only documents 1-4 will have been
indexed, and in that case, all four of them would get rolled back.
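
In SolrJ the pattern is just the following sketch (where server is an
HttpSolrServer and docs a collection of SolrInputDocuments); note that
rollback() only covers changes since the last commit, so anything an
autoCommit has already flushed stays in the index:

try {
  server.add(docs);    // may fail partway through, e.g. on a malformed doc
} catch (SolrServerException | IOException e) {
  server.rollback();   // discards only the uncommitted adds
}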

Thanks,
Shawn