Re: Filtering HTML content in Solr 4.0.0

2012-10-26 Thread Rogério Pereira Araújo

I think you will have to write an UpdateProcessor to strip out html tags.

As of Solr 4.0 you can also use scripting languages such as Python, Ruby and 
JavaScript to write update processors.
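
To give an idea of what the stripping step itself involves, here is a minimal sketch in plain Python using only the standard library. It is not Solr code: the names TagStripper and strip_html are made up for illustration, and it assumes you clean the description either on the client before sending the document or by porting the same logic into an update processor script.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects only the text content of an HTML fragment (hypothetical helper)."""

    # Elements whose content is not human-readable text
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside <script>/<style>
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse the whitespace runs left behind by removed markup
        return " ".join("".join(self._chunks).split())


def strip_html(value):
    """Return the plain text of an HTML string, with tags removed."""
    parser = TagStripper()
    parser.feed(value)
    parser.close()  # flush any buffered trailing text
    return parser.text()


print(strip_html("<p>Fast &amp; light <b>laptop</b></p>"))
```

Note that this sketch does not insert a space where a tag is removed, so adjacent block elements with no whitespace between them will run together; adjust that if your descriptions need it.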

-Original Message- 
From: Pratyul Kapoor

Sent: Friday, October 26, 2012 3:56 AM
Subject: Filtering HTML content in Solr 4.0.0


I am using Solr 4.0.0. I have HTML content as the description of a product.
If I index it without any filtering, searches return errors.
How can I filter HTML content?


Re: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7

2012-10-16 Thread Rogério Pereira Araújo

Hi Chris,

To answer your question, I tried both the -Dsolr.solr.home system property and 
the solr/home JNDI variable; in both cases I got the same result.

I checked the logs several times: Solr always loads only collection1. 
If I rename the cores in solr.xml to anything else or add more cores, 
nothing happens.

Even if I put some garbage in solr.xml by removing closing tags, no 
exception is raised.
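
For anyone who wants to confirm outside of Tomcat that a given solr.xml is well-formed and which cores it declares, a small standalone check could look like the sketch below. It assumes the Solr 4.0 style layout with core elements carrying name and instanceDir attributes; the function name list_cores and the sample content are only illustrative, not taken from my setup.

```python
import xml.etree.ElementTree as ET


def list_cores(solr_xml_text):
    """Return (name, instanceDir) for each <core> element.

    Raises xml.etree.ElementTree.ParseError if the XML is malformed,
    which is the error one would expect after removing closing tags.
    """
    root = ET.fromstring(solr_xml_text)
    return [(core.get("name"), core.get("instanceDir")) for core in root.iter("core")]


# Hypothetical sample content, for illustration only
SAMPLE = """
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="collection1/"/>
    <core name="collection2" instanceDir="collection2/"/>
  </cores>
</solr>
"""

for name, instance_dir in list_cores(SAMPLE):
    print(name, instance_dir)
```

If this script reports a parse error for a file that Solr accepts silently, that suggests Solr is not reading that file at all.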

I'm running Tomcat 7 and Solr 4 on Xubuntu 10.04, but I don't think the OS 
is the problem. I'll run the same test on other OSes.

-Original Message- 
From: Chris Hostetter

Sent: Monday, October 15, 2012 5:38 PM
Subject: Re: Multicore setup is ignored when deploying solr.war on Tomcat 

: on Tomcat I set up the system property pointing to solr/home path,
: unfortunately when I start tomcat the solr.xml is ignored and only the

Please elaborate on how exactly you pointed tomcat at your solr/home.

You mentioned a system property, but when using system properties to set
the Solr Home you want to set solr.solr.home. solr/home is the JNDI
variable name used as an alternative.
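
As a rough illustration of why the two names are not interchangeable, here is a small Python sketch of the lookup order as I understand it: JNDI solr/home first, then the solr.solr.home system property, then a default solr/ directory relative to the working directory. The function resolve_solr_home is hypothetical and only mimics that order; it is not Solr's actual code.

```python
def resolve_solr_home(jndi_value=None, system_properties=None):
    """Mimic the assumed lookup order and report which source won."""
    props = system_properties or {}
    if jndi_value:
        return jndi_value, "JNDI solr/home"
    if props.get("solr.solr.home"):
        return props["solr.solr.home"], "system property solr.solr.home"
    # Nothing usable was configured, so fall back to the default
    return "solr/", "default (relative to working directory)"


# A system property named solr/home is never consulted, so the default is used
print(resolve_solr_home(system_properties={"solr/home": "/opt/solr-home"}))
# The correctly named property is picked up
print(resolve_solr_home(system_properties={"solr.solr.home": "/opt/solr-home"}))
```

In this sketch a system property named solr/home falls through to the default, which would match the symptom of only the default core being loaded.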

If you look at the logging when Solr first starts up, you should see
several messages about how and where it's trying to locate the Solr Home dir.
Please double-check that it's finding the one you intended.

Please give us more details about those log messages related to the solr
home dir, as well as how you are trying to set it, and what your directory
structure looks like in tomcat.

If you haven't seen it yet...


Re: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7

2012-10-15 Thread Rogério Pereira Araújo

Hi Vadim,

In fact Tomcat is running from a non-standard path, and there's no old 
version deployed on Tomcat; I double-checked.

Let me try in another environment.

-Original Message- 
From: Vadim Kisselmann

Sent: Monday, October 15, 2012 6:01 AM
Subject: Re: Multicore setup is ignored when deploying solr.war on Tomcat 

Hi Rogerio,
I can imagine what it is. Tomcat extracts the war files in
If you already ran an older Solr version on your server, the old
extracted Solr war could still be there (keyword: Tomcat cache).
Delete the /var/lib/tomcatXX/webapps/solr folder and restart Tomcat;
Tomcat should then deploy your new war file.
Best regards

2012/10/14 Rogerio Pereira

I'll try to be more specific, Jack.

I just downloaded the, from this archive I took the
core1 and core2 folders from the multicore example and renamed them to
collection1 and collection2. I also made all the necessary changes to solr.xml,
solrconfig.xml and schema.xml on these two correct to reflect the new

After this step I tried to deploy the war file on Tomcat, pointing to
the directory (solr/home) where these two cores are located. solr.xml
is there, with collection1 and collection2 properly configured.

The problem is that, no matter what solr.xml contains, this file isn't
read at Tomcat startup. I tried to cause a parser error in solr.xml by
removing closing tags, but even with this change I don't get a
parser error.

I hope this is clearer now.

2012/10/14 Jack Krupansky

I can't quite parse "the same multicore deployment as we have on apache
solr 4.0 distribution archive". Could you rephrase and be more specific?
What archive?

Were you already using 4.0-ALPHA or BETA (or some snapshot of 4.0) or are
you moving from pre-4.0 to 4.0? The directory structure did change in 

Look at the example/solr directory.

-- Jack Krupansky

-Original Message- From: Rogerio Pereira
Sent: Sunday, October 14, 2012 10:01 AM
Subject: Multicore setup is ignored when deploying solr.war on Tomcat 


I tried to perform the same multicore deployment as we have on apache 
4.0 distribution archive, I created a directory for solr/home with 
inside and two subdirectories collection1 and collection2, these two 
are properly configured with conf folder and solrconfig.xml and 

On Tomcat I set up the system property pointing to the solr/home path.
Unfortunately, when I start Tomcat the solr.xml is ignored and only the
default collection1 is loaded.

As a test, I made changes on solr.xml to cause parser errors, and guess
what? These errors aren't reported on tomcat startup.

The same thing doesn't happen with the multicore example that comes in the
distribution archive; now I'm trying to figure out what's the black magic

Let me do the same kind of deployment on Windows and Mac OS X; if the
problem persists, I'll update this thread.




Rogério Pereira Araújo