Jetty, Tomcat or JBoss?

2010-04-17 Thread Andrea Gazzarini
Hi all, I have a web application which is basically a (user) search interface towards SOLR. My index is something like 7GB and has a lot of records, so apart from other things like optimizing the SOLR schema, config, clustering etc... I'd like to keep the SOLR installation as light as possible. At the moment

Re: Jetty, Tomcat or JBoss?

2010-04-17 Thread Lukáš Vlček
Hi, maybe you should be aware that JBoss AS uses Tomcat as its web container (with a modified classloader), so if your web application is running inside JBoss AS then it is in fact running in Tomcat. I don't think Solr uses JEE technologies provided by JEE Application server (JMS, Transaction

Re: Jetty, Tomcat or JBoss?

2010-04-17 Thread Abdelhamid ABID
Solr does use JEE web components. On 4/17/10, Lukáš Vlček lukas.vl...@gmail.com wrote: Hi, maybe you should be aware that JBoss AS uses Tomcat as its web container (with a modified classloader), so if your web application is running inside JBoss AS then it is in fact running in Tomcat. I

Re: HTMLStripCharFilterFactory configuration problem

2010-04-17 Thread Ranveer Kumar
Hi Sven, Thanks for the reply, but how will I get the stored value instead of the indexed value? Where do I need to configure this to get the stored value instead of the indexed value? Please help. Thanks, with regards On Wed, Apr 14, 2010 at 3:16 PM, Sven Maurmann sven.maurm...@kippdata.de wrote: Hi, please note

Re: HTMLStripCharFilterFactory configuration problem

2010-04-17 Thread Ahmet Arslan
Thanks for the reply, but how will I get the stored value instead of the indexed value? Where do I need to configure this to get the stored value instead of the indexed value? Please help. You need to remove HTML tags before the analysis (charfilter, tokenizer, tokenfilter) phase. For example, if you are using DIH

Facet count problem

2010-04-17 Thread Ranveer Kumar
Hi, I am having a problem getting the facet result count. I must be wrong somewhere. I get the proper result count when searching for a single word, but when searching for a multi-word string the result count becomes wrong. For example, search keyword: Bagdad bomb blast. I am getting 5 result count for facet

Re: HTMLStripCharFilterFactory configuration problem

2010-04-17 Thread Ranveer Kumar
Thanks. Actually, I am using the SolrJ client. Is there any way to do the same using SolrJ? Thanks On Sat, Apr 17, 2010 at 8:06 PM, Ahmet Arslan iori...@yahoo.com wrote: Thanks for reply.. but how will I get the stored value instead of indexed value.. where I need to configure to get stored

Re: HTMLStripCharFilterFactory configuration problem

2010-04-17 Thread Ahmet Arslan
Actually, I am using the SolrJ client. Is there any way to do the same using SolrJ? Thanks If you are using Java, life is easier. You can use this static function before adding a field to SolrInputDocument. static String stripHTMLX(String value) { StringBuilder out = new StringBuilder();

Re: Facet count problem

2010-04-17 Thread Ahmet Arslan
I am having a problem getting the facet result count. I must be wrong somewhere. I get the proper result count when searching for a single word, but when searching for a multi-word string the result count becomes wrong. For example, search keyword: Bagdad bomb blast. I am getting 5 result count for facet

Re: run in background

2010-04-17 Thread Chris Hostetter
Better yet: run your servlet container as a daemon (server) process, and not just as something you execute manually as a user. The java -jar start.jar command is provided only to make it really easy for people to try out the Solr example directly from a release on their local dev box -- it is

Solr Schema Question

2010-04-17 Thread Serdar Sahin
Hi, I am rather new to Solr and have a question. We have around 200.000 txt files which are placed into the file cloud. The file path is something similar to this: file/97/8f/840/fa4-1.txt file/a6/9d/ab0/ca2-2.txt etc. and we also store the metadata (like title, description, tags etc) about

Re: Tree Component for much Categories.

2010-04-17 Thread Lance Norskog
No. Solr, in database terms, stores one table with a very complex SELECT operator. If, for a document, you can do a recursive search in the DB and pull the entire tree of categories for that document, you can store that entire tree with the document and be able to search for any of the categories. On
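
The idea of storing the whole category chain with each document could be sketched roughly as follows. This is only an illustration, not code from the thread: the class `CategoryPath`, the method `ancestorsAndSelf`, the child-to-parent map, and the sample category names are all hypothetical, and the map stands in for whatever recursive lookup the database provides. Each returned value would then be added to a multi-valued field on the document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CategoryPath {

    /**
     * Follows child -> parent links and returns the chain from the root
     * category down to the given category (inclusive).
     * The parentOf map is a hypothetical stand-in for a recursive DB lookup.
     */
    public static List<String> ancestorsAndSelf(Map<String, String> parentOf, String category) {
        List<String> path = new ArrayList<>();
        String current = category;
        // Stop at the root (no parent) or if a cycle is detected.
        while (current != null && !path.contains(current)) {
            path.add(current);
            current = parentOf.get(current);
        }
        Collections.reverse(path); // root first, leaf last
        return path;
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("laptops", "computers");
        parentOf.put("computers", "electronics");

        // Every entry in this list would be stored with the document,
        // so a search on any ancestor category also matches it.
        System.out.println(ancestorsAndSelf(parentOf, "laptops"));
        // prints [electronics, computers, laptops]
    }
}
```

Storing every ancestor trades a little index size for the ability to match a document by any level of its category tree with a plain field query.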

Re: Solr Schema Question

2010-04-17 Thread Sascha Szott
Hi Serdar, take a look at Solr's DataImportHandler: http://wiki.apache.org/solr/DataImportHandler Best, Sascha Serdar Sahin wrote: Hi, I am rather new to Solr and have a question. We have around 200.000 txt files which are placed into the file cloud. The file path is something similar to

Re: dismax and date boosts

2010-04-17 Thread Lance Norskog
No, a copyField will not do the translation from (seconds from epoch) to (milliseconds from 1/1/1970). You should be able to do this with a combination of functions in your database SELECT call. The major DBs all have a wealth of functions that transform between numbers and dates. The DIH is smart about
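
The reply recommends doing this conversion in the database SELECT. For clients that build documents in Java instead, the same arithmetic might look like the sketch below. It is an alternative to the suggested approach, not the method from the thread, and it assumes the source value is a Unix timestamp in seconds; the class and method names (`EpochConvert`, `toMillis`, `toIsoUtc`) are made up for illustration.

```java
import java.time.Instant;

public class EpochConvert {

    /** Converts seconds since 1970-01-01T00:00:00Z to milliseconds, failing on overflow. */
    public static long toMillis(long epochSeconds) {
        return Math.multiplyExact(epochSeconds, 1000L);
    }

    /** Renders the same instant as an ISO-8601 UTC string such as 2010-04-17T00:00:00Z. */
    public static String toIsoUtc(long epochSeconds) {
        return Instant.ofEpochSecond(epochSeconds).toString();
    }

    public static void main(String[] args) {
        long seconds = 1271462400L; // midnight UTC on 2010-04-17
        System.out.println(toMillis(seconds)); // prints 1271462400000
        System.out.println(toIsoUtc(seconds)); // prints 2010-04-17T00:00:00Z
    }
}
```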

Re: Solr Schema Question

2010-04-17 Thread Ahmet Arslan
I am rather new to Solr and have a question. We have around 200.000 txt files which are placed into the file cloud. The file path is something similar to this: file/97/8f/840/fa4-1.txt file/a6/9d/ab0/ca2-2.txt etc. and we also store the metadata (like title, description, tags etc)

Re: Kinda-sorta realtime?

2010-04-17 Thread Lance Norskog
I've never seen mention of a 1/2 second guarantee in any Lucene project. Are there any such projects? You can get a 30-second garbage collection pause with 8-16 GB of RAM. On 4/16/10, Peter Sturge peter.stu...@googlemail.com wrote: Hi Don, We've got a similar requirement in our environment -

Re: Solr Index Lock Issue

2010-04-17 Thread Lance Norskog
The commit call can return before Solr is completely finished with everything it does after a commit. This tail processing can build up with successive commits until Solr is basically stuck. Commit should use waitFlush=true. This makes the commit call wait until the disk manipulation is done. If

Re: Solr Schema Question

2010-04-17 Thread Lance Norskog
The DataImportHandler can let you fetch the file name from the database record, and then load the file as a field and process the text with Tika. It will not be easy :) but it is possible. http://wiki.apache.org/solr/DataImportHandler On 4/17/10, Serdar Sahin anlamar...@gmail.com wrote: Hi,

Re: Solr Schema Question

2010-04-17 Thread Lance Norskog
Man you people are fast! There is a bug in Solr/Lucene. It keeps memory around from previous fields, so giant text files might run out of memory when they should not. This bug is fixed in the trunk. On 4/17/10, Lance Norskog goks...@gmail.com wrote: The DataImportHandler can let you fetch the

Re: DIH dataimport.properties with

2010-04-17 Thread Lance Norskog
The SolrEntityProcessor allows you to query a Solr instance and use the results as DIH properties. You would have to create your own regular query to do the delta-import instead of using the delta-import feature. https://issues.apache.org/jira/browse/SOLR-1499 On 4/16/10, Otis Gospodnetic

geometric distance

2010-04-17 Thread Dennis Gearon
How does solr/lucene do geometric distances? Does it use a GEOS point datum, or two columns one for latitude, one for longitude? Dennis Gearon

Re: Kinda-sorta realtime?

2010-04-17 Thread Don Werve
2010/4/18 Lance Norskog goks...@gmail.com I've never seen mention of a 1/2 second guarantee in any Lucene project. Are there any such projects? Likely not; I think my goal is better stated as wanting an average commit time of less than 500ms - 1s, which fits in pretty well with the 'near

RE: Solr Index Lock Issue

2010-04-17 Thread Sethi, Parampreet
Hi Otis, Thanks. I cleared the data folder and restarted the Solr server. We trigger solr ingestion through Java. We tried removing all incremental solr.commit() after every batch, and added a commit command using curl in the shell script file after the ingestion finishes. Content of .sh file