Re: how often do you boys restart your tomcat?

2011-07-27 Thread Bernd Fehling
Until now I used Jetty, and 2 weeks was the longest uptime before an OOM. I just switched to tomcat6 and will see how that one behaves, but I think it's not a problem of the servlet container. Solr is pretty unstable with a huge database. Actually this can't be blamed directly on Solr; it is a

Re: how often do you boys restart your tomcat?

2011-07-27 Thread Paul Libbrecht
On curriki.org, our Solr's Tomcat saturates memory after 2-4 weeks. I am still investigating whether I am accumulating something or something else is. To check, I run a match-all query that returns only the number of results every minute and measure the time it takes. It's generally when it meets a big GC that gives

Re: how often do you boys restart your tomcat?

2011-07-27 Thread Bernd Fehling
It is definitely Lucene's fieldCache causing the trouble. Restart your Solr and monitor it with jvisualvm, especially the OldGen heap. When it is 100 percent full, use jmap to dump the heap of your system. Then use Eclipse Memory Analyzer http://www.eclipse.org/mat/ and open the heap dump. You will
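
The heap-dump step can be scripted. A minimal sketch, assuming the JDK's jmap is on the PATH and that you already know the process id of the Solr JVM; the helper names and the output file name are made up for illustration:

```python
import subprocess

def jmap_dump_command(pid, out_file="solr-heap.hprof"):
    """Build the jmap invocation that writes a binary heap dump for a JVM."""
    return ["jmap", f"-dump:format=b,file={out_file}", str(pid)]

def dump_heap(pid, out_file="solr-heap.hprof"):
    """Run jmap; raises CalledProcessError if jmap exits with an error."""
    subprocess.run(jmap_dump_command(pid, out_file), check=True)
```

The resulting .hprof file is what you would then open in Eclipse Memory Analyzer.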

Re: Conditional field values in DataImport

2011-07-27 Thread Gora Mohanty
On Wed, Jul 27, 2011 at 7:20 AM, solruser@9913 gunaranj...@yahoo.com wrote: This may be a trivial question - I am a noob :). In the dataimport of a CSV file, I am trying to assign a field based on a conditional check on another field. E.g. field name=rawLine regex=CSV-splitting-regex

Re: Different options for autocomplete/autosuggestion

2011-07-27 Thread scorpking
Hi Bell, I used autocomplete in Solr 3.1, like this: <searchComponent name="autocomplete" class="solr.SpellCheckComponent"> <lst name="spellchecker"> <str name="name">autocomplete</str> <str name="classname">org.apache.solr.spelling.suggest.Suggester</str> <str

Re: Solr vs ElasticSearch

2011-07-27 Thread Tarjei Huse
On 06/01/2011 08:22 AM, Jason Rutherglen wrote: Thanks Shashi, this is oddly coincidental with another issue being put into Solr (SOLR-2193) to help solve some of the NRT issues, the timing is impeccable. Hmm, does anyone have an idea on when this will be finished? I'm considering if I should

Problem starting solr on jetty

2011-07-27 Thread Anand.Nigam
Hi, I am new to Solr. I have downloaded the Solr 3.3.0 distribution and am trying to run it using java -jar start.jar from the apache-solr-3.3.0\example directory (start.jar is present here). But I am getting the following error on running this command:

Re: Autocomplete with Solr 3.1

2011-07-27 Thread O. Klein
I know the solution, just not how to actually implement it, but maybe somebody can help with that :) From Wiki: If you want to use a dictionary file that contains phrases (actually, strings that can be split into multiple tokens by the default QueryConverter) then define a different

Re: How to make a valid date facet query?

2011-07-27 Thread Tomás Fernández Löbbe
Hi Floyd, yes, those queries are supported. Make sure you use the right encoding for the plus sign: facet.query=onlinedate:[NOW/YEAR-3YEARS TO NOW/YEAR%2B5YEARS] the result of this facet query will be the number of documents in the result set that match that range. You'll have to use different
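
The encoding point can be checked offline. A minimal sketch using Python's standard urllib (the field name onlinedate comes from the message; the variable names are illustrative): a raw + in a URL query string is decoded as a space, so the date math only survives if the plus is sent as %2B.

```python
from urllib.parse import parse_qs, urlencode

raw = "onlinedate:[NOW/YEAR-3YEARS TO NOW/YEAR+5YEARS]"

# urlencode percent-encodes the plus sign as %2B (and spaces as +).
encoded = urlencode({"facet.query": raw})

# What a server sees after decoding a properly encoded query string.
decoded_ok = parse_qs(encoded)["facet.query"][0]

# What it sees if the raw string is pasted into the URL unencoded:
# the + before 5YEARS silently turns into a space.
decoded_bad = parse_qs("facet.query=" + raw)["facet.query"][0]
```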

what data type for geo fields?

2011-07-27 Thread Peter Wolanin
Looking at the example schema: http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/solr/example/solr/conf/schema.xml the solr.PointType field type uses double (is this just an example field, or used for geo search?), while the solr.LatLonType field uses tdouble and it's unclear

Delete by range query

2011-07-27 Thread Mohammad Shariq
Hi, I want to delete a bunch of docs from my Solr using a range query. I have one field called 'time' which is tint. I am deleting using the query: <delete><query>time:[1296777600+TO+1296778000]</query></delete> but Solr returns an error saying bad request. However, I am able to delete one by one

Re: Solr vs ElasticSearch

2011-07-27 Thread Jeff Schmidt
You might also check out Solandra: https://github.com/tjake/Solandra With Solr's configuration and indexes in Cassandra, you can benefit from replication, distribution etc., and still have Cassandra available for non-Solr specific purposes. Cheers, Jeff On Jul 27, 2011, at 5:17 AM,

Re: what data type for geo fields?

2011-07-27 Thread Yonik Seeley
On Wed, Jul 27, 2011 at 9:01 AM, Peter Wolanin peter.wola...@acquia.com wrote: Looking at the example schema: http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/solr/example/solr/conf/schema.xml the solr.PointType field type uses double (is this just an example field, or

Re: Delete by range query

2011-07-27 Thread Koji Sekiguchi
<delete><query>time:[1296777600+TO+1296778000]</query></delete> Should be <delete><query>time:[1296777600 TO 1296778000]</query></delete> ? koji -- http://www.rondhuit.com/en/
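
Koji's correction can be sketched as a small helper. This assumes the delete message is sent as an XML body to Solr's update handler, where + is not decoded as a space the way it is in a URL query string; the function name is hypothetical:

```python
import xml.etree.ElementTree as ET

def delete_by_range_xml(field, low, high):
    """Return a Solr <delete> message for an inclusive range on `field`."""
    root = ET.Element("delete")
    query = ET.SubElement(root, "query")
    # Plain spaces around TO: the body is XML, not a URL query string.
    query.text = f"{field}:[{low} TO {high}]"
    return ET.tostring(root, encoding="unicode")

message = delete_by_range_xml("time", 1296777600, 1296778000)
```

Building the element tree rather than concatenating strings also escapes any XML-special characters in the query text.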

Re: Solr vs ElasticSearch

2011-07-27 Thread Yonik Seeley
On Wed, Jul 27, 2011 at 7:17 AM, Tarjei Huse tar...@scanmine.com wrote: On 06/01/2011 08:22 AM, Jason Rutherglen wrote: Thanks Shashi, this is oddly coincidental with another issue being put into Solr (SOLR-2193) to help solve some of the NRT issues, the timing is impeccable. Hmm, does anyone

Re: Delete by range query

2011-07-27 Thread Mohammad Shariq
Thanks Koji, it's working now. On 27 July 2011 19:30, Koji Sekiguchi k...@r.email.ne.jp wrote: <delete><query>time:[1296777600+TO+1296778000]</query></delete> Should be <delete><query>time:[1296777600 TO 1296778000]</query></delete> ? koji -- http://www.rondhuit.com/en/ -- Thanks and Regards

Re: using distributed search with the suggest component

2011-07-27 Thread Tobias Rübner
Thanks, but this does not work. Looking at the log files, I see only one request when executing a search. When executing a request to the default servlet (/select) with multiple shards, each core gets asked for the current query. Any other suggestions? Tobias On Tue, Jul 26, 2011 at 2:11 PM,

RE: Problem starting solr on jetty

2011-07-27 Thread Steven A Rowe
Hi Anand, Someone else reported this exact same error with Solr v1.4.0: http://www.lucidimagination.com/search/document/fd5b83f3595a1c6c/can_t_start_solr_by_java_jar_start_jar I downloaded the apache-solr-3.3.0.zip, unpacked it, then ran 'java -jar start.jar' from the cmdline. It worked.

Why Slop doesn't match anything?

2011-07-27 Thread Alexander Ramos Jardim
Hello pals, Using Solr 1.4.0. Trying to understand something. When I run the query fieldA:"nokia c3", I get 5 results. All with nokia c3, as expected. But when I run fieldA:"nokia c3"~100, I don't get any results! As far as I understand, the ~100 should make my query bring even more results as not

Re: Dealing with keyword stuffing

2011-07-27 Thread Gora Mohanty
On Wed, Jul 27, 2011 at 7:15 PM, Pranav Prakash pra...@gmail.com wrote: I guess most of you have already handled and many of you might still be handling keyword stuffing. Here is my scenario. We have a huge index containing about 6m docs. (Not sure if that is huge :-) And every document

Re: Exact match not the first result returned

2011-07-27 Thread Brian Lamb
Thanks Emmanuel for that explanation. I implemented your solution but I'm not quite there yet. Suppose I also have a record: RECORD 3 <arr name="myname"> <str>Fred G. Anderson</str> <str>Fred Anderson</str> </arr> With your solution, RECORD 1 does appear at the top but I think that's just blind luck more

Re: Why Slop doesn't match anything?

2011-07-27 Thread Gora Mohanty
On Wed, Jul 27, 2011 at 8:38 PM, Alexander Ramos Jardim alexander.ramos.jar...@gmail.com wrote: Hello pals, Using Solr 1.4.0. Trying to understand something. When I run the query fieldA:"nokia c3", I get 5 results. All with nokia c3, as expected. But when I run fieldA:"nokia c3"~100, I don't get

Solr Master-slave master failover without data loss

2011-07-27 Thread Nagendraprasad
Suppose the master goes down immediately after index updates, before the updates have been replicated to the slaves. Data loss then seems to happen. Does Solr have any mechanism to deal with that?

Filter content upon indexing

2011-07-27 Thread Rafael Ribeiro
Hi all, I am trying to index HTML documents using Solr and I am having difficulty extracting certain parts of the main content of the document and storing them separately in other fields. I saw in the docs that it is possible to achieve this using xpath but in my particular case I need to do a

Jetty Logs - Max line size?

2011-07-27 Thread alexander sulz
Hello, I enabled Jetty logs, but my GET requests seem to be so long that they get truncated, and without a line break, so in the end it looks like this (notice the logged ping and where it begins). How can I change this? Thank you very much. 000.000.000.000 - - [27/Jul/2011:17:38:04 +0100] GET

Re: Autocomplete with Solr 3.1

2011-07-27 Thread scorpking
Hi Klein, Thanks for your reply. I tried some suggestions with Solr, and the results returned are good. But I want to use the search component with Solr 3.1. Now I have some problems with Suggester. I think my problem is perhaps in the schema file. This is the schema file: <fieldType name="text"

schema.xml changes, need re-indexing ?

2011-07-27 Thread Charles-Andre Martin
Hi, We currently have a big index in production. We would like to add 2 non-required fields to our schema.xml: <field name="myfield" type="boolean" indexed="true" stored="true" required="false"/> <field name="myotherfield" type="string" indexed="true" stored="true" required="false"

Re: Filter content upon indexing

2011-07-27 Thread Emmanuel Espina
If you can express what you want with a regular expression then the pattern filter should work! I'm thinking that maybe you tokenized the field and that invalidated the structure of the HTML. I would use a contents field analyzed with a

Data Import Handler Architecture Diagram

2011-07-27 Thread solruser@9913
Maybe I am looking at the wrong version - the diagram (and the screenshot in the interactive dev mode section) don't show up in the WIKI page. http://wiki.apache.org/solr/DataImportHandler#Architecture Is this a wrong link? I did an inspect element and this is what I see ...

Data Import Handler Diagram

2011-07-27 Thread solruser@9913
Maybe I am looking at the wrong version - the diagram (and the screenshot in the interactive dev mode section) don't show up in the WIKI page. http://wiki.apache.org/solr/DataImportHandler#Architecture Is this a wrong link? I did an inspect element and this is what I see ... ... img

RE: schema.xml changes, need re-indexing ?

2011-07-27 Thread Michael Ryan
You should be fine - no need to re-index your data. Adding and removing fields is generally safe to do without a re-index. Changing a field (its type, analyzers, etc) requires more caution and generally does require a re-index. -Michael

Re: schema.xml changes, need re-indexing ?

2011-07-27 Thread Alexei Martchenko
I believe you're fine with that. You don't need to reindex the whole Solr database. 2011/7/27 Charles-Andre Martin charles-andre.mar...@sunmedia.ca Hi, We currently have a big index in production. We would like to add 2 non-required fields to our schema.xml: <field name="myfield" type="boolean"

Re: Filter content upon indexing

2011-07-27 Thread Emmanuel Espina
I want to add that, since the stored text (not the indexed text) is not analyzed, if you retrieve the title you will get all the HTML. If you want to extract the title for storage in a separate field, that will have to be done with a different tool, not just with the analysis. My previous answer was

Indexing SharePoint from SolrJ

2011-07-27 Thread Twomey, David
Does anyone have examples of indexing SP content using the Google Connectors API and SolrJ? I know Lucid Imagination has a SharePoint connector and I have used that successfully. However, I would like to create thumbnail images of PDFs and PPT docs and add them to my index and I

Solr Performance Tuning: -XX:+AggressiveOpts

2011-07-27 Thread Fuad Efendi
Anyone tried this? I can not start Solr-Tomcat with following options on Ubuntu: JAVA_OPTS=$JAVA_OPTS -Xms2048m -Xmx2048m -Xmn256m -XX:MaxPermSize=256m JAVA_OPTS=$JAVA_OPTS -Dsolr.solr.home=/data/solr -Dfile.encoding=UTF8 -Duser.timezone=GMT

Re: Solr Performance Tuning: -XX:+AggressiveOpts

2011-07-27 Thread Robert Muir
Don't use this option, these optimizations are buggy: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134 On Wed, Jul 27, 2011 at 3:56 PM, Fuad Efendi f...@efendi.ca wrote: Anyone tried this? I can not start Solr-Tomcat with following options on Ubuntu: JAVA_OPTS=$JAVA_OPTS -Xms2048m

An idea for an intersection type of filter query

2011-07-27 Thread Shawn Heisey
I've been looking at the slow queries our Solr installation is receiving. They are dominated by queries with a simple q parameter (often *:* for all docs) and a VERY complicated fq parameter. The filter query is built by going through a set of rules for the user and putting together each

Re: schema.xml changes, need re-indexing ?

2011-07-27 Thread François Schiettecatte
I have not seen this mentioned anywhere, but I found a useful 'trick' to restart solr without having to restart tomcat. All you need to do is 'touch' the solr.xml in the solr.home directory. It can take a few seconds but solr will restart and reload any config. Cheers François On Jul 27,

Re: An idea for an intersection type of filter query

2011-07-27 Thread Shawn Heisey
On 7/27/2011 2:00 PM, Shawn Heisey wrote: I've seen a number of requests here for the ability to have multiple fq parameters ORed together. This is probably possible, but in the interests of compatibility between versions, very impractical. What if a new parameter was introduced? It could

Re: Solr Performance Tuning: -XX:+AggressiveOpts

2011-07-27 Thread Fuad Efendi
Thanks Robert!!! Submitted On 26-JUL-2011 - yesterday. This option was popular in HBase... On 11-07-27 3:58 PM, Robert Muir rcm...@gmail.com wrote: Don't use this option, these optimizations are buggy: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134 On Wed, Jul 27, 2011 at 3:56

RE: Spellcheck compounded words

2011-07-27 Thread Dyer, James
I could not reproduce the problem even with the two parameters you show below added to the default handler. I tried using this default handler with different queries with correct and incorrect terms. I made sure it would sometimes successfully create collations and other times try to create

Re: Solr Performance Tuning: -XX:+AggressiveOpts

2011-07-27 Thread Robert Muir
On Wed, Jul 27, 2011 at 4:12 PM, Fuad Efendi f...@efendi.ca wrote: Thanks Robert!!! Submitted On 26-JUL-2011 - yesterday. This option was popular in HBase... Then you should tell them also, not to use it, if they want their loops to work. -- lucidimagination.com

Re: Indexing SharePoint from SolrJ

2011-07-27 Thread Glen Newton
+1 On 7/27/11, Twomey, David david.two...@novartis.com wrote: Does anyone have examples of indexing SP content using the Google Connectors API and SolrJ? I know Lucid Imagination has a SharePoint connector and I have used that successfully. However, I would like to create a

Re: An idea for an intersection type of filter query

2011-07-27 Thread Jonathan Rochkind
I don't know the answer to feasibility either, but I'll just point out that boolean OR corresponds to set union, not set intersection. So I think you probably mean a 'union' type of filter query; 'intersection' does not seem to describe what you are describing; ordinary 'fq' values are
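
The terminology can be made concrete with plain sets. A toy sketch, where the document ids and filter names are invented for illustration and each set stands for the documents matched by one filter:

```python
# Documents matched by two hypothetical filters.
matches_rule_a = {1, 2, 3}
matches_rule_b = {3, 4}

# Several filters that must all match narrow the result: intersection (AND).
all_filters = matches_rule_a & matches_rule_b

# ORing the filters together widens the result instead: union (OR).
any_filter = matches_rule_a | matches_rule_b
```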

Re: Speeding up search by combining common sub-filters

2011-07-27 Thread Jonathan Rochkind
I'm pretty sure Solr/lucene have no such optimization already, but it's not clear to me that it would result in much of a performance benefit, just because of the way lucene works, it's not obvious to me that the second version of your query will be noticeably faster than the first version.

Re: An idea for an intersection type of filter query

2011-07-27 Thread Shawn Heisey
On 7/27/2011 3:49 PM, Jonathan Rochkind wrote: I don't know the answer to feasibility either, but I'll just point out that boolean OR corresponds to set union, not set intersection. So I think you probably mean a 'union' type of filter query; 'intersection' does not seem to describe what you

Re: schema.xml changes, need re-indexing ?

2011-07-27 Thread Alexei Martchenko
I always run http://localhost:8983/solr/admin/cores?action=RELOAD&core=corename in the browser when I wanna reload solr and see any changes in config xmls. 2011/7/27 François Schiettecatte fschietteca...@gmail.com I have not seen this mentioned anywhere, but I found a useful 'trick' to restart
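
The reload request takes two parameters, action and core, joined by &. A minimal sketch that builds the URL with the standard library so the separator and any escaping are handled for you; the helper name is hypothetical and the base URL is the default example port from the message:

```python
from urllib.parse import urlencode

def core_reload_url(core, base="http://localhost:8983/solr"):
    """Build the CoreAdmin URL that reloads one core's configuration."""
    params = urlencode({"action": "RELOAD", "core": core})
    return f"{base}/admin/cores?{params}"

url = core_reload_url("corename")
```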

colocated term stats

2011-07-27 Thread Twomey, David
Given a query term, is it possible to get from the index the top 10 collocated terms? I.e., return the top 10 terms that appear with this term, based on doc count. A plus would be to add some constraints on how near the terms are in the docs.

Re: Data Import Handler Architecture Diagram

2011-07-27 Thread Chris Hostetter
: Maybe I am looking at the wrong version - the diagram (and the screenshot in : the interactive dev mode section) don't show up in the WIKI page. : : http://wiki.apache.org/solr/DataImportHandler#Architecture : : Is this a wrong link? Ugh. a while back the Infra team disabled attachments

Re: Exact match not the first result returned

2011-07-27 Thread Chris Hostetter
: With your solution, RECORD 1 does appear at the top but I think that's just : blind luck more than anything else because RECORD 3 shows as having the same : score. So what more can I do to push RECORD 1 up to the top? Ideally, I'd : like all three records returned with RECORD 1 being the first

Re: Dealing with keyword stuffing

2011-07-27 Thread Chris Hostetter
: Presumably, they are doing this by increasing tf (term frequency), : i.e., by repeating keywords multiple times. If so, you can use a custom : similarity class that caps term frequency, and/or ensures that the scoring : increases less than linearly with tf. Please see in particular, using
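
The effect of capping tf can be illustrated numerically. A sketch only: a real custom similarity would be a Java class plugged into Solr, and the cap value here is an arbitrary assumption. The square-root curve mirrors the sub-linear tf of Lucene's default similarity.

```python
import math

def sublinear_tf(freq):
    """Sub-linear tf: grows with the square root of the raw frequency."""
    return math.sqrt(freq)

def capped_tf(freq, cap=3.0):
    """Same curve, but repetitions beyond the cap add nothing to the score."""
    return min(math.sqrt(freq), cap)
```

With this cap, a document that repeats a keyword 10000 times gets the same tf contribution as one that uses it 9 times.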

RE: Problem starting solr on jetty

2011-07-27 Thread Anand.Nigam
Thanks for your reply Steve. My environment details: Java version: 1.6.0_24 System: Microsoft Windows XP Professional Version 2002 Service Pack 3 Interestingly my colleagues who have the same environment are not facing this problem. Thanks & Regards Anand Nigam -Original Message-

Store complete XML record (DIH XPathEntityProcessor)

2011-07-27 Thread solruser@9913
I am trying to use DIH to import an XML based file with multiple XML records in it. Each record corresponds to one document in Lucene. I am using the DIH FileListEntityProcessor (to get file list) followed by the XPathEntityProcessor to create the entities. It works perfectly and I am able to

RE: Problem starting solr on jetty

2011-07-27 Thread Anand.Nigam
Hi All, I tried to debug the issue by running start.jar in the Eclipse debugger and found that the root of the issue was that the jetty.home system property was not set. If I set the jetty.home property then the server starts properly. Thanks, Anand -Original Message- From: Nigam, Anand,
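
One way to set a Java system property is a -D flag on the command line. A sketch under assumptions: the message does not say which value was used, so the directory below is only a placeholder, and the helper name is made up:

```python
def jetty_start_command(jetty_home):
    """Build a java command line that sets jetty.home before start.jar runs."""
    # -Dname=value must come before -jar to be read as a system property.
    return ["java", f"-Djetty.home={jetty_home}", "-jar", "start.jar"]

# Placeholder path: assumes the command is run from the example directory.
command = jetty_start_command(".")
```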

RE: Problem starting solr on jetty

2011-07-27 Thread Steven A Rowe
Hi Anand, Congrats! And thanks for letting us know. Steve -Original Message- From: anand.ni...@rbs.com [mailto:anand.ni...@rbs.com] Sent: Thursday, July 28, 2011 12:00 AM To: solr-user@lucene.apache.org Subject: RE: Problem starting solr on jetty Hi All, I tried to debug the issue

RE: Problem starting solr on jetty

2011-07-27 Thread Chris Hostetter
: I tried to debug the issue by running start.jar in eclipse debugger and : found that the root of the issue was that the jetty.home system property : was not set. If I set the jetty.home property then the server starts : properly. H, weird ... that still doesn't really make much sense.