Re: running solr in debug through eclipse

2014-09-18 Thread Bernd Fehling
It depends on what you are going to do. If you are adding or modifying code and JUnit tests, use JUnit test cases. If you are debugging runtime problems under load, use remote debugging. If you are going for in-depth debugging (even into Jetty and Java), use RunJettyRun for Eclipse. Regards Bernd Am

Re: FAST-like document vector data structures in Solr?

2014-09-08 Thread Bernd Fehling
>> The weight is a float value between 0 and 1, where 1 indicates the highest >> relevance. >> >> The similarity vector is created during item processing and indicates the >> most important terms or concepts in the item and the corresponding >> weight.” >> >>

anyone besides Solr also using Elasticsearch?

2014-04-08 Thread Bernd Fehling
Hi list, as the title says, is anyone besides Solr also using Elasticsearch? If so, are you: - using JSON for ES search queries? - using the sparse URI search of ES for search queries? - having your own addon/plugin for turning Solr URI queries into JSON queries for ES? - having any other combina

Re: Solr Heap, MMaps and Garbage Collection

2014-03-02 Thread Bernd Fehling
per >>> collection.Are all the caches used by solr on or off heap ? >>> >>> >>> Given this scenario where GC is the primary bottleneck what is a good >>> recommended memory settings for solr? Should i increase the heap memory >>> (that will

Re: ANNOUNCE: Apache Solr Reference Guide 4.6

2013-12-02 Thread Bernd Fehling
But it still has the error about TrimFilterFactory in it, which I reported a couple of days back. http://www.mail-archive.com/solr-user@lucene.apache.org/msg92064.html So what it needs to correct the Reference Guide is to place a note like under StopFilter somewhere under TrimFilter: "As of Solr

TrimFilterFactory and IllegalArgumentException with Solr4.6

2013-11-27 Thread Bernd Fehling
Now this is strange, while using TrimFilterFactory with attribute "updateOffsets=true" as described in http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.TrimFilterFactory and https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-TrimFilter I get "

Re: More on topic of Meta-search/Federated Search with Solr

2013-08-27 Thread Bernd Fehling
Years ago when "Federated Search" was a buzzword we did some development and testing with Lucene, FAST Search, Google and several other search engines regarding Federated Search in a library context. The results can be found here http://pub.uni-bielefeld.de/download/2516631/2516644 Some minor parts a

Re: [POLL] Who & how does use "admin-extra" ?

2013-08-08 Thread Bernd Fehling
I have a table of links to all my servers running SOLR. So I can jump from one admin page to any other server's admin page. And also a link to my monitoring server. Not very innovative but better than an empty page. So, yes I'm using it. Regards, Bernd Am 08.08.2013 00:24, schrieb Stefan Matheis:

Re: Measuring SOLR performance

2013-08-01 Thread Bernd Fehling
Yes, UseNuma is only for Parallel Scavenger garbage collector and only for Solaris 9 and higher and Linux kernel 2.6.19 and glibc 2.6.1. And it performs with 64-bit better than 32-bit. So no effects for G1. With standard applications CMS is very slightly better than G1 but when it comes to huge he

Re: swap and GC

2013-07-29 Thread Bernd Fehling
Am 29.07.2013 14:46, schrieb Michael Ryan: > This is interesting... How are you measuring the heap size? This is displayed in jvisualvm and also logged with munin via JMX. Bernd > > -Michael > > -Original Message- > From: Bernd Fehling [mailto:bernd.fehl...@uni-bie

swap and GC

2013-07-29 Thread Bernd Fehling
Something interesting I noticed today: after running my huge single index (49 million records / 137 GB index) for about a week and replicating today, the heap usage after replication did not go down as expected. Expected means if solr is started I have a heap size between 4 to 5

Re: dataconfig to index ZIP Files

2013-07-01 Thread Bernd Fehling
Try setting dataSource="null" for your toplevel entity and use filename="\.zip$" as filename selector. Am 28.06.2013 23:14, schrieb ericrs22: > unfortunately not. I had tried that before with the logs saying: > > Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: > java.
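The filename selector suggested above is an ordinary regular expression, so its behavior can be checked outside Solr. A minimal sketch in plain Java: the class and method names are invented for illustration, and the use of `find()` (a match anywhere in the name, anchored only by the `$`) is an assumption about how the pattern is applied, not something stated in the thread.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or the DataImportHandler.
final class ZipNameFilter {

    // Same expression as in the reply: a literal ".zip" at the end of the name.
    static final Pattern ZIP_SUFFIX = Pattern.compile("\\.zip$");

    // Assumption: the pattern is searched for within the name (find),
    // rather than required to cover the whole name (matches).
    static boolean isZip(String fileName) {
        return ZIP_SUFFIX.matcher(fileName).find();
    }

    public static void main(String[] args) {
        String[] names = {"archive.zip", "archive.zip.bak", "zipfile.txt", "ARCHIVE.ZIP"};
        for (String name : names) {
            System.out.println(name + " -> " + isZip(name));
        }
    }
}
```

The expression is case-sensitive as written, so upper-case extensions would need a separate pattern or a case-insensitive flag.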

Re: Book progress (Solr 4.x Deep Dive) - see my blog

2013-06-24 Thread Bernd Fehling
Am 24.06.2013 16:37, schrieb Jack Krupansky: > I won’t continue to bore annoy anybody on this list with tedious comments > about my new Solr book on Lulu.com... please bookmark my blog, > http://basetechnology.blogspot.com/, for further updates on the book. > > The book itself is here: > http:/

Re: Informal poll on running Solr 4 on Java 7 with G1GC

2013-06-20 Thread Bernd Fehling
Hi Shawn, some good information about G1 tuning, may be something for the wiki about GC tuning. Am 20.06.2013 18:01, schrieb Shawn Heisey: > On 6/20/2013 8:02 AM, John Nielsen wrote: > ... > ... When you take a look at the overall stats and the memory graph over time, > G1 looks way better. > U

Re: Informal poll on running Solr 4 on Java 7 with G1GC

2013-06-20 Thread Bernd Fehling
Am 20.06.2013 00:18, schrieb Timothy Potter: > I'm sure there's some site to do this but wanted to get a feel for > who's running Solr 4 on Java 7 with G1 gc enabled? > > Cheers, > Tim > Currently using Solr 4.2.1 in production with Oracle Java(TM) SE Runtime Environment (build 1.7.0_07-b10) an

Re: UnInverted multi-valued field

2013-06-20 Thread Bernd Fehling
you have queryResultCache set to 10 which means IF you calculate with 100 qps (which is a lot) it will cache the last 1000 seconds (if all queries are unique) which is 16.6 minutes. Is that what you want and what your system should serve? And with half of it (5) a new searcher
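The reasoning in this reply is simple arithmetic: if every query is unique, each one takes a new cache entry, so the cache covers (number of entries) / (queries per second) seconds of traffic. A minimal sketch in Java follows; the class and method names are invented for illustration. A window of 1000 seconds at 100 qps corresponds to 100,000 entries, so that size is the assumption used below; it is inferred from the figures in the message, not quoted from it.

```java
// Hypothetical helper, not part of Solr: estimates how far back in time
// a query cache reaches when every incoming query is unique.
final class CacheWindow {

    // Seconds of query history that fit into the cache.
    static double windowSeconds(long cacheEntries, double queriesPerSecond) {
        if (queriesPerSecond <= 0) {
            throw new IllegalArgumentException("queriesPerSecond must be positive");
        }
        return cacheEntries / queriesPerSecond;
    }

    public static void main(String[] args) {
        long entries = 100_000; // assumed size, inferred from "1000 seconds at 100 qps"
        double qps = 100.0;     // load figure taken from the message
        double seconds = windowSeconds(entries, qps);
        System.out.println("window in seconds: " + seconds);
        System.out.println("window in minutes: " + seconds / 60.0);
    }
}
```

Real traffic contains repeated queries, so the effective window would normally be longer than this worst-case estimate.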

Re: Solr 4.2.1 higher memory footprint vs Solr 3.5

2013-06-06 Thread Bernd Fehling
Maybe I am just lucky with it, but for big heaps it works fine. Regards Bernd Am 06.06.2013 16:23, schrieb Shawn Heisey: > On 6/6/2013 3:50 AM, Bernd Fehling wrote: >> What helped me a lot was switching to G1GC. >> Faster, smoother, very little ripple, nearly no sawtooth.

Re: Solr 4.2.1 higher memory footprint vs Solr 3.5

2013-06-06 Thread Bernd Fehling
Am 05.06.2013 18:09, schrieb SandeepM: > /So we see the jagged edge waveform which keeps climbing (GC cycles don't > completely collect memory over time). Our test has a short capture from > real traffic and we are replaying that via solrmeter./ > > Any idea why the memory climbs over time. Th

Re: different Solr Logging for CONSOLE and FILE

2013-06-05 Thread Bernd Fehling
Am 05.06.2013 11:28, schrieb Raheel Hasan: > Hi, > > I have a small question about solr logging. > > In resources>log4j.properties, we have > > *log4j.rootLogger=INFO, file, CONSOLE* > > However, what I want is: > *log4j.rootLogger=INFO, file > * > and > *log4j.rootLogger=WARN, CONSOLE* > (bo

Re: how are you handling killer queries?

2013-06-04 Thread Bernd Fehling
> also inside SolrIndexSearcher). > My understanding is that it will stop your server from consuming > unnecessary resources. > > --roman > > > On Mon, Jun 3, 2013 at 4:39 AM, Bernd Fehling < > bernd.fehl...@uni-bielefeld.de> wrote: > >> How are you handl

Re: how are you handling killer queries?

2013-06-03 Thread Bernd Fehling
er (broken pipe)? And the container has no way to simulate a "browser stop button" in case of a timeout to get a sane termination? Bernd Am 03.06.2013 16:20, schrieb Shawn Heisey: > On 6/3/2013 2:39 AM, Bernd Fehling wrote: >> How are you handling "killer queries"

how are you handling killer queries?

2013-06-03 Thread Bernd Fehling
How are you handling "killer queries" with solr? While solr/lucene (currently 4.2.1) is trying to do its best I sometimes see stupid queries in my logs, noticeable by their extremely long query time. Example: q=???+and+??+and+???+and++and+???+and+?? I even get hits for this (hits=34

Re: error while switching from log4j back to slf4j with solr 4.3

2013-05-16 Thread Bernd Fehling
Am 16.05.2013 17:19, schrieb Shawn Heisey: > On 5/16/2013 3:24 AM, Bernd Fehling wrote: >> OK, solved. >> I have now run-jetty-run with log4j running. >> Just copied log4j libs from example/lib/ext to webapp/WEB-INF/classes and >> set -Dlog4j.configuration in run-jetty

Re: error while switching from log4j back to slf4j with solr 4.3

2013-05-16 Thread Bernd Fehling
OK, solved. I have now run-jetty-run with log4j running. Just copied log4j libs from example/lib/ext to webapp/WEB-INF/classes and set -Dlog4j.configuration in run-jetty-run VM classpath. Thanks, Bernd Am 15.05.2013 16:31, schrieb Shawn Heisey: > On 5/15/2013 12:52 AM, Bernd Fehling wr

error while switching from log4j back to slf4j with solr 4.3

2013-05-14 Thread Bernd Fehling
Hi list, while I can't get solr 4.3 with run-jetty-run up and running under eclipse for debugging I tried to switch back to slf4j and followed the steps of http://wiki.apache.org/solr/SolrLogging Unfortunately eclipse bothers me with an error: The import org.apache.log4j.AppenderSkeleton cannot be

CJK question

2013-05-13 Thread Bernd Fehling
A question about CJK, how will U+3000 be handled? U+3000 belongs to "CJK Symbols and Punctuation" and is named "IDEOGRAPHIC SPACE". Is it wrong if I just map it to U+0020 (SPACE)? What is CJK Analyzer doing with U+3000? If "two CJK words" have U+3000 inside, does it mean these "two CJK words"
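The mapping asked about here can be illustrated in a few lines of plain Java. This is only a sketch of the character replacement itself, with invented class and method names; it does not show what the CJK analyzer does with U+3000, which is the open question in the message.

```java
// Hypothetical helper, not part of Lucene or Solr.
final class IdeographicSpace {

    // IDEOGRAPHIC SPACE from the "CJK Symbols and Punctuation" block.
    static final char IDEOGRAPHIC_SPACE = '\u3000';

    // Replaces every ideographic space with an ASCII space and leaves
    // all other characters untouched.
    static String mapToAsciiSpace(String text) {
        return text.replace(IDEOGRAPHIC_SPACE, ' ');
    }

    public static void main(String[] args) {
        // Two CJK words separated by an ideographic space.
        String input = "\u6771\u4EAC\u3000\u5927\u962A";
        String mapped = mapToAsciiSpace(input);
        System.out.println("tokens after mapping: " + mapped.split(" ").length);
        // Java itself classifies the ideographic space as whitespace.
        System.out.println("isWhitespace: " + Character.isWhitespace(IDEOGRAPHIC_SPACE));
    }
}
```

Whether such a mapping is appropriate depends on whether the ideographic space separates words in the indexed data, which the thread leaves open.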

Re: solr 4.2.1 still has problems with index version and index generation

2013-04-09 Thread Bernd Fehling
rom the core and then tries to get via the IndexReader the IndexCommit and then the commitData. I think I should use remote debugging on master. At least I now know that it is the master. Regards Bernd Am 09.04.2013 08:35, schrieb Bernd Fehling: > Hi Hoss, > > we don't u

Re: solr 4.2.1 still has problems with index version and index generation

2013-04-08 Thread Bernd Fehling
Hi Hoss, we don't use autoCommit and autoSoftCommit. We don't use openSearcher. We don't use transaction log. I can see it in the AdminGUI and with http://master_host:port/solr/replication?command=indexversion All files are replicated from master to slave, nothing lost. It is just that the gen/v

solr 4.2.1 still has problems with index version and index generation

2013-04-08 Thread Bernd Fehling
I know there was some effort to fix this but I must report that solr 4.2.1 still has problems with index version and index generation numbering in master/slave mode with replication. Test was: 1. installed solr 4.2.1 on master and built index from scratch 2. installed solr 4.2.1 on slave with empt

solr 4.2.1 and docValues

2013-04-08 Thread Bernd Fehling
Hi list, I want to try docValues for my facets and sorting with solr 4.2.1 and have already seen many papers, examples and source code about and around docValues, but there are still some questions. The example schema.xml has fields: and it has a comment for docValues: For popularity with do

Re: OutOfMemoryError

2013-03-25 Thread Bernd Fehling
tten in the tomcat logs as > well or will I only see it in the memory graphs? > > BR, > Arkadi > On 03/25/2013 03:50 PM, Bernd Fehling wrote: >> We use munin with jmx plugin for monitoring all server and Solr >> installations. >> (http://munin-monitoring.org/) >

Re: OutOfMemoryError

2013-03-25 Thread Bernd Fehling
d java from 6 to 7... > How exactly do you monitor the memory usage and the affect of the garbage > collector? > > > On 03/25/2013 01:18 PM, Bernd Fehling wrote: >> The of UseG1GC yes, >> but with Solr 4.x, Jetty 8.1.8 and Java HotSpot(TM) 64-Bit Server VM >> (1.

Re: OutOfMemoryError

2013-03-25 Thread Bernd Fehling
util.zip.ZipFile.open(Native Method) >>>>>> at java.util.zip.ZipFile.<init>(ZipFile.java:127) >>>>>> at java.util.zip.ZipFile.<init>(ZipFile.java:144) >>>>>> at >>>>>> org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:157) >>>> [...] >>

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-22 Thread Bernd Fehling
That issue was already with solr 4.1. http://lucene.472066.n3.nabble.com/replication-problems-with-solr4-1-td4039647.html Nice to know that it is still there in 4.2. With some luck it will make it to 4.2.1 ;-) Regards Bernd Am 21.03.2013 21:08, schrieb Uomesh: > Hi, > > I am seeing an issue af

Re: Known memory leaks in 4.0?

2013-03-15 Thread Bernd Fehling
Am 15.03.2013 12:24, schrieb Per Steffensen: > On 3/15/13 9:13 AM, Bernd Fehling wrote: >> How do you know that it is Solr and nothing else? > It is memory usage inside the Jetty/Solr JVM we monitor, so by definition it > is Solr (or Jetty, but I couldnt imagine). The lower borde

Re: Out of Memory doing a query Solr 4.2

2013-03-15 Thread Bernd Fehling
We are currently using Oracle Corporation Java HotSpot(TM) 64-Bit Server VM (1.7.0_07 23.3-b01) Runs excellently and no memory parameter tweaking is necessary. Give enough physical and JVM memory, use "-XX:+UseG1GC" and that's it. Also no "saw tooth" and GC timeouts from JVM as with earlier versi

Re: Known memory leaks in 4.0?

2013-03-15 Thread Bernd Fehling
How do you know that it is Solr and nothing else? Have you checked with MemoryAnalyzer? http://wiki.eclipse.org/index.php/MemoryAnalyzer As we are always using the most recent released version we have never seen any memory leaks with Solr so far. Regards Bernd Am 15.03.2013 08:21, schrieb Per Ste

Re: Slaves always replicate entire index & Index versions

2013-02-27 Thread Bernd Fehling
Maybe the info about index version is pulled from the repeater's "data/replication.properties" file and the content of that file is wrong. Had something similar and the only solution for me was deleting the replication.properties file. But no guarantee about this. Actually the replication is pretty mu

Re: replication problems with solr4.1

2013-02-13 Thread Bernd Fehling
OK then index generation and index version are out of count when it comes to verify that master and slave index are in sync. What else is possible? The strange thing is if master is 2 or more generations ahead of slave then it works! With your logic the slave must _always_ be one generation ahea

Re: replication problems with solr4.1

2013-02-12 Thread Bernd Fehling
to slave, more like a sync? Am 11.02.2013 09:29, schrieb Bernd Fehling: > Hi list, > > after upgrading from solr4.0 to solr4.1 and running it for two weeks now > it turns out that replication has problems and unpredictable results. > My installation is single index 41 million docs

replication problems with solr4.1

2013-02-11 Thread Bernd Fehling
Hi list, after upgrading from solr4.0 to solr4.1 and running it for two weeks now it turns out that replication has problems and unpredictable results. My installation is single index 41 million docs / 115 GB index size / 1 master / 3 slaves. - the master builds a new index from scratch once a week

Re: expert question about SolrReplication

2013-02-03 Thread Bernd Fehling
Am 02.02.2013 03:48, schrieb Yonik Seeley: > On Fri, Feb 1, 2013 at 4:13 AM, Bernd Fehling > wrote: >> A question to the experts, >> >> why is the replicated index copied from its temporary location >> (index.x) >> to the real index directory and NOT

expert question about SolrReplication

2013-02-01 Thread Bernd Fehling
A question to the experts, why is the replicated index copied from its temporary location (index.x) to the real index directory and NOT moved? Copying over 100s of gigs takes some time, moving is just changing the file system link. Also, instead of first deleting the old index, why not -

Solr4.1 changing result order FIFO to LIFO

2013-01-31 Thread Bernd Fehling
Hi list, I noticed that the result order is FIFO if documents have the same score. I think this is due to the fact that documents which are indexed later get a higher internal document ID and the output for documents with the same score starts with the lowest internal document ID and goes upward. I

thanks for solr 4.1

2013-01-29 Thread Bernd Fehling
Now this must be said, thanks for solr 4.1 (and lucene 4.1)! Great improvements compared to 4.0. After building the first 4.1 index I thought the index was broken, but had no error messages anywhere. Why I thought it was damaged? The index size went down from 167 GB (solr 4.0) to 115 GB (solr 4.

Re: jconsole over jmx - should threads be visible?

2012-12-19 Thread Bernd Fehling
Hi Shawn, actually I use munin for monitoring but just checked with jvisualvm which also runs fine for remote monitoring. You might try the following: http://www.codefactorycr.com/java-visualvm-to-profile-a-remote-server.html You have to: - generate a policy file on the server to be monitored -

Re: OutOfMemoryError | While Faceting Query

2012-12-07 Thread Bernd Fehling
Hi Uwe, sorting should be well prepared. First rough check is fieldCache. You can see it with SolrAdmin Stats. The "insanity_count" there should be 0 (zero). Only sort on fields which are prepared for sorting and make sense to be sorted. Do only faceting on fields which make sense. I've seen syst

Re: DefaultSolrParams ?

2012-12-02 Thread Bernd Fehling
Hi Hoss, my config has definitely not changed and it worked with 3.6 and 3.6.1. Yes I have a custom plugin and if q was empty with 3.6 it picked automatically q.alt from solrconfig.xml. This all was done with params.get() With 4.x this is gone due to some changes in DefaultSolrParams(?). Which is

DefaultSolrParams ?

2012-11-30 Thread Bernd Fehling
Dear list, after going from 3.6 to 4.0 I see exceptions in my logs. It turned out that somehow the "q"-parameter was empty. With 3.6 the "q.alt" in the solrconfig.xml worked as fallback but now with 4.0 I get exceptions. I use it like this: SolrParams params = req.getParams(); String q = params.
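The fallback described in this message (use "q.alt" when "q" is missing or empty) can also be made explicit in plugin code. A minimal sketch under stated assumptions: a plain `Map` stands in for `SolrParams` so the example runs without Solr on the classpath, and the class and method names are invented. It does not reproduce how Solr 3.6 or 4.0 resolve defaults internally.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the Map is a stand-in for the request parameters.
final class QueryFallback {

    // Returns "q" when it has content, otherwise the value of "q.alt"
    // (which may itself be null if no fallback is configured).
    static String effectiveQuery(Map<String, String> params) {
        String q = params.get("q");
        if (q == null || q.trim().isEmpty()) {
            return params.get("q.alt");
        }
        return q;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("q", "");        // empty query, as seen in the logs
        params.put("q.alt", "*:*"); // fallback, as it would come from solrconfig.xml
        System.out.println(effectiveQuery(params));
    }
}
```

Treating a whitespace-only "q" the same as a missing one is a design choice of this sketch; a plugin might prefer to reject such requests instead.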

Re: Multi word synonyms

2012-11-29 Thread Bernd Fehling
There are also other solutions: Multi-word synonym filter (synonym expansion) https://issues.apache.org/jira/browse/LUCENE-4499 Since Solr 3.4 I have my own solution which might be obsolete if LUCENE-4499 will be in a released version. http://www.ub.uni-bielefeld.de/~befehl/base/solr/eurovoc.html

Re: error opening index solr 4.0 with lukeall-4.0.0-ALPHA.jar

2012-11-19 Thread Bernd Fehling
I just downloaded, compiled and opened an optimized solr 4.0 index in read only without problems. Could browse through the docs, search with different analyzers, ... Looks good. Am 19.11.2012 08:49, schrieb Toke Eskildsen: > On Mon, 2012-11-19 at 08:10 +0100, Bernd Fehling wrote: >>

Re: error opening index solr 4.0 with lukeall-4.0.0-ALPHA.jar

2012-11-18 Thread Bernd Fehling
ispatchThread.java:188) > at java.awt.EventDispatchThread.run(EventDispatchThread.java:122) > o > > > any ideas? > > > I,ve created another index with lucene 4.0 and this luke open the index > well. > > thanks in advance > -- * Bernd F

Re: Out Of Memory =( Too many cores on one server?

2012-11-16 Thread Bernd Fehling
I guess you should give JVM more memory. When starting to find a good value for -Xmx I "oversized" and set it to Xmx20G and Xms20G. Then I monitored the system and saw that JVM is between 5G and 10G (java7 with G1 GC). Now it is finally set to Xmx11G and Xms11G for my system with 1 core and 38 m

Re: Multiword synonym and query expansion

2012-10-17 Thread Bernd Fehling
Have a look at the report about EuroVoc integration into Solr which gives you an idea about the problems and solutions with multiword synonyms and query expansion. http://www.ub.uni-bielefeld.de/~befehl/base/solr/eurovoc.html Regards Bernd Fehling Am 18.10.2012 02:36, schrieb Nicholas Ding

differences of LockFactory between solr 3.6.1 and 4.0.0?

2012-10-17 Thread Bernd Fehling
Hi list, while checking the runtime behavior of solr 4.0.0 I noticed that the handling of write.lock seems to be different. With solr 3.6.1 after calling optimize the index is optimized and write.lock removed. This tells me everything is flushed to disk and it's safe to copy the index. With so

Re: exception when starting single instance solr-4.0.0

2012-10-16 Thread Bernd Fehling
l the paths with a regex or something > and see if something jumps out and backtrack. > > I really, really, _hate_ having to deal with this kind of thing > > Best > Erick > > On Tue, Oct 16, 2012 at 2:12 AM, Bernd Fehling > wrote: >> The solr home dir is as s

Re: exception when starting single instance solr-4.0.0

2012-10-15 Thread Bernd Fehling
The solr home dir is as suggested for solr 4.0 to be located below jetty. So my directory structure is: /srv/www/solr/solr-4.0.0/ -- dist ** has all apache solr and lucene libs not in .war -- lib** has all other libs not in .war and not in dist, but required -- jetty ** the jetty copied from

exception when starting single instance solr-4.0.0

2012-10-15 Thread Bernd Fehling
Hi, while starting solr-4.0.0 I get the following exception: SEVERE: null:java.lang.IllegalAccessError: class org.apache.lucene.codecs.lucene3x.PreFlexRWPostingsFormat cannot access its superclass org.apache.lucene.codecs.lucene3x.Lucene3xPostingsFormat Very strange, because some lines earlier i

Re: Query foreign language "synonyms" / words of equivalent meaning?

2012-10-09 Thread Bernd Fehling
'd need to include foreign language > versions of (not to mention needing to know which languages to include). > This isn't trivial either. > > I'm assuming there's no built-in functionality that supports the foreign > language translation on the fly, so what do people

Re: Synonyms Phrase not working

2012-10-01 Thread Bernd Fehling
uery: > > > +DisjunctionMaxQuery(((produto_nome:preservativo produto_nome:vaselina > produto_nome:viagra produto_nome:lubrificante intimo) )) > > > I Dont undersant why it brings no results. > Any ideas? > > > > > -- > View this messag

Re: what happens with slave during replication?

2012-09-23 Thread Bernd Fehling
Hi Amanda, we don't use solr cloud yet, just 3 dedicated servers. When it comes to distribution the choice will be either solr cloud or elastic search. But currently we use unix shell scripts with ssh for switching. Easy, simple, stable :-) Regards, Bernd Am 21.09.2012 16:03, schrieb yangqian_nj

Re: SOLR memory usage jump in JVM

2012-09-20 Thread Bernd Fehling
iteup about GC and memory in Solr/Lucene: > > http://searchhub.org/dev/2011/03/27/garbage-collection-bootcamp-1-0/ > > Best > Erick > > On Thu, Sep 20, 2012 at 5:49 AM, Robert Muir wrote: >> On Thu, Sep 20, 2012 at 3:09 AM, Bernd Fehling >> wrote: >> >>

Re: what happens with slave during replication?

2012-09-20 Thread Bernd Fehling
Hi Alex, during replication the slave is still available and serving requests but as you can imagine the responses will be slower because of disk usage, even with 15k rpm disks. We have one master and two slaves. Master only for indexing, slaves for searching. Only one slave is online the other i

Re: SOLR memory usage jump in JVM

2012-09-20 Thread Bernd Fehling
That is the problem with a JVM, it is a virtual machine. Ask 10 experts about good JVM settings and you get 15 answers. Maybe that is the trade-off for the flexibility of JVMs. There is always a right setting for any application running on a JVM but you just have to find it. How about a Solr Wiki page abo

Re: SOLR memory usage jump in JVM

2012-09-18 Thread Bernd Fehling
ue, Sep 18, 2012 at 3:09 AM, Bernd Fehling > wrote: >> Hi Otis, >> >> not really a problem because I have plenty of memory ;-) >> -Xmx25g -Xms25g -Xmn6g > > Good. > >> I'm just interested into this. >> Can you report similar jumps within JVM w

Re: SOLR memory usage jump in JVM

2012-09-18 Thread Bernd Fehling
- Original Message - > | From: "Yonik Seeley" > | To: solr-user@lucene.apache.org > | Sent: Tuesday, September 18, 2012 7:38:41 AM > | Subject: Re: SOLR memory usage jump in JVM > | > | On Tue, Sep 18, 2012 at 7:45 AM, Bernd Fehling > | wrote: > | > I use

Re: SOLR memory usage jump in JVM

2012-09-18 Thread Bernd Fehling
measure after that. > > Uwe has an interesting blog about memory, he recommends using as > little as possible, > see: > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html > > Best > Erick > > On Tue, Sep 18, 2012 at 3:09 AM, Bernd Fehli

Re: SOLR memory usage jump in JVM

2012-09-18 Thread Bernd Fehling
ch-analytics/index.html > Performance Monitoring - http://sematext.com/spm/index.html > > > On Tue, Sep 18, 2012 at 2:50 AM, Bernd Fehling > wrote: >> Hi list, >> >> while monitoring my systems I see a jump in memory consumption in JVM >> after 2 to 5 days of running o

SOLR memory usage jump in JVM

2012-09-17 Thread Bernd Fehling
Hi list, while monitoring my systems I see a jump in memory consumption in JVM after 2 to 5 days of running of about 5GB. After starting the system (search node only, no replication during search) SOLR uses between 6.5GB to 10.3GB of JVM when idle. If the search node is online and serves requests

Re: Solr - Add Single node from XPathEntityProcessor in multiple fields

2012-09-13 Thread Bernd Fehling
>> /> >>> >>> >>> but by doing this way , only last mentioned filed (in this case >>> 'article_time') get inserted in solr index , but no data inserted for >>> article_time_DT field. >>> >>> Can any body please suggest

Re: Website (crawler for) indexing

2012-09-10 Thread Bernd Fehling
Some months ago I tested YaCy, which works pretty well. http://yacy.net/en/ You can install it as stand-alone and set up your own crawler (single or cluster). Very nice admin and control surface. After installation disable the internal database and enable the feed to SOLR, that's it. Regards,

Re: display SOLR Query in web page

2012-08-22 Thread Bernd Fehling
>> Ouch, not to mention the potential for XSS. >> >> I'll see if I can get in touch with someone. >> >> Michael Della Bitta >> >> >> Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017 >>

display SOLR Query in web page

2012-08-22 Thread Bernd Fehling
Now this is very scary, while searching for "solr direct access per docid" I got a hit from US Homeland Security Digital Library. Interested in what they have to tell me about my search I clicked on the link to the page. At first the page had nothing unusual about it, but why did I get the hit? http://

Re: too many instances of "org.tartarus.snowball.Among" in the heap

2012-07-27 Thread Bernd Fehling
StringBuilder > 43: 424801 20390448 > org.apache.lucene.analysis.miscellaneous.WordDelimiterIterator > 44: 424801 20390448 > org.apache.lucene.analysis.core.StopFilter > 45: 424801 20390448 > org.apache.lucene.analysis.miscellaneous.KeywordMarkerFilter > 46: 424801 20390448

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Bernd Fehling
+1 What would happen if ALL external projects using lucene and/or solr announced on this list that they have stepped up to the next higher release after a release change? Also "Realtime NRT", if NRT stands for "Near_Real_Time" he has a "Realtime Near_Real_Time" Algorithm. Regards, Bernd Am

Re: Solr Monitoring Tool

2012-07-20 Thread Bernd Fehling
Hi, I started with SysUsage http://sysusage.darold.net/ which grabs all system activities using unix Sar and system commands. Pretty easy and simple. I also tried zabbix. Very powerful but for me too much to configure. Have now munin 2.0.2 installed for testing. Needs some perl knowledge to get

DIH is doubling field entries

2012-07-19 Thread Bernd Fehling
While porting from 3.6.1 to 4.x I noticed the doubling content of some fields in my index. Didn't have this with 3.6.1. This can also be seen with luke. I could trace it down to DIH so far. Anyone seen this? I'm using XPathEntityProcessor with RegexTransformer. Will look into this closer tomorro

change of API Javadoc interface functionality in 4.0.x

2012-07-18 Thread Bernd Fehling
Dear developers, while upgrading from 3.6.x to 4.x I have to rewrite some of my code and search for the new methods and/or classes. In 3.6.x and older versions the API Javadoc interface had an "Index" which made it easy to find the appropriate methods. The button to call the "Index" was located in

leaks in solr

2012-06-29 Thread Bernd Fehling
Hi list, while monitoring my solr 3.6.1 installation I noticed an increase of memory usage in OldGen JVM heap on my slave. I decided to force Full GC from jvisualvm and send optimize to the already optimized slave index. Normally this helps because I have monitored this issue in the past. Bu

Re: Issues with whitespace tokenization in QueryParser

2012-06-11 Thread Bernd Fehling
Because in many cases we use multi-term search together with synonyms as thesaurus we had to develop a solution for this. There is a whole chain of pitfalls through the system and you have to be careful. The thesaurus (synonym.txt) solves not only single-terms to multi-terms but also multi-terms t

defaultSearchField and param df are messed up in 3.6.x

2012-06-08 Thread Bernd Fehling
Unfortunately I have to note that defaultSearchField and param df are pretty much messed up in solr 3.6.x Yes, I have seen issue SOLR-2724 and SOLR-3292. So if defaultSearchField has been removed (deprecated) from schema.xml then why are there still calls to "org.apache.solr.schema.IndexSchema.getDefau

Re: Multi-words synonyms matching

2012-06-05 Thread Bernd Fehling
ord synonyms work better if you use LUCENE_33 is because > then Solr uses the SlowSynonymFilter instead of SynonymFilterFactory > (FSTSynonymFilterFactory). > > But I don't know if the difference between them is a bug or not. Maybe > someone has more insight? > >

Re: Multi-words synonyms matching

2012-05-31 Thread Bernd Fehling
Are you sure about LUCENE_33 (Use of BitVector)? Am 31.05.2012 17:20, schrieb O. Klein: > I have been struggling with this as well and found that using LUCENE_33 gives > the best results. > > But as it will be deprecated this is no everlasting solution. May somebody > knows one? >

Re: Multi-words synonyms matching

2012-05-29 Thread Bernd Fehling
otel de ville > > mairie => mairie, hotel\ de\ ville > > but nothing prevents mairie from matching with "hotel"... > > The only way I found is to use > tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms declaration > in schema.xml, bu

Re: Multi-words synonyms matching

2012-05-15 Thread Bernd Fehling
>>>>>>> --Jeevanandam >>>>>>>>> >>>>>>>>> On Apr 11, 2012, at 12:30 PM, elisabeth benoit wrote: >>>>>>>>> >>>>>>>>>> <' mapping instead? Something >>>>>>>>>>

debugging junit test with eclipse

2012-04-24 Thread Bernd Fehling
I have tried all hints from the internet for debugging a junit test of solr 3.6 under eclipse but didn't succeed. eclipse and everything is running, compiling, debugging with runjettyrun. Tests have no errors. Ant from command line is also running with ivy, e.g. ant -Dtestmethod=testUserFields -Dtest

Problems with edismax parser and solr3.6

2012-04-18 Thread Bernd Fehling
I just looked through my logs of solr 3.6 and saw several "0 hits" which were not seen with solr 3.5. While tracing this down it turned out that edismax doesn't like queries of type "...&q=(text:ide)&..." any more. If there are parentheses around the query term, edismax fails with solr 3.6. Can anyone

HowTo getDefaultOperator with solr3.6?

2012-04-16 Thread Bernd Fehling
I'm trying to get the default operator of a schema in solr 3.6 but unfortunately everything is deprecated. The API solr 3.6 says: getQueryParserDefaultOperator() - Method in class org.apache.solr.schema.IndexSchema Deprecated. use getSolrQueryParser().getDefaultOperator() getSolrQueryP

Re: Lexical analysis tools for German language data

2012-04-12 Thread Bernd Fehling
Paul, nearly two years ago I requested an evaluation license and tested BASIS Tech Rosette for Lucene & Solr. It worked excellently, but the price was much, much too high. Yes, they also have compound analysis for several languages including German. Just configure your pipeline in solr and set up the pr

Re: Lexical analysis tools for German language data

2012-04-12 Thread Bernd Fehling
You might have a look at: http://www.basistech.com/lucene/ On 12.04.2012 11:52, Michael Ludwig wrote: > Given an input of "Windjacke" (probably "wind jacket" in English), I'd > like the code that prepares the data for the index (tokenizer etc) to > understand that this is a "Jacke" ("jacket")

Re: solr 3.5 taking long to index

2012-04-11 Thread Bernd Fehling
There were some changes in solrconfig.xml between solr3.1 and solr3.5. Always read CHANGES.txt when switching to a new version. Also helpful is comparing both versions of solrconfig.xml from the examples. Are you sure you need a MaxPermSize of 5g? Use jvisualvm to see what you really need. This i

Re: [Announce] Solr 4.0 with RankingAlgorithm 1.4.1, NRT now supports both RankingAlgorithm and Lucene

2012-03-29 Thread Bernd Fehling
Nothing against RankingAlgorithm and your work, which sounds great, but I think that YOUR "Solr 4.0" might confuse some Solr users and/or newbies. As far as I know the next official release will be 3.6. So is your "Solr 4.0" a trunk snapshot or what? If so, which revision number? Or have you do

CLOSE_WAIT connections

2012-03-27 Thread Bernd Fehling
Hi list, I have looked into the CLOSE_WAIT problem and created an issue with a patch to fix this. A search for CLOSE_WAIT shows that there are many Apache projects hit by this problem. https://issues.apache.org/jira/browse/SOLR-3280 Can someone recheck the patch (it belongs to SnapPuller) and

Re: [SoldCloud] leaking file descriptors

2012-03-01 Thread Bernd Fehling
What does netstat tell you about the connections on the servers? Are any connections hanging in "CLOSE_WAIT" (passive close)? I saw this on my servers last week. I used a small program to spoof a local connection on those servers' ports and was able to trick the TCP stack into closing those connections. It a
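The check suggested in this message, looking for connections stuck in CLOSE_WAIT, could be sketched roughly as follows. This is only an illustration: it assumes Linux-style `netstat -tan` output in which the state is the last column, and the function name `count_tcp_states` and the sample lines are made up for the example, not taken from the thread.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connection states in `netstat -tan`-style text.

    Assumption: each connection line starts with "tcp" and its last
    whitespace-separated column is the state (Linux net-tools layout).
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP connection line.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states


# Illustrative sample, not real server output.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address    Foreign Address   State
tcp        0      0 10.0.0.1:8983    10.0.0.2:40112    ESTABLISHED
tcp        1      0 10.0.0.1:8983    10.0.0.3:51234    CLOSE_WAIT
tcp        1      0 10.0.0.1:8983    10.0.0.3:51240    CLOSE_WAIT
"""

if __name__ == "__main__":
    print(count_tcp_states(SAMPLE)["CLOSE_WAIT"])
```

A count that keeps growing between runs would point to sockets that the remote side has closed but the local application never released.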

Re: need to support bi-directional synonyms

2012-02-22 Thread Bernd Fehling
Use: sprayer, washer (see http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory) Regards Bernd On 23.02.2012 07:03, remi tassing wrote: Same question here... On Wednesday, February 22, 2012, geeky2 wrote: hello all, i need to support the following: if the user
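To show why a plain comma-separated line such as "sprayer, washer" is bi-directional while an "=>" rule is not, here is a small sketch of how the synonyms file format can be read. It only mimics the documented format (comma-separated groups, "=>" explicit mappings, the `expand` option); it is not Solr's implementation, the function name `parse_synonym_rules` is hypothetical, and multi-word terms and escaping are ignored.

```python
def parse_synonym_rules(lines, expand=True):
    """Build a term -> replacement-list mapping from synonyms.txt-style rules.

    Illustrative only: this mimics the documented file format,
    it is not Solr's own parser.
    """
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: only the left-hand terms are rewritten.
            left, right = line.split("=>", 1)
            sources = [t.strip() for t in left.split(",") if t.strip()]
            targets = [t.strip() for t in right.split(",") if t.strip()]
        else:
            # Equivalent group: with expand=True every term maps to the
            # whole group; otherwise every term collapses to the first one.
            sources = [t.strip() for t in line.split(",") if t.strip()]
            targets = sources if expand else sources[:1]
        for term in sources:
            merged = mapping.setdefault(term, [])
            for target in targets:
                if target not in merged:
                    merged.append(target)
    return mapping


if __name__ == "__main__":
    print(parse_synonym_rules(["sprayer, washer"]))
    print(parse_synonym_rules(["sprayer => washer"]))
```

With the comma form both "sprayer" and "washer" appear as keys, so a query for either term reaches the other; with the "=>" form only "sprayer" is rewritten.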

Re: usage of /etc/jetty.xml when debugging Solr in Eclipse

2012-02-08 Thread Bernd Fehling
Hi, run-jetty-run issue #9: ... In the VM Arguments of your launch configuration set -Drjrxml=./jetty.xml If jetty.xml is in the root of your project it will be used (you can also use a fully qualified path name). The UI port, context and WebApp dir are ignored, since you can define them in j

SOLVED: SolrException with branch_3x

2012-01-31 Thread Bernd Fehling
After changing the lines suggested below and recompiling, branch_3x runs fine now. The SolrException is gone. Regards, Bernd On 31.01.2012 14:21, Bernd Fehling wrote: On January 11th I downloaded branch_3x with svn into eclipse (indigo). Compiled and tested it without problems. Today I updated

SolrException with branch_3x

2012-01-31 Thread Bernd Fehling
On January 11th I downloaded branch_3x with svn into Eclipse (Indigo). I compiled and tested it without problems. Today I updated my branch_3x from the repository. It compiled fine, but I now get a SolrException on startup. Jan 31, 2012 1:50:15 PM org.apache.solr.core.SolrCore initListeners INFO: [] Added S

Re: Synonym configuration not working?

2012-01-15 Thread Bernd Fehling
Yes and no. If you use the synonyms functionality out of the box, you have to do it at index time. But if you use it at query time, like we do, you have to do some programming. We have connected a thesaurus which actually uses the synonyms functionality at query time. There are some pitfalls to take care

Re: exception while loading with DIH multi-threaded

2012-01-11 Thread Bernd Fehling
Hi Mikhail, thanks for pointing me to the issue. Regards, Bernd On 11.01.2012 21:47, Mikhail Khludnev wrote: FYI, it's https://issues.apache.org/jira/browse/SOLR-2804 I'm trying to address it. On Wed, Jan 11, 2012 at 5:49 PM, Bernd Fehling< bernd.fehl...@uni-bielefeld.de>
