DIH stopping an action
Hey there, I would like to know if there is any way to stop a delta-import or a full-import in the middle of the execution and free Tomcat's memory. If not... is there any way to tell Solr to stop all actions and free all the memory it is using? Is it possible to do either of these things without restarting Tomcat? Thanks in advance. -- View this message in context: http://www.nabble.com/DIH-stopping-an-action-tp21805669p21805669.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: DIH stopping an action
On Tue, Feb 3, 2009 at 2:01 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I would like to know if there is any way to stop a delta-import or a full-import in the middle of the execution and free Tomcat's memory. There is an 'abort' command for DIH which should do what you want. Most of the DIH related objects should go out of scope once the import is aborted. Then it is up to the garbage collector to free the memory. -- Regards, Shalin Shekhar Mangar.
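For reference, DIH commands are issued as plain HTTP requests to the handler; assuming the handler is registered at the conventional /dataimport path (host and port below are examples), the abort and the imports it interrupts look like:

```text
http://localhost:8983/solr/dataimport?command=abort
http://localhost:8983/solr/dataimport?command=full-import
http://localhost:8983/solr/dataimport?command=delta-import
```

A `command=status` request against the same path should report whether the import is still running after the abort.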
Re: DIH stopping an action
Thanks, that's exactly what I need. Shalin Shekhar Mangar wrote: On Tue, Feb 3, 2009 at 2:01 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I would like to know if there is any way to stop a delta-import or a full-import in the middle of the execution and free Tomcat's memory. There is an 'abort' command for DIH which should do what you want. Most of the DIH related objects should go out of scope once the import is aborted. Then it is up to the garbage collector to free the memory. -- Regards, Shalin Shekhar Mangar.
Re: DIH stopping an action
I just opened an issue explaining my solution: https://issues.apache.org/jira/browse/SOLR-1004 Shalin Shekhar Mangar wrote: On Tue, Feb 3, 2009 at 4:06 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Doing that, once a doc is aborted in DocBuilder, it will not keep checking all the other docs and the abort will finish soon. I think it could be done in the function deleteAll(deletedKeys) as well, in case the amount of docs to delete is huge. Does doing that have any bad consequence? If not... do you think it would be useful to add it to DataImportHandler for other use cases? You are right Marc. We should be getting out of that loop (in buildDocument as well as in collectDelta) if abort is called. Can you please raise an issue in Jira? -- Regards, Shalin Shekhar Mangar.
Problem with setting solr.solr.home property
Hi, Till now I was working with the jetty server bundled with the SOLR distribution. But I want to deploy solr.war to another jetty server. Here I am facing some problem with solr/home. Whenever I start the jetty server, I get the following error -

2009-02-03 17:45:48.900::INFO: Extract jar:file:/C:/jetty-6.1.3/jetty-6.1.3/webapps/solr.war!/ to C:\DOCUME~1\MANUP 0_8080_solr.war__solr__7k9npr\webapp
Feb 3, 2009 5:45:53 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init()
Feb 3, 2009 5:45:53 PM org.apache.solr.core.SolrResourceLoader locateInstanceDir
INFO: No /solr/home in JNDI
Feb 3, 2009 5:45:53 PM org.apache.solr.core.SolrResourceLoader locateInstanceDir
INFO: solr home defaulted to 'solr/' (could not find system property or JNDI)
Feb 3, 2009 5:45:53 PM org.apache.solr.core.CoreContainer$Initializer initialize
INFO: looking for solr.xml: C:\jetty-6.1.3\jetty-6.1.3\solr\solr.xml
Feb 3, 2009 5:45:53 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to 'solr/'
Feb 3, 2009 5:45:53 PM org.apache.solr.core.SolrResourceLoader createClassLoader
INFO: Reusing parent classloader
Feb 3, 2009 5:45:53 PM org.apache.solr.core.SolrResourceLoader locateInstanceDir
INFO: No /solr/home in JNDI
Feb 3, 2009 5:45:53 PM org.apache.solr.core.SolrResourceLoader locateInstanceDir
INFO: solr home defaulted to 'solr/' (could not find system property or JNDI)
Feb 3, 2009 5:45:53 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to 'solr/'
Feb 3, 2009 5:45:53 PM org.apache.solr.core.SolrResourceLoader createClassLoader
INFO: Reusing parent classloader
Feb 3, 2009 5:45:53 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'solr/conf/', cwd=C:\jetty-6.1.3\je
        at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:194)
        at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:162)
        at org.apache.solr.core.Config.<init>(Config.java:100)
        at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:113)
        at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:70)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
        at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
        at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
        at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
        at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
        at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
        at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
        at org.mortbay.jetty.Server.doStart(Server.java:210)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.mortbay.start.Main.invokeMain(Main.java:183)
        at org.mortbay.start.Main.start(Main.java:497)
        at org.mortbay.start.Main.main(Main.java:115)
Feb 3, 2009 5:45:53 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'solr/conf/', cwd=C:\jetty-
        at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:194)
        at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:162)
        at org.apache.solr.core.Config.<init>(Config.java:100)
        at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:113)
        at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:70)
        at
Re: Problem with setting solr.solr.home property
Manupriya schrieb: Hi, Till now I was working with the jetty server bundled with the SOLR distribution. But I want to deploy solr.war to another jetty server. Here I am facing some problem with solr/home. Whenever I start the jetty server, I [...] Try extracting the solr.war and editing the web.xml! Greets -Ralf-
Re: Problem with setting solr.solr.home property
Thanks Ralf, Yeah... I can add the system property through web.xml. But as I am deploying my application to a production environment, I don't want to make changes to web.xml. :confused: Kraus, Ralf | pixelhouse GmbH wrote: Manupriya schrieb: Hi, Till now I was working with the jetty server bundled with the SOLR distribution. But I want to deploy solr.war to another jetty server. Here I am facing some problem with solr/home. Whenever I start the jetty server, I [...] Try extracting the solr.war and editing the web.xml! Greets -Ralf-
Re: DIH stopping an action
On Tue, Feb 3, 2009 at 4:06 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Doing that, once a doc is aborted in DocBuilder, it will not keep checking all the other docs and the abort will finish soon. I think it could be done in the function deleteAll(deletedKeys) as well, in case the amount of docs to delete is huge. Does doing that have any bad consequence? If not... do you think it would be useful to add it to DataImportHandler for other use cases? You are right Marc. We should be getting out of that loop (in buildDocument as well as in collectDelta) if abort is called. Can you please raise an issue in Jira? -- Regards, Shalin Shekhar Mangar.
Re: field range (min and max term)
Hi Ben, Look at this: http://wiki.apache.org/solr/StatsComponent Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Ben Incani ben.inc...@datacomit.com.au To: solr-user@lucene.apache.org Sent: Tuesday, February 3, 2009 1:52:05 AM Subject: field range (min and max term) Hi Solr users, Is there a method of retrieving a field range, i.e. the min and max values of that field's term enum? For example I would like to know the first and last date entry of N documents. Regards, -Ben
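As a concrete sketch of Otis's pointer (hedged: StatsComponent arrived after Solr 1.3, and the field name price_f below is invented for illustration), a stats request returns min and max among other statistics for a numeric field:

```text
http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price_f&rows=0
```

For a date field on an older Solr, an alternative is two queries with rows=1, sorted ascending and descending on that field, to fetch the first and last entries.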
Total count of facets
Hi, I would like to know if there is a way to get the total number of different facets returned by a faceted search? I see that I can paginate through the facets with facet.offset and facet.limit, but is there a way to know how many facets are found in total? For instance,

Name   Surname
Peter  Smith
John   Smith
Anne   Baker
Mary   York
... 1 million records more, with 100.000 distinct surnames

For instance, now I search for people with names starting with A, and I retrieve 5000 results. I would like to know the distinct number of surnames (facets) for the result set if possible, so I could show in my app something like this: 5000 people found with 1440 distinct surnames. Any ideas? Is this possible to implement? Any pointers would be greatly appreciated, Thanks! Bruno
Re: Total count of facets
Hello, Searching for ?q=*:* with facetting turned on gives me the total number of available constraints, if that is what you mean. Cheers, On Tue, 2009-02-03 at 16:03 +, Bruno Aranda wrote: Hi, I would like to know if there is a way to get the total number of different facets returned by a faceted search? I see that I can paginate through the facets with facet.offset and facet.limit, but is there a way to know how many facets are found in total? For instance, Name Surname: Peter Smith, John Smith, Anne Baker, Mary York ... 1 million records more, with 100.000 distinct surnames. For instance, now I search for people with names starting with A, and I retrieve 5000 results. I would like to know the distinct number of surnames (facets) for the result set if possible, so I could show in my app something like this: 5000 people found with 1440 distinct surnames. Any ideas? Is this possible to implement? Any pointers would be greatly appreciated, Thanks! Bruno
Re: Performance dead-zone due to garbage collection
I noticed your wiki post about sorting with a function query instead of the Lucene sort mechanism. Did you see a significantly reduced memory footprint by doing this? Did you reduce the number of fields you allowed users to sort by? Lance Norskog-2 wrote: Sorting creates a large array with roughly an entry for every document in the index. If it is not on an 'integer' field it takes even more memory. If you do a sorted request and then don't sort for a while, that will drop the sort structures and trigger a giant GC. We went through some serious craziness with sorting.
Re: Performance dead-zone due to garbage collection
On Tue, Feb 3, 2009 at 11:58 AM, wojtekpia wojte...@hotmail.com wrote: I noticed your wiki post about sorting with a function query instead of the Lucene sort mechanism. Did you see a significantly reduced memory footprint by doing this? FunctionQuery derives field values from the FieldCache... so it would use the same amount of memory as sorting. -Yonik
Re: Total count of facets
But as far as I understand it, the total number of constraints returned is limited (there is a default value), so I cannot know the total unless I set facet.limit to a really big number, and then the request takes a long time. I was wondering if there is a way to get the total number (e.g. 100.000 constraints) to show it to the user, and then paginate using facet.offset and facet.limit until I reach that total. Does this make sense? Thanks! Bruno 2009/2/3 Markus Jelsma - Buyways B.V. mar...@buyways.nl Hello, Searching for ?q=*:* with facetting turned on gives me the total number of available constraints, if that is what you mean. Cheers, On Tue, 2009-02-03 at 16:03 +, Bruno Aranda wrote: Hi, I would like to know if there is a way to get the total number of different facets returned by a faceted search? I see that I can paginate through the facets with facet.offset and facet.limit, but is there a way to know how many facets are found in total? For instance, Name Surname: Peter Smith, John Smith, Anne Baker, Mary York ... 1 million records more, with 100.000 distinct surnames. For instance, now I search for people with names starting with A, and I retrieve 5000 results. I would like to know the distinct number of surnames (facets) for the result set if possible, so I could show in my app something like this: 5000 people found with 1440 distinct surnames. Any ideas? Is this possible to implement? Any pointers would be greatly appreciated, Thanks! Bruno
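One option worth checking, depending on the Solr version in use (hedged — a negative facet.limit is documented as meaning "no limit"): a single request can pull every constraint for counting, at the cost Bruno mentions, since the full list must still be materialized. The field names below are from the example in this thread:

```text
http://localhost:8983/solr/select?q=name:A*&rows=0&facet=true&facet.field=surname&facet.limit=-1
```

There is no parameter in this era of Solr that returns only the distinct-constraint count without returning the constraints themselves.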
Re: DIH stopping an action
Hey Shalin, I have been testing the abort command and for full-import there's no problem. In delta-import, at DocBuilder.java I have seen it checks for if (stop.get()) before executing deleteAll and inside collectDelta (in the doDelta function). The problem is that once you have the Set<Map<String, Object>> with all the data to modify, it will only check for if (stop.get()) inside the function buildDocument. In my case, I have 300.000 docs to modify so, as buildDocument in doDelta is called inside a while loop, it will pass through all 300.000, aborting each of them. What I have done is check for abortion inside the while:

    while (pkIter.hasNext()) {
        Map<String, Object> map = pkIter.next();
        vri.addNamespace(DataConfig.IMPORTER_NS + ".delta", map);
        buildDocument(vri, null, map, root, true, null, true);
        pkIter.remove();
        // #patch: checking for abortion
        if (stop.get()) {
            return;
        }
    }

This part of the code is from doDelta in DocBuilder. Doing that, once a doc is aborted in DocBuilder, it will not keep checking all the other docs and the abort will finish soon. I think it could be done in the function deleteAll(deletedKeys) as well, in case the amount of docs to delete is huge. Does doing that have any bad consequence? If not... do you think it would be useful to add it to DataImportHandler for other use cases? Marc Sturlese wrote: Thanks, that's exactly what I need. Shalin Shekhar Mangar wrote: On Tue, Feb 3, 2009 at 2:01 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I would like to know if there is any way to stop a delta-import or a full-import in the middle of the execution and free Tomcat's memory. There is an 'abort' command for DIH which should do what you want. Most of the DIH related objects should go out of scope once the import is aborted. Then it is up to the garbage collector to free the memory. -- Regards, Shalin Shekhar Mangar.
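Marc's patch boils down to polling a shared stop flag on every loop iteration so a long queue is abandoned promptly. A self-contained sketch of that pattern (not the actual DocBuilder code — the class and method names below are invented for illustration):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class AbortableLoop {
    // Processes items until the stop flag is observed; returns how many were handled.
    // Mirrors the doDelta patch: one check after each item, not only at the start.
    static int processUntilStopped(List<String> items, AtomicBoolean stop) {
        int handled = 0;
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            String item = it.next();
            // ... build the document for 'item' here ...
            it.remove();
            handled++;
            // Abort check inside the loop: a 300.000-item queue is dropped
            // after the current item instead of being iterated to the end.
            if (stop.get()) {
                break;
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        List<String> queue = new ArrayList<>(List.of("a", "b", "c", "d"));
        AtomicBoolean stop = new AtomicBoolean(true); // abort requested up front
        System.out.println("handled=" + processUntilStopped(queue, stop));
        System.out.println("remaining=" + queue.size());
    }
}
```

The AtomicBoolean plays the role of DIH's stop flag: another thread (the one serving the abort command) can set it while the import thread is mid-loop.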
Re: newbie question --- multiple schemas
: Is it possible to define more than one schema? I'm reading the example : schema.xml. It seems that we can only define one schema? What about if I : want to define one schema for document type A and another schema for : document type B? there are lots of ways to tackle a problem like this, depending on your specific needs, some starting points are... http://wiki.apache.org/solr/MultipleIndexes -Hoss
Re: DIH, assigning multiple xpaths to the same solr field
On Wed, Feb 4, 2009 at 1:35 AM, Fergus McMenemie fer...@twig.me.uk wrote:

<entity name="x" dataSource="myfilereader" processor="XPathEntityProcessor"
        url="${jc.fileAbsolutePath}" stream="false" forEach="/record">
  <field column="para" xpath="/record/sect1/para" />
  <field column="para" xpath="/record/list/listitem/para" />
  <field column="para" xpath="/a/b/c/para" />
  <field column="para" xpath="/d/e/f/g/para" />

Below is the line from my schema.xml

<field name="para" type="text" indexed="true" stored="true" multiValued="true"/>

Now a given document will only have one style of layout, and of course the /a/b/c and /d/e/f/g stuff is made up. For a document that has a single <para>Hello world</para> element I see search results as follows; the one para string seems to have been entered into the index four times. I only saw duplicate results before adding the extra made-up stuff. I think there is something fishy with the XPathEntityProcessor. For now, I think you can work around it by giving each field a different 'column' and an attribute name="para" on each of them. -- Regards, Shalin Shekhar Mangar.
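A sketch of Shalin's suggested workaround (hedged — the column names para1..para4 are invented for illustration; the name attribute maps each distinct column back onto the same multiValued "para" field in the schema):

```xml
<entity name="x" dataSource="myfilereader" processor="XPathEntityProcessor"
        url="${jc.fileAbsolutePath}" stream="false" forEach="/record">
  <field column="para1" name="para" xpath="/record/sect1/para" />
  <field column="para2" name="para" xpath="/record/list/listitem/para" />
  <field column="para3" name="para" xpath="/a/b/c/para" />
  <field column="para4" name="para" xpath="/d/e/f/g/para" />
</entity>
```

Since each xpath now feeds a differently-named column, a value matched by one expression should no longer be duplicated into the others.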
Re: Query Performance while updating teh index
: Just to clarify - we do not optimize on the slaves at all. We only optimize : on the master. that doesn't change anything about the comments that i made before. it *really* wouldn't make sense to optimize on a slave right before pulling a new snapshot, but it still doesn't make any more sense to optimize on a master right before doing some updates and then pulling a new snapshot. my second comment also still applies: a snappull after an optimize is always going to involve more churn on the disk... : : We do optimize the index before updates but we get these performance : issues : : even when we pull an empty snapshot. Thus even when our update is tiny, : the : : performance issues still happen. : : FWIW: this behavior doesn't make a lot of sense -- optimizing just : before you are about to make updates/additions to your data, is a complete : waste. the main value in optimizing your index is that you have one : segment; as soon as you add a document that changes. : : the other thing to keep in mind is that an optimized index is a completely : new segment as a new file with a new name, so there is going to be added : overhead on the slave machines as the OS purges the old index files and : replaces them with the new optimized index files -- more overhead than if : you had just done your additions w/o optimizing first. -Hoss
Re: Recent document boosting with dismax
: Hi, no the data_added field was one per document. i would suggest adding multiValued=false to your date fieldType so that Solr can enforce that for you -- otherwise we can't be 100% sure. if it really is only a single valued field, then i suspect you're right about the index corruption being the source of your problem, but it's not necessarily a permanent problem. try optimizing your index, that should merge all the segments and purge any terms that aren't actually part of live documents (i think) ... if that doesn't work, rebuilding will be your best bet (and with that multiValued=false will error if you are inadvertently sending multiple values per document) : I'm having lots of other problems (un-related) with corrupt indices - : could : it be that in running the org.apache.lucene.index.CheckIndex utility, and : losing some documents in the process, the ordinal part of my boost : function : is permanently broken? -Hoss
Re: Should I extend DIH to handle POST too?
: I guess I got the wrong impression initially. These classes extend the : RequestHandlerBase. your confusion is totally understandable, and stems from a confusing legacy naming convention. there is an UpdateHandler API which is a low level API for dictating how changes are made to the underlying IndexWriter -- there is *NO* reason for anyone to ever do anything with this API (in my opinion) there is also a SolrRequestHandler which dictates how Solr deals with external requests, and what kind of input parsing it does. Some of these Request Handlers are designed for making Updates and many people (who aren't even aware of the UpdateHandler API mentioned above) informally refer to them as Update Handlers ... hence a lot of confusion. http://wiki.apache.org/solr/SolrPlugins -Hoss
RE: Unsubscribing
Nothing in the Junk folder, but that reminded me that our company is using a 3rd party spam filter (i.e., Lanlogic)... which sure enough had snagged the confirmation emails. Since the list emails were going through I never thought to check the filtering systems. Thanks for jogging my memory. :-) Ross From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Tue 2/3/2009 2:18 PM To: solr-user@lucene.apache.org Subject: Re: Unsubscribing : Subject: Unsubscribing : : I've tried multiple times to unsubscribe from this list using the proper : method (mailto:solr-user-unsubscr...@lucene.apache.org), but it's not : working! Can anyone help with that? Did you get a confirmation email from the mailing list software asking you to verify that you really wanted to unsubscribe? (is it in a Junk Mail or Spam folder that you didn't think to check?) did you reply to it according to the instructions? see also... http://www.nabble.com/Re%3A-PLEASE-REMOVE-ME-FROM-THIS-EMAIL-LIST!-p10879673.html -Hoss
Re: Unsubscribing
: Subject: Unsubscribing : : I've tried multiple times to unsubscribe from this list using the proper : method (mailto:solr-user-unsubscr...@lucene.apache.org), but it's not : working! Can anyone help with that? Did you get a confirmation email from the mailing list software asking you to verify that you really wanted to unsubscribe? (is it in a Junk Mail or Spam folder that you didn't think to check?) did you reply to it according to the instructions? see also... http://www.nabble.com/Re%3A-PLEASE-REMOVE-ME-FROM-THIS-EMAIL-LIST!-p10879673.html -Hoss
Re: ranged query on multivalued field doesnt seem to work
: I am still struggling with this... but I guess would it be because for some : data there are maximum integer values for the fields start_year : end_year, like 2.14748365E9, which solr does not recognise as sfloat, : because there is an E letter? when you say you are using sfloat, that fieldType is using the SortableFloatField class, correct? SortableFloatField uses Float.parseFloat to get the float value from your input string; if that fails it will throw an exception -- so you should have gotten an error if the value was unparsable ... i'm not sure what might be causing your problem. -Hoss
Re: DIH using values from solrconfig.xml inside data-config.xml
: The solr data field is populated properly. So I guess that bit works. : I really wish I could use xpath=//para : The limitation comes from streaming the XML instead of creating a DOM. : XPathRecordReader is a custom streaming XPath parser implementation and : streaming is easy only because we limit the syntax. You can use : PlainTextEntityProcessor which gives the XML as a string to a custom : Transformer. This Transformer can create a DOM, run your XPath query and : populate the fields. It's more expensive but it is an option. Maybe it's just me, but it seems like i'm noticing that as DIH gets used more, many people are noting that the XPath processing in DIH doesn't work the way they expect because it's a custom XPath parser/engine designed for streaming. It seems like it would be helpful to have an alternate processor for people who don't need the streaming support (ie: are dealing with small enough docs that they can load the full DOM tree into memory) that would use the default Java XPath engine (and have fewer caveats/surprises) ... i would think it would probably even make sense for this new XPath processor to be the one we suggest for new users, and only suggest the existing (stream based) processor if they have really big xml docs to deal with. (In hindsight XPathEntityProcessor and XPathRecordReader should probably have been named StreamingXPathEntityProcessor and StreamingXPathRecordReader) thoughts? -Hoss
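For contrast, a minimal sketch of what the DOM-based route buys, using only JDK classes (the record layout and method name are invented for illustration): once the whole document is parsed into a DOM, the standard javax.xml.xpath engine can evaluate descendant-axis expressions like //para, which the streaming XPathRecordReader's restricted syntax does not allow.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class DomXPathDemo {
    // Parse the XML into an in-memory DOM, then evaluate an unrestricted
    // XPath expression with the default JDK engine.
    static List<String> extractParas(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Descendant axis: matches every <para>, wherever it is nested.
        NodeList paras = (NodeList) xpath.evaluate("//para", doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < paras.getLength(); i++) {
            out.add(paras.item(i).getTextContent());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<record><sect1><para>one</para></sect1>"
                   + "<list><listitem><para>two</para></listitem></list></record>";
        System.out.println(extractParas(xml));
    }
}
```

The trade-off is the one discussed above: the whole document must fit in memory, and the DOM/JDK engine is considerably slower for large-scale processing than a streaming reader.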
Re: MASTER / SLAVES numdoc
: I've one server and several slaves and I would like to know if I go to the : host.name/solr/admin/stat.jsp if there is a way to know the difference of : the numDoc per server? i don't really understand your question -- sure you can go to that page on each server and compare the number of docs ... ok, now what? what is your goal? if i had to guess, i would suspect that this URL (on your master) might be of use to you... http://localhost:8983/solr/admin/distributiondump.jsp ...but that's just a guess, and it only works if you are using the replication scripts (i'm not sure if DIH has a similar feature) http://people.apache.org/~hossman/#xyproblem XY Problem Your question appears to be an XY Problem ... that is: you are dealing with X, you are assuming Y will help you, and you are asking about Y without giving more details about the X so that we can understand the full issue. Perhaps the best solution doesn't involve Y at all? See Also: http://www.perlmonks.org/index.pl?node_id=542341 -Hoss
Re: Problem with setting solr.solr.home property
: Till now I was working with the jetty server bundled with the SOLR : distribution. But I want to deploy solr.war to another jetty server. Here I : am facing some problem with solr/home. Whenever I start the jetty server, I : get the following error - ... : INFO: solr home defaulted to 'solr/' (could not find system property or : JNDI) ... : SEVERE: Could not start SOLR. Check solr/home property : java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in : classpath or 'solr/conf/', cwd=C:\jetty-6.1.3\je ... : I have tried following options - : : 1. Added system property on windows as 'solr.solr.home'. I am able to get : its value when I check through command prompt. : http://www.nabble.com/file/p21808987/cmd.gif i don't know much about windows, but i don't think that's the same thing as a java system property (that looks like an environment variable to me) : 2. I also tried adding vm argument through command prompt as follows - : set : JAVA_OPTS=-Dsolr.solr.home=C:\SOLR\apache-solr-1.3.0\apache-solr-1.3.0\example\solr : : But in all the case, I am getting the above exception. what about the INFO line i quoted above (solr home defaulted to 'solr/'...) are you seeing that line even when you modify the JAVA_OPTS this way? (i'm wondering if perhaps you are setting the system property but maybe the quotes or formatting or something is confusing it when trying to find that directory) ... it would be helpful to see the *exact* logs and error messages you get when trying the JAVA_OPTS method ... i'm suspicious that maybe it's a slightly different error. : 3. I tried to retrieve the System property through java code (It is the : similar code that is triggered by Solr, SolrResourceLoader.java : locateInstanceDir() method). I get the value of system property in the code. your code looks right, but i don't understand exactly what you're saying -- do you in fact see the path in your logging output?
if so then i'm more confident it's a problem with formatting the path correctly so java understands it. FYI: in my opinion the best way to set solr home is using JNDI, but you didn't mention trying that... http://wiki.apache.org/solr/SolrJetty -Hoss
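A sketch of the JNDI route Hoss recommends (assumptions: Jetty 6 with the plus/naming classes available, and the solr home path is invented; adapted from the pattern on the SolrJetty wiki page — verify against your Jetty version). A context file registers the webapp with an EnvEntry for solr/home:

```xml
<!-- e.g. contexts/solr.xml in the Jetty home directory -->
<Configure class="org.mortbay.jetty.webapp.WebAppContext">
  <Set name="contextPath">/solr</Set>
  <Set name="war"><SystemProperty name="jetty.home"/>/webapps/solr.war</Set>
  <New id="solrHome" class="org.mortbay.jetty.plus.naming.EnvEntry">
    <Arg>solr/home</Arg>
    <Arg type="java.lang.String">C:/solr/home</Arg>
    <Arg type="boolean">true</Arg>
  </New>
</Configure>
```

This keeps the solr home setting out of web.xml, which matters for the production-deployment concern raised earlier in this thread.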
RE: Problem with setting solr.solr.home property
For what it's worth, I bumped into http://jira.codehaus.org/browse/JETTY-854 on a recent Jetty installation when trying to set up Solr for a test run, so setting via JNDI may end up causing even more heartburn. I ended up just using Tomcat. V/R, Nicholas Piasecki Software Developer Skiviez, Inc. n...@skiviez.com 804-550-9406 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Tuesday, February 03, 2009 8:31 PM To: solr-user@lucene.apache.org Subject: Re: Problem with setting solr.solr.home property : Till now I was working with the jetty server bundled with the SOLR : distribution. But I want to deploy solr.war to another jetty server. Here I : am facing some problem with solr/home. Whenever I start the jetty server, I : get the following error - ... : INFO: solr home defaulted to 'solr/' (could not find system property or : JNDI) ... : SEVERE: Could not start SOLR. Check solr/home property : java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in : classpath or 'solr/conf/', cwd=C:\jetty-6.1.3\je ... : I have tried following options - : : 1. Added system property on windows as 'solr.solr.home'. I am able to get : its value when I check through command prompt. : http://www.nabble.com/file/p21808987/cmd.gif i don't know much about windows, but i don't think that's the same thing as a java system property (that looks like an environment variable to me) : 2. I also tried adding vm argument through command prompt as follows - : set : JAVA_OPTS=-Dsolr.solr.home=C:\SOLR\apache-solr-1.3.0\apache-solr-1.3.0\ example\solr : : But in all the case, I am getting the above exception. what about the INFO line i quoted above (solr home defaulted to 'solr/'...) are you seeing that line even when you modify the JAVA_OPTS this way? (i'm wondering if perhaps you are setting the system property but maybe the quotes or formatting or something is confusing it when trying to find that directory) ...
it would be helpful to see the *exact* logs and error messages you get when trying the JAVA_OPTS method ... i'm suspicious that maybe it's a slightly different error. : 3. I tried to retrieve the System property through java code (It is the : similar code that is triggered by Solr, SolrResourceLoader.java : locateInstanceDir() method). I get the value of system property in the code. your code looks right, but i don't understand exactly what you're saying -- do you in fact see the path in your logging output? if so then i'm more confident it's a problem with formatting the path correctly so java understands it. FYI: in my opinion the best way to set solr home is using JNDI, but you didn't mention trying that... http://wiki.apache.org/solr/SolrJetty -Hoss
Re: Recent document boosting with dismax
Great, thanks for that, Chris! 2009/2/3 Chris Hostetter hossman_luc...@fucit.org : Hi, no the data_added field was one per document. i would suggest adding multiValued=false to your date fieldType so that Solr can enforce that for you -- otherwise we can't be 100% sure. if it really is only a single valued field, then i suspect you're right about the index corruption being the source of your problem, but it's not necessarily a permanent problem. try optimizing your index, that should merge all the segments and purge any terms that aren't actually part of live documents (i think) ... if that doesn't work, rebuilding will be your best bet (and with that multiValued=false will error if you are inadvertently sending multiple values per document) : I'm having lots of other problems (un-related) with corrupt indices - : could : it be that in running the org.apache.lucene.index.CheckIndex utility, and : losing some documents in the process, the ordinal part of my boost : function : is permanently broken? -Hoss
Re: DIH using values from solrconfig.xml inside data-config.xml
On Wed, Feb 4, 2009 at 6:13 AM, Chris Hostetter hossman_luc...@fucit.org wrote:

: The solr data field is populated properly. So I guess that bit works.
: I really wish I could use xpath=//para
: The limitation comes from streaming the XML instead of creating a DOM.
: XPathRecordReader is a custom streaming XPath parser implementation and
: streaming is easy only because we limit the syntax. You can use
: PlainTextEntityProcessor which gives the XML as a string to a custom
: Transformer. This Transformer can create a DOM, run your XPath query and
: populate the fields. It's more expensive but it is an option.

Maybe it's just me, but it seems like i'm noticing that as DIH gets used more, many people are noting that the XPath processing in DIH doesn't work the way they expect because it's a custom XPath parser/engine designed for streaming. It seems like it would be helpful to have an alternate processor for people who don't need the streaming support (ie: are dealing with small enough docs that they can load the full DOM tree into memory) that would use the default Java XPath engine (and have fewer caveats/surprises) ... i would think it would probably even make sense for this new XPath processor to be the one we suggest for new users, and only suggest the existing (stream based) processor if they have really big xml docs to deal with.

I guess the current XPathEntityProcessor could be made to switch between the streaming xpath (XPathRecordReader) and the default Java XPath engine. I am just hoping that all the current syntax and semantics will be applicable for the Java XPath engine; if not, we will need a new EntityProcessor. I also would like to explore whether the current XPathRecordReader can implement more XPath syntax with streaming. The Java XPath engine is not at all efficient for large-scale data processing. (In hindsight, XPathEntityProcessor and XPathRecordReader should probably have been named StreamingXPathEntityProcessor and StreamingXPathRecordReader.) thoughts?
-Hoss -- --Noble Paul
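To illustrate the contrast being discussed: once the whole document is loaded into a DOM, the stock javax.xml.xpath engine handles the unrestricted //para syntax that the streaming XPathRecordReader cannot. A minimal sketch; the class name and the sample XML are made up for the demo, not from DIH:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathDemo {
    // Collect the text of every node matched by an arbitrary XPath
    // expression such as //para -- syntax the streaming parser disallows.
    static List<String> evaluate(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(nodes.item(i).getTextContent());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<record><sect1><para>one</para></sect1>"
                   + "<list><listitem><para>two</para></listitem></list></record>";
        // Matches both paras no matter where they nest
        System.out.println(evaluate(xml, "//para"));
    }
}
```

The cost Noble mentions is visible here: the entire document must be parsed into memory before the first match is returned, which is exactly what streaming avoids.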
Re: DIH, assigning multiple xpaths to the same solr field
it is safe to use different column names as Shalin suggested. After all, a row is a map with the column name as the key; if you map multiple values to the same column, they may overwrite each other. Use explicit 'name' attributes.

On Wed, Feb 4, 2009 at 2:17 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Feb 4, 2009 at 1:35 AM, Fergus McMenemie fer...@twig.me.uk wrote:

<entity name="x" dataSource="myfilereader" processor="XPathEntityProcessor"
        url="${jc.fileAbsolutePath}" stream="false" forEach="/record">
  <field column="para" xpath="/record/sect1/para" />
  <field column="para" xpath="/record/list/listitem/para" />
  <field column="para" xpath="/a/b/c/para" />
  <field column="para" xpath="/d/e/f/g/para" />
</entity>

Below is the line from my schema.xml:

<field name="para" type="text" indexed="true" stored="true" multiValued="true"/>

Now a given document will only have one style of layout, and of course the /a/b/c and /d/e/f/g stuff is made up. For a document that has a single <para>Hello world</para> element I see search results in which the one para string seems to have been entered into the index four times. I only saw duplicate results before adding the extra made-up stuff. I think there is something fishy with the XPathEntityProcessor.

For now, I think you can work around this by giving each field a different 'column' and the attribute name="para" on each of them. -- Regards, Shalin Shekhar Mangar. -- --Noble Paul
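Shalin's workaround, written out against the entity config quoted above: give each field a distinct column so the extracted values cannot overwrite each other, and map them all onto the same Solr field with the name attribute. The para1..para4 column names are illustrative:

```xml
<entity name="x" dataSource="myfilereader" processor="XPathEntityProcessor"
        url="${jc.fileAbsolutePath}" stream="false" forEach="/record">
  <!-- distinct 'column' keys avoid collisions in the row map;
       name="para" sends every value to the same multiValued Solr field -->
  <field column="para1" xpath="/record/sect1/para" name="para" />
  <field column="para2" xpath="/record/list/listitem/para" name="para" />
  <field column="para3" xpath="/a/b/c/para" name="para" />
  <field column="para4" xpath="/d/e/f/g/para" name="para" />
</entity>
```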
RE: Problem with setting solr.solr.home property
Thanks everyone!! Finally got a solution for this problem on the Jetty server. Instead of setting a Java system variable like JAVA_OPTS=-Dsolr.solr.home=C:\SOLR\apache-solr-1.3.0\apache-solr-1.3.0\example\solr, we can provide the VM arguments directly while starting the Jetty server. I am running Jetty as follows - java -Dsolr.solr.home=PATH_TO_SOLR_HOME -jar start.jar After this I am not getting any error. :-D Thanks, Manu

-- View this message in context: http://www.nabble.com/Problem-with-setting-solr.solr.home-property-tp21808987p21825052.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: WebLogic 10 Compatibility Issue - StackOverflowError
We believe that the filters/forward issue is likely something specific to WebLogic - specifically, that other containers have filters disabled on forward by default, whereas WebLogic has them enabled. We don't think the small modification we had to make to header.jsp is WebLogic specific.

On 1/30/09 8:15 AM, Feak, Todd wrote: Are the issues run into due to non-standard code in Solr, or is there some WebLogic inconsistency? -Todd Feak

-Original Message-
From: news [mailto:n...@ger.gmane.org] On Behalf Of Ilan Rabinovitch
Sent: Friday, January 30, 2009 1:11 AM
To: solr-user@lucene.apache.org
Subject: Re: WebLogic 10 Compatibility Issue - StackOverflowError

I created a wiki page shortly after posting to the list: http://wiki.apache.org/solr/SolrWeblogic From what we could tell, Solr itself was fully functional; it was only the admin tools that were failing.

Regards, Ilan Rabinovitch
---
SCALE 7x: 2009 Southern California Linux Expo
Los Angeles, CA
http://www.socallinuxexpo.org

On 1/29/09 4:34 AM, Mark Miller wrote: We should get this on the wiki. - Mark

Ilan Rabinovitch wrote: We were able to deploy Solr 1.3 on WebLogic 10.0 earlier today. Doing so required two changes:

1) Creating a weblogic.xml file in solr.war's WEB-INF directory. The weblogic.xml file is required to disable Solr's filter on FORWARD. The contents of weblogic.xml should be:

<?xml version='1.0' encoding='UTF-8'?>
<weblogic-web-app xmlns="http://www.bea.com/ns/weblogic/90"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.bea.com/ns/weblogic/90 http://www.bea.com/ns/weblogic/90/weblogic-web-app.xsd">
  <container-descriptor>
    <filter-dispatched-requests-enabled>false</filter-dispatched-requests-enabled>
  </container-descriptor>
</weblogic-web-app>

2) Remove the pageEncoding attribute from line 1 of solr/admin/header.jsp

On 1/17/09 2:02 PM, KSY wrote: I hit a major roadblock while trying to get Solr 1.3 running on WebLogic 10.0.
A similar message was posted before ( http://www.nabble.com/Solr-1.3-stack-overflow-when-accessing-solr-admin-page-td20157873.html ) but it seems like it hasn't been resolved yet, so I'm re-posting here. I am sure I configured everything correctly because it's working fine on Resin. Has anyone successfully run Solr 1.3 on WebLogic 10.0 or higher? Thanks.

SUMMARY: When accessing the /solr/admin page, a StackOverflowError occurs due to infinite recursion in SolrDispatchFilter.

ENVIRONMENT SETTING:
Solr 1.3.0
WebLogic 10.0
JRockit JVM 1.5

ERROR MESSAGE:
SEVERE: javax.servlet.ServletException: java.lang.StackOverflowError
  at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:276)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
  at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
  at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
  at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
  at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
  at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
  at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
  at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
  at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
  at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)

--
Ilan Rabinovitch i...@fonz.net
--- SCALE 7x: 2009 Southern California Linux Expo Los Angeles, CA http://www.socallinuxexpo.org
Re: Custom Sorting
Thanks Erik, that helped me a lot ... but I still have something I am not sure about:

If I am using a custom sort - like the DistanceComparator example described in your book - and I debug the code, it seems that the distances array is created for all indexed documents, not only for the search result. The compare function is then called only for the docs of the search result, right?

My problem is that I wonder whether it is possible to compute distances only for the documents in the search result (that should help performance if there are a lot of documents but the search result is mostly very small, right?).

Another point: of course it could also be interesting to compute all distances for all documents the first time a new start location is given, in the case that you want to do a lot of queries from the same location. But this would then only make sense if all distances are cached together with the location value. I am not sure how things are actually handled in Lucene/Solr. What is cached, and at what time?

To compute distances only for the search result, I could:
- store the reader instance in a variable
- for every doc id seen in the compare function for the first time, compute the distance at that moment
- and then compare

Would this work? Or is there a better way to compute the distances only on the search result? A lot of questions, I know. Thanks for the good book, Markus

Erik Hatcher wrote:
* QueryComponent - this is where results are generated; it uses a SortSpec from the QParser.
* QParser#getSort - by creating a custom QParser you'll be able to wire in your own custom sort

You can write your own QParserPlugin and QParser, configure them into solrconfig.xml, and you should be good to go. Subclassing existing classes, this should only be a handful of lines of code.
-- View this message in context: http://www.nabble.com/Custom-Sorting-tp1659p21825900.html Sent from the Solr - User mailing list archive at Nabble.com.
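The lazy approach the poster describes - compute a distance the first time a doc id takes part in a comparison, then cache it - can be sketched independently of Lucene's comparator API. Everything here is hypothetical illustration (class name, in-memory coordinate array standing in for values read from the index), not the book's DistanceComparator:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: distances are computed only for doc ids that are actually
// compared (i.e. docs in the search result), and cached so each doc
// is computed at most once per origin location.
class LazyDistanceComparator {
    private final double originX, originY;
    private final double[][] coords;                 // per-doc (x, y), as if read from the index
    private final Map<Integer, Double> cache = new HashMap<>();

    LazyDistanceComparator(double originX, double originY, double[][] coords) {
        this.originX = originX;
        this.originY = originY;
        this.coords = coords;
    }

    // First call for a doc id computes and caches; later calls hit the cache.
    double distance(int docId) {
        return cache.computeIfAbsent(docId, id ->
            Math.hypot(coords[id][0] - originX, coords[id][1] - originY));
    }

    int compare(int docA, int docB) {
        return Double.compare(distance(docA), distance(docB));
    }

    // How many docs were actually computed -- should equal the number of
    // distinct docs that took part in comparisons, not the index size.
    int computedCount() {
        return cache.size();
    }
}
```

The trade-off Markus raises is visible in the cache: for small result sets this does far less work than precomputing an array over the whole index, but for repeated queries from the same origin the precomputed array amortizes better.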