Special characters
Hi, Can anyone point me to the thread, if it exists, on indexing special characters in Solr? Regards, Sujatha
Re: Special characters
You forgot to tell us what you want to do with special characters:
1. Remove them from the documents while indexing?
2. Keep them while indexing?
3. Query with terms containing a special character?
-- Regards, Shalin Shekhar Mangar.
Re: Special characters
Hi, I would like to query terms containing special chars. Regards, Sujatha
Default Solr Query
Hi All, I want to fetch all the data from the database, so what should my Solr query be to get all documents? In MySQL the syntax is: SELECT * FROM table; What is the equivalent query in Solr? Please reply ASAP. Thanks in advance. Thanks: Bhawani Sharma
Query about NOT (-) operator
Hi, The following query works:
1. NOT(IBA60019_l:1) AND NOT(IBA60019_l:0) AND businessType:wt.doc.WTDocument
But the query below does not:
2. (NOT(IBA60019_l:1) AND NOT(IBA60019_l:0)) AND businessType:wt.doc.WTDocument
Query 1 returns records, but query 2 does not return any.
3. (NOT(IBA60019_l:1) OR NOT(IBA60019_l:0)) AND businessType:wt.doc.WTDocument
Query 3 also returns no records, whereas it should return all records whose businessType is wt.doc.WTDocument.
4. NOT(IBA60019_l:1) OR NOT(IBA60019_l:0) AND businessType:wt.doc.WTDocument
Query 4 behaves as if it were query 1, i.e. OR is working as AND.
Can someone comment on this? Thanks, Ajit
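[An editorial note, not from a reply in this thread: Lucene's query parser generally cannot evaluate a clause that is purely negative, because NOT only subtracts from a positive result set, and operator precedence in a query like number 4 rarely matches intuition. A common workaround is to anchor each negative group with a match-all query, roughly:

  businessType:wt.doc.WTDocument AND (*:* -IBA60019_l:1 -IBA60019_l:0)

This is a sketch of the usual idiom, not a confirmed answer from the thread.]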
RE: Special characters
Filtering of special characters depends on the filters you use for the fields in your schema.xml. If you are using WordDelimiterFilterFactory in your analyzer, then the special characters get removed while your field is processed. But WordDelimiterFilterFactory does much more than just remove special characters. If you can do without the other features the filter provides, you can remove it from your schema.xml file. Otherwise, I guess you will have to customize the WordDelimiterFilter.java class to suit your purpose. -Kumar
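[For reference, a minimal sketch of the kind of analyzer chain Kumar is describing; the field type name and attribute values are illustrative, not from this thread:

  <fieldType name="text" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- splits on punctuation and case changes; removing this keeps special characters intact -->
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>]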
Re: Default Solr Query
On Tue, Jan 6, 2009 at 3:09 PM, Bhawani Sharma bhawanisha...@aol.com wrote: so what will be the syntax of this query in solr? If you meant that you want all documents matched (i.e. without any queries or filters), you should use q=*:* If however you meant that you want *all* the documents inside Solr at once (a full dump), it is probably a bad idea due to the performance impact. -- Regards, Shalin Shekhar Mangar.
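[As a concrete illustration - host, port and paging values are assumptions, not from the thread - a match-all request pages through results with start and rows rather than dumping everything at once:

  http://localhost:8983/solr/select?q=*:*&start=0&rows=10]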
Re: Special characters
Kumar's advice is sound. You must make sure you are actually indexing the special symbols. To query with special characters, you must urlencode the parameters before sending them to Solr. Some symbols, such as '+', '-' and ':', have a special meaning in the Lucene query syntax; you will have to escape these by adding a backslash in front of them. -- Regards, Shalin Shekhar Mangar.
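[A small hedged example; the field name is invented. To search for the literal term foo-bar in a field whose analyzer preserves the hyphen, escape the '-' for the query parser and then URL-encode the parameter value:

  q=title:foo\-bar          (as seen by the query parser)
  q=title%3Afoo%5C-bar      (URL-encoded form sent over HTTP)]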
Re: Default Solr Query
Hi Bhawani, Your query should be *:*. Try it and have fun!
Re: DataImport
Paul, Thanks for the feedback; it does work. So if I understand correctly, the app server (Jetty) is not picking up the environment variables for the other libraries I need. How do I add the JDBC jars to the path so that I don't need to copy the files into the directory? Does Jetty have a config file I should look at? Noble Paul wrote: The driver can be put directly into the WEB-INF/lib of the Solr web app, or it can be put into the ${solr.home}/lib dir, or if something is really screwed up you can try the old-fashioned way of putting your driver jar into JAVA_HOME/lib/ext --Noble
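[A sketch of the resulting layouts for the first two options Noble lists, assuming an unpacked webapp; the jar name db2jcc.jar is the usual DB2 JDBC driver, used here illustratively:

  webapps/solr/WEB-INF/lib/db2jcc.jar   (inside the deployed Solr webapp)
  ${solr.home}/lib/db2jcc.jar           (Solr's shared lib directory)]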
delta index produces multiple results?
Hi, I use the DIH with an RDBMS for indexing a large MySQL database with about 7 million entries. Full import is working fine; in schema.xml I defined a uniqueKey field (which is of the type 'text'). I run queries with the dismax query handler and get my results as a PHP array.

Now, since the database entries change every second, I use the delta query properties to a) delete documents from the index that have been deleted in the database (there is a table for deleted items) and b) update documents in the index that have changed since the last import (there is a last_modified column in a table for that).

From my understanding, when I start a delta-import, the DIH checks the deletedPkQuery first and deletes the documents that should be deleted (identified by the uniqueKey field?). This seems to work - the catalina.out says, for example, INFO: deleted from document to Solr: 1851010. Next comes the deltaQuery. This seems to work too - when finished, a query returns the new database entries.

But (and here comes the problem): the dataimport status always says Added / Changed x-hundred documents, deleted 0 documents - no deletes? Every time I change an item in the database and do a delta-import after that, my next query returns that item *twice*. After the next change and the next delta-import, Solr returns *three* result documents, and so on. As mentioned, I get my search results as an array of arrays (= Solr documents) with the fields I set in schema.xml. After changing some documents and delta-indexing them, I get lots of identical arrays (even the uniqueKey field is absolutely identical).

I have read somewhere in the wiki that an update is a delete of the old document plus an add of the new one. I guess the problem could be that something fails in the delete process, but I don't have a clue why. Any ideas? Thanks in advance Chris
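[An editorial observation rather than a reply from the thread: updates overwrite by uniqueKey, and an analyzed field type such as 'text' can keep the old and new key terms from matching, which produces exactly this kind of duplication. The uniqueKey field is normally declared on a non-analyzed type, e.g.:

  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <uniqueKey>id</uniqueKey>]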
RE: Using query functions against a type field
:It should be fairly predictable, can you elaborate on what problems you :have just adding boost queries for the specific types? The boost queries are true queries, so the amount of boost can be affected by things like term frequency for the query. The functions aren't affected by this and are therefore more predictable over the life of the index. If I want to boost documents via multiple factors, their interaction is very important. If that interaction slowly changes over the life of the index, I lose that control. :a generic Parser/ValueSource that let you specify term=float mappings in :its init params would certainly make a cool patch for Solr. I do believe I will work on this (may take me a bit). Once I nail it down, I've got a couple of other easier query functions I would like to add as well, if they hold value for the community. -Hoss
Re: Special characters
Thanks. When I set URIEncoding="UTF-8" in Tomcat's server.xml file, some of the special chars are indexed and searchable, while others are not. E.g. Bernhard Schölkopf and János Kornai are indexed and searchable after the above change. In the browser, however, some others display as junk chars, even though the browser encoding is UTF-8. Regards Sujatha
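[For reference, the change being described is the URIEncoding attribute on the HTTP connector in Tomcat's server.xml; the port and protocol shown are illustrative:

  <Connector port="8080" protocol="HTTP/1.1" URIEncoding="UTF-8"/>]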
Re: Date Range Search
For a date range search, use a range query on the date field with full ISO timestamps, e.g. alldate:[2008-01-01T00:00:00Z TO 2008-12-31T23:59:59Z]. Thanks, Sourabh

Gavin-39 wrote: Hi, Can someone tell me how I can achieve date range searches? For instance, if I save the DOB as a Solr date field, how can I do a search to get the people 1. who are older than 30 years, 2. who were born in 1975, etc. Greatly appreciate your help. Thanks, -- Gavin Selvaratnam, Project Leader hSenid Mobile Solutions Phone: +94-11-2446623/4 Fax: +94-11-2307579 Web: http://www.hSenidMobile.com
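[A hedged sketch of the two cases from the question, using Solr's date math; the field name dob comes from the question, and the exact bounds are illustrative:

  dob:[* TO NOW-30YEARS]                               (older than 30 years)
  dob:[1975-01-01T00:00:00Z TO 1975-12-31T23:59:59Z]   (born in 1975)]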
Snapinstaller vs Solr Restart
I'm running load tests against my Solr instance. I find that it typically takes ~10 minutes for my Solr setup to warm up while I throw my test queries at it. Also, I have the same two warm-up queries specified for the firstSearcher and newSearcher event listeners. I'm now benchmarking the effect of updating an index under load. I'm finding that after running snapinstaller, Solr takes ~1 hour to get back to the same performance numbers I was getting 10 minutes after a restart. If I can justify being offline for a few moments, it seems like I'll be better off restarting Solr rather than running snapinstaller. Any ideas why? Thanks.
RE: Snapinstaller vs Solr Restart
First suspects would be the filter cache and query cache settings. If they are auto-warming at all, then there is a definite difference between first-start behavior and post-commit behavior. This affects what's in memory, caches, etc. -Todd Feak
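[For context, these settings live in solrconfig.xml; a hedged example with illustrative sizes, where autowarmCount controls how many entries are carried over into the caches of a newly opened searcher:

  <filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="256"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="256"/>]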
RE: Using query functions against a type field
I'm not sure I followed all that, Yonik. Are you saying that I can achieve this effect now with a bq setting in my DisMax query instead of via a bf setting? -Todd Feak
RE: Snapinstaller vs Solr Restart
Sorry, I forgot to include that. All my autowarmCount settings are set to 0.
Re: Using query functions against a type field
On Tue, Jan 6, 2009 at 10:41 AM, Feak, Todd todd.f...@smss.sony.com wrote: The boost queries are true queries, so the amount of boost can be affected by things like term frequency for the query. Sounds like a constant score query is a general way to do this. Possible QParser syntax: {!const}tag:FOO OR tag:BAR Could be implemented via ConstantScoreQuery(QueryWrapperFilter(theQuery)) The value could be the boost, optionally set within this QParser... {!const v=2.0}tag:FOO OR tag:BAR -Yonik
DataImportHandler (reading XML w/ paging)
Hi, Anyone have a quick, clever way of dealing with paged XML for DataImportHandler? I have metadata like this:

  <paging>
    <pageNumber>1</pageNumber>
    <totalPages>3</totalPages>
    <count>15</count>
  </paging>

I unfortunately cannot get all the data in one shot, so I need to make a number of requests based on the paging metadata, but I can't figure out if this is dynamically possible with the current DIH setup. Any tips? Thanks. - Jon
Re: Using query functions against a type field
On Tue, Jan 6, 2009 at 1:05 PM, Feak, Todd todd.f...@smss.sony.com wrote: Are you saying that I can achieve this effect now with a bq setting in my DisMax query instead of via a bf setting? Yep, a const QParser would enable that. bq={!const}foo:bar -Yonik
RE: Using query functions against a type field
Thanks Yonik! I still may investigate the query function stuff that was discussed, as Hoss indicated it may hold value. -Todd Feak
Re: Snapinstaller vs Solr Restart
Is an autowarm count of 0 a good idea, though? If you don't want to autowarm any caches, doesn't that imply that you have a very low hit rate and therefore don't care to autowarm? And if you have a very low hit rate, then perhaps caches are not needed at all? How about this: do you optimize your index at any point? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: Snapinstaller vs Solr Restart
I use my warm-up queries to fill the field cache (or at least that's the idea). My filterCache hit rate is ~99%, queryResultCache is ~65%. I update my index several times a day with no 'optimize', and performance is seamless. I also update my index once nightly with an 'optimize', and that's where I see the performance drop. I'll try turning autowarming on. Could this have to do with file caching by the OS?
Re: Snapinstaller vs Solr Restart
OK, so that question/answer seems to have hit the nail on the head. :) When you optimize your index, all index files get rewritten. This means that everything the OS cached up to that point goes out the window, and the OS has to slowly re-cache the hot parts of the index. If you don't optimize, this won't happen. Do you really need to optimize? Or, a more direct question: why are you optimizing? Regarding autowarming, with such a high fq hit rate I'd make good use of fq autowarming. The result cache rate is lower, but still decent. I wouldn't turn off autowarming the way you have. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
RE: Snapinstaller vs Solr Restart
Kind of a side note, but I think it may be worth your while: since your queryResultCache hit rate is 65%, consider putting a reverse proxy in front of Solr. It can give performance boosts over the query cache in Solr, as it doesn't have to pay the cost of reformulating the response. I've used Varnish with great results. Squid is another option. -Todd Feak
Unable to choose request handler
Hi, In my solrconfig.xml file there are two request handlers configured: one uses defType=dismax, and the other doesn't. However, it seems that when the dismax request handler is set as my default, I have no way of using the standard request handler. Here is the relevant part of my solrconfig.xml:

  <requestHandler name="standard" class="solr.SearchHandler">
    <!-- default values for query parameters -->
    <lst name="defaults">
      <str name="echoParams">explicit</str>
    </lst>
  </requestHandler>

  <requestHandler name="dismax" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="echoParams">explicit</str>
    </lst>
  </requestHandler>

When I run a query with the parameters qt=standard&debugQuery=true, I can see that it is still using the DismaxQueryParser. There doesn't seem to be any way to use the standard request handler. On the other hand, when I set the standard request handler as my default, the behaviour is equally strange. When I specify no qt parameter at all, it uses the standard request handler as it should. However, when I enter either qt=standard or qt=dismax, it uses the dismax request handler! So it appears that the only way I can choose the request handler I want is to make the standard request handler my default, then specify no qt parameter if I want to use it. Has anyone else tried this? Mark
Re: Unable to choose request handler
It seems that the problem is related to the defType parameter. When I specify defType=, it uses the correct request handler. So it is using the correct request handler, but it is defaulting to defType=dismax, even though I have not specified that parameter in the standard request handler configuration.
Re: Snapinstaller vs Solr Restart
I'm optimizing because I thought I should. I'll be updating my index somewhere between every 15 minutes and every 2 hours. That means between 12 and 96 updates per day. That seems like a lot of index files (and it scared me a little), so that's my second reason for wanting to optimize nightly. I haven't benchmarked the performance hit of not optimizing. That'll be my next step. If the hit isn't too bad, I'll look into optimizing less frequently (weekly, ...). Thanks Otis!
Re: Unable to choose request handler
I apologize - entering the defType parameter explicitly has nothing to do with it; this was a caching issue. I tested the different configurations thoroughly, and this is what I've come up with:

- When using the 'dismax' request handler as default: queries are always parsed using the dismax parser, whether I use qt=standard, qt=dismax, or no qt. It _does_ use the correct request handler, because the echoed params are correct for that handler. However, it seems to always be using defType=dismax. I can tell because with the parameter debugQuery=true I can see that it is creating a DisjunctionMaxQuery.

- When using the 'standard' request handler as default: the behaviour is as expected. When I enter no qt parameter or qt=standard, it uses the standard request handler and doesn't use dismax for the defType. When I use qt=dismax, it uses the dismax request handler and dismax for the defType.

So the problem is that when setting the default request handler to dismax, it always uses defType=dismax (even though it uses the 'standard' request handler). defType=dismax does not show up in the echoed parameters, but I can tell by using debugQuery=true (and the fact that I get no results when I specify a field). Can someone try reproducing this using the configuration I specified in my first post? Sorry again for being confusing; I got sidetracked by the caching issue. Mark
Re: Snapinstaller vs Solr Restart
Lower your mergeFactor and Lucene will merge segments (i.e. fewer index files) and purge deletes more often for you, at the expense of somewhat slower indexing. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
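[The knob lives in the index settings section of solrconfig.xml; the value below is illustrative - the common default is 10, and lower values mean fewer segments but slower indexing:

  <mergeFactor>5</mergeFactor>]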
Re: Unable to choose request handler
On Tue, Jan 6, 2009 at 5:01 PM, Mark Ferguson mark.a.fergu...@gmail.com wrote: It seems that it is using the correct request handler, but it is defaulting to defType=dismax, even though I have not specified that parameter in the standard request handler configuration. defType only controls the default type of the main query (not the whole handler). Try defType=lucene -Yonik
Re: Unable to choose request handler
Thanks, this fixed the problem. Maybe this parameter could be added to the standard request handler in the sample solrconfig.xml, as it is confusing that it uses the default request handler's defType even when not using that handler. I didn't completely understand your explanation, though. Thanks for the fix. Mark
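[Putting Yonik's fix back into the configuration from the start of this thread, the standard handler would pin its query parser explicitly - a sketch, not an official sample config:

  <requestHandler name="standard" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">lucene</str>
      <str name="echoParams">explicit</str>
    </lst>
  </requestHandler>]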
Re: Setting up DataImportHandler for Oracle datasource on JBoss
If I add the document tag and an entity, I still get the same error when starting up JBoss. Here is my full data-config.xml:

  <dataconf>
    <dataSource type="JdbcDataSource" driver="oracle.jdbc.OracleDriver"
                url="jdbc:oracle:thin:@host:port:service" user="pctadm" password="pctadm"/>
    <document name="products">
      <entity name="product" query="select prd_id from pct_product">
        <field column="prd_id" name="id"/>
      </entity>
    </document>
  </dataconf>

I also have this one field in my schema.xml, nested under <fields>:

  <field name="id" type="string" indexed="true" stored="true" required="true"/>

When I restart JBoss I get the same stacktrace:

  2009-01-07 08:41:40,428 ERROR [STDERR] 7/01/2009 08:41:40 org.apache.solr.handler.dataimport.DataImportHandler inform
  SEVERE: Exception while loading DataImporter
  org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context Processing Document #
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:176)
    at org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:93)
    at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:106)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:311)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:480)
    ...
  Caused by: java.lang.NullPointerException
    at org.apache.solr.handler.dataimport.DataConfig.getChildNodes(DataConfig.java:324)
    at org.apache.solr.handler.dataimport.DataConfig.readFromXml(DataConfig.java:236)
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:170)
    ... 140 more

Am I missing anything else?

Noble Paul wrote: the document tag and the rest of the stuff is missing in your data-config file

On Tue, Jan 6, 2009 at 12:50 PM, The Flight Captain jason_sheph...@flightcentre.com wrote: I am having trouble setting up an Oracle datasource. Can anyone help me connect to the datasource? My solrconfig.xml:

  <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </requestHandler>

My data-config.xml:

  <dataconf>
    <dataSource type="JdbcDataSource" driver="oracle.jdbc.OracleDriver"
                url="jdbc:oracle:thin:@hostname:port:service" user="username" password="password"/>
  </dataconf>

I have placed the Oracle driver on the classpath of JBoss. On startup I get the same SEVERE: Exception while loading DataImporter stacktrace as above.
Re: Error during indexing.
What's the XML you're sending it? It's got something invalid in it, obviously - a NULL byte (0x00) is not a legal character anywhere in an XML document. How are you indexing? Via SolrJ? Or some other POST way? Erik

On Jan 6, 2009, at 2:27 PM, Tushar_Gandhi wrote: Hi, I am getting an error whenever I index photo objects specifically. For other objects it is working. The error is:

  SEVERE: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character (NULL, unicode 0) encountered: not valid in any content at [row,col {unknown-source}]: [1,3127]
    at com.ctc.wstx.sr.StreamScanner.constructNullCharException(StreamScanner.java:640)
    at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:669)
    at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:660)
    at com.ctc.wstx.sr.BasicStreamReader.readCDataPrimary(BasicStreamReader.java:4240)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTreeCommentOrCData(BasicStreamReader.java:3280)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2824)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
    at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:321)
    at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:195)
    at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
    ...

Can anyone help me out? Thanks, Tushar
RE: Solr FAQ entry about Dynamically calculated range facet topic
So did anyone put together a FAQ entry on this subject? I am also interested in seeing the different ways to get dynamic faceting to work. In this post, Chris Hostetter dropped a piece of handler code. Is it still the right path to take for generating ranges like these:

  $0..$20 (3)
  $20..$75 (15)
  $75..$123 (8)

Re: Dynamically calculated range facet http://www.mail-archive.com/solr-user@lucene.apache.org/msg04727.html
Re: date range query performance
Can someone explain what this means to me? I'm having a similar performance issue - it's an index with only 1 million records or so, but when trying to search on a date range it takes 30 seconds! Yes, this date is one with hours, minutes, seconds in them - do I need to create an additional field without the time component and reindex all my documents so I can get decent search performance? Or can I tell Solr Please ignore the time and do something in a reasonable timeframe (GRIN) Thanks.

On Fri, Oct 31, 2008 at 10:28 PM, Michael Lackhoff mich...@lackhoff.de wrote: On 01.11.2008 06:10 Erik Hatcher wrote: Yeah, this should work fine:

  <field name="timestamp" type="date" indexed="true" stored="true" default="NOW/DAY" multiValued="false"/>

Wow, that was fast, thanks! -Michael
Re: how large can the index be?
Why is NFS mounting such a bad idea? Some solutions for highly available disks suggest that you DO mount the disks over NFS on the boxes that need the data.

On Mon, Dec 29, 2008 at 7:42 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: What you have below is not really what we call Distributed Search, but more of Query Load Balancing. Yes, the diagram below will work IF a single Solr box (A or B) can really handle a full 50M doc index. Of course handle can be fuzzy. That is, you could have a large index on a Solr box and it will handle it - nothing will crash, nothing will die - it's just that it may not handle it well enough, i.e. the queries may take longer than you'd like. NFS mounting an index directory is a separate story and very often a bad idea, again because of performance. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

From: Antonio Eggberg antonio_eggb...@yahoo.se: Thank you very much for your answer. I was afraid of that; each document has about 20 fields. As you pointed out, it will slow down. Anyway, I am thinking: is it not possible to do the following:

  Load Balancer
  |
  Solr A, Solr B, ...
  |
  one index

So I send 50% of queries to Solr A, 50% to Solr B and so forth - is this not good? Also, to add: the index will be like a mounted drive to the Solr boxes. In the above, do I really need to worry about Solr Master / Solr Slave? It probably solves my load, but I think query speed will be slow. Just curious: is anyone using distributed search in production? Cheers

On Monday 29 December 2008 21:53, Otis Gospodnetic wrote: Hi Antonio, Besides thinking in terms of documents, you also need to think in terms of index size on the file system vs. the amount of RAM your search application/server can use. 50M documents may be doable on a single server if those documents are not too large and you have sufficient RAM. It gets even better if your index doesn't change very often and if you can get decent hit ratios on the various Solr caches. If you are indexing largish documents, or even something as small as an average web page, 50M docs may be too much on a commodity box (say a dual core 8 GB RAM box). Otis

From: Antonio Eggberg: Hi, We are running successfully a Solr index of 3 million docs. I have just been informed that our index size will increase to 50 million. I have been going through the doc http://wiki.apache.org/solr/DistributedSearch Seems like we will lose the date facet and some other features that we use, which are important to us. So far we have been using 1 index and 1 machine. Can I still stick with my 1 index but have many query servers? We don't update our index very often; this is rather static data. Over the past year we have updated the index data a total of 3 times, about 300 records :) Can someone provide some idea how/what I should do to deal with the new datasets? Thanks for your help.
Re: Partitioning the index
Are there any particular suggestions on memory size for a machine? I have a box that has only 1 million records on it - yet I'm finding that date searches are already unacceptably slow (30 seconds). Other searches seem okay though. Thanks!

On Thu, Dec 18, 2008 at 2:02 PM, Yonik Seeley ysee...@gmail.com wrote: It's more related to how much memory you have on your boxes, how resource intensive your queries are, how many fields you are trying to facet on, what acceptable response times are, etc. Anyway... a single box is normally good for between 5M and 50M docs, but can fall out of that range (both up and down) depending on the specifics. -Yonik On Wed, Dec 17, 2008 at 9:34 PM, s d s.d.sau...@gmail.com wrote: Hi, Is there a recommended index size (on disk, number of documents) for when to start partitioning it to ensure good response time? Thanks, S
Re: Partitioning the index
On Tue, Jan 6, 2009 at 10:06 PM, Jim Adams jasolru...@gmail.com wrote: I'm finding that date searches are already unacceptably slow. I assume this is a date range query (or date faceting)? Range queries with many unique terms in the range are a known limitation, and we should hopefully fix this in 1.4. In the meantime, limiting the precision of dates could help a great deal. -Yonik
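[A hedged sketch of what limiting the precision can mean in practice; the field name is illustrative. Index dates rounded down to the day, so a range spans at most a few hundred unique terms instead of millions, and round the query endpoints the same way (default only covers documents indexed without a value - client-supplied values have to be rounded before indexing):

  <field name="timestamp" type="date" indexed="true" stored="true" default="NOW/DAY"/>

  timestamp:[NOW/DAY-30DAYS TO NOW/DAY]]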
Re: DataImportHandler (reading XML w/ paging)
Paging is possible with XPathEntityProcessor - look at $hasMore and $nextUrl in the documentation. If you can explain better, I may be able to give a better solution, e.g. where is the metadata coming from and what is the datasource? -- --Noble Paul
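[A hedged sketch of the mechanism Noble is pointing at, built around the paging metadata from the original mail; the feed URL, XPaths and script transformer wiring are assumptions, not from this thread. A transformer sets the special fields $hasMore and $nextUrl on a row, and DIH then fetches the next page:

  <dataConfig>
    <dataSource type="HttpDataSource"/>
    <script><![CDATA[
      function nextPage(row) {
        var page = parseInt(row.get('pageNumber'));
        var total = parseInt(row.get('totalPages'));
        if (page < total) {
          row.put('$hasMore', 'true');
          row.put('$nextUrl', 'http://example.com/feed?page=' + (page + 1));
        }
        return row;
      }
    ]]></script>
    <document>
      <entity name="item" processor="XPathEntityProcessor"
              url="http://example.com/feed?page=1"
              forEach="/response/item"
              transformer="script:nextPage">
        <field column="pageNumber" xpath="/response/paging/pageNumber" commonField="true"/>
        <field column="totalPages" xpath="/response/paging/totalPages" commonField="true"/>
      </entity>
    </document>
  </dataConfig>]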
Re: Setting up DataImportHandler for Oracle datasource on JBoss
The root node is not dataconf; it should be dataConfig.

On Wed, Jan 7, 2009 at 4:23 AM, The Flight Captain jason_sheph...@flightcentre.com wrote: If I add the document tag and an entity, I still get the same error when starting up JBoss. ... Am I missing anything else?
-- --Noble Paul
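[For clarity, the corrected data-config.xml from this thread, differing only in the root element - everything else as posted above:

  <dataConfig>
    <dataSource type="JdbcDataSource" driver="oracle.jdbc.OracleDriver"
                url="jdbc:oracle:thin:@host:port:service" user="pctadm" password="pctadm"/>
    <document name="products">
      <entity name="product" query="select prd_id from pct_product">
        <field column="prd_id" name="id"/>
      </entity>
    </document>
  </dataConfig>]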
Re: Partitioning the index
It's a range query. I don't have any faceted data. Can I limit the precision of the existing field, or must I re-index? Thanks. On Tue, Jan 6, 2009 at 8:41 PM, Yonik Seeley ysee...@gmail.com wrote: On Tue, Jan 6, 2009 at 10:06 PM, Jim Adams jasolru...@gmail.com wrote: Are there any particular suggestions on memory size for a machine? I have a box that has only 1 million records on it - yet I'm finding that date searches are already unacceptably slow (30 seconds). Other searches seem okay though. I assume this is a date range query (or date faceting)? Range queries over many unique terms are a known limitation, and we should hopefully fix this in 1.4. In the meantime, limiting the precision of dates could help a great deal. -Yonik
Re: DataImport
Which approach worked? I suggested three: 1. Jetty automatically loads jars in WEB-INF/lib 2. it is the responsibility of Solr to load jars from ${solr.home}/lib 3. it is the responsibility of the JRE to load jars from JAVA_HOME/lib/ext On Tue, Jan 6, 2009 at 6:18 PM, Performance dcr...@crossview.com wrote: Paul, Thanks for the feedback and it does work. So if I understand this, the app server (Jetty) is not reading in the environment variables for the other libraries I need. How do I add the JDBC files to the path so that I don't need to copy the files into the directory? Does Jetty have a config file I should look at? Noble Paul നോബിള് नोब्ळ् wrote: The driver can be put directly into the WEB-INF/lib of the solr web app, or it can be put into the ${solr.home}/lib dir, or if something is really screwed up you can try the old-fashioned way of putting your driver jar into JAVA_HOME/lib/ext --Noble On Tue, Jan 6, 2009 at 7:05 AM, Performance dcr...@crossview.com wrote: I have been following this tutorial but I can't seem to get past an error related to not being able to load the DB2 driver. The user has all the right config to load the JDBC driver and Squirrel works fine. Do I need to update any path within Solr? muxa wrote: Looked through the tutorial on data import, section Full Import Example. 1) Where is this dataimport.jar? There is no such file in the extracted example-solr-home.jar. 2) Use the solr folder inside example-data-config folder as your solr home. What does this mean? Anyway, there is no folder example-data-config. Regards, Mihails -- View this message in context: http://www.nabble.com/DataImport-tp17730791p21301571.html Sent from the Solr - User mailing list archive at Nabble.com. -- --Noble Paul -- View this message in context: http://www.nabble.com/DataImport-tp17730791p21309725.html Sent from the Solr - User mailing list archive at Nabble.com. -- --Noble Paul
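To make the three options concrete, these are the drop locations as filesystem paths (the jar name below is a placeholder for whichever JDBC driver you use):

webapps/solr/WEB-INF/lib/your-jdbc-driver.jar    loaded by the servlet container (Jetty/JBoss)
${solr.home}/lib/your-jdbc-driver.jar            loaded by Solr itself
$JAVA_HOME/lib/ext/your-jdbc-driver.jar          loaded by the JRE extension mechanism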
Re: Partitioning the index
You'll need to re-index. On Wed, Jan 7, 2009 at 9:49 AM, Jim Adams jasolru...@gmail.com wrote: It's a range query. I don't have any faceted data. Can I limit the precision of the existing field, or must I re-index? Thanks. On Tue, Jan 6, 2009 at 8:41 PM, Yonik Seeley ysee...@gmail.com wrote: On Tue, Jan 6, 2009 at 10:06 PM, Jim Adams jasolru...@gmail.com wrote: Are there any particular suggestions on memory size for a machine? I have a box that has only 1 million records on it - yet I'm finding that date searches are already unacceptably slow (30 seconds). Other searches seem okay though. I assume this is a date range query (or date faceting)? Range queries over many unique terms are a known limitation, and we should hopefully fix this in 1.4. In the meantime, limiting the precision of dates could help a great deal. -Yonik -- Regards, Shalin Shekhar Mangar.
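A sketch of what that re-index could look like, assuming a hypothetical timestamp_day field: declare a coarser, day-precision field in schema.xml and populate it with the timestamp truncated to midnight.

<field name="timestamp_day" type="date" indexed="true" stored="false"/>

and in each update document:

<field name="timestamp_day">2009-01-07T00:00:00Z</field>

A range over the coarser field then expands to at most one term per day rather than one per second.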
Re: how large can the index be?
On Wed, Jan 7, 2009 at 8:27 AM, Jim Adams jasolru...@gmail.com wrote: Why is NFS mounting such a bad idea? Some solutions for highly available disks suggest that you DO mount the disks via NFS on the boxes that need the data. Because every index read and write becomes a network request. You can do some benchmarks yourself and, if you find the performance acceptable, go ahead. You should consider a master/slave replicated setup if you want high availability. -- Regards, Shalin Shekhar Mangar.
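If high availability is the goal, one way to set up master/slave is the Java-based replication handler on a 1.4/trunk build; a minimal sketch follows (the hostname and poll interval are made up, and on 1.3 the rsync-based snapshooter/snappuller scripts serve the same purpose).

On the master, in solrconfig.xml:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

On each slave:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>

Each slave keeps a local copy of the index, so searches never cross the network the way they would with an NFS mount.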
Re: Error during indexing.
Photo objects? Is it binary data you are trying to send in an XML request? On Wed, Jan 7, 2009 at 12:57 AM, Tushar_Gandhi tushar_gan...@neovasolutions.com wrote: Hi, I am getting an error whenever I try to index photo objects specifically; other objects index fine. The error is:

SEVERE: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character (NULL, unicode 0) encountered: not valid in any content at [row,col {unknown-source}]: [1,3127]
        at com.ctc.wstx.sr.StreamScanner.constructNullCharException(StreamScanner.java:640)
        at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:669)
        at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:660)
        at com.ctc.wstx.sr.BasicStreamReader.readCDataPrimary(BasicStreamReader.java:4240)
        at com.ctc.wstx.sr.BasicStreamReader.nextFromTreeCommentOrCData(BasicStreamReader.java:3280)
        at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2824)
        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
        at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:321)
        at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:195)
        at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
        at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:199)
        at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:282)
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:754)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:684)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:876)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
        at java.lang.Thread.run(Thread.java:595)

Can anyone help me out? Thanks, Tushar -- View this message in context: http://www.nabble.com/Error-during-indexing.-tp21317294p21317294.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar.
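Binary photo bytes cannot be embedded in Solr's XML update format as-is; index a reference (path or URL) to the photo instead, or make sure every field value is XML-safe text first. A minimal client-side safeguard, sketched in Java (this helper is illustrative, not part of Solr's API): strip characters that XML 1.0 forbids, such as the NULL byte in the error above, before building the update document.

public final class XmlSanitizer {
    // Keep only characters that are legal in XML 1.0 content:
    // #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD. For simplicity this
    // sketch also drops surrogate pairs (characters outside the BMP).
    public static String stripInvalidXmlChars(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c == 0x9 || c == 0xA || c == 0xD
                    || (c >= 0x20 && c <= 0xD7FF)
                    || (c >= 0xE000 && c <= 0xFFFD)) {
                out.append(c);
            }
        }
        return out.toString();
    }
}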
Re: date range query performance
On Wed, Jan 7, 2009 at 7:47 AM, Jim Adams jasolru...@gmail.com wrote: Can someone explain what this means to me? I'm having a similar performance issue - it's an index with only 1 million records or so, but when trying to search on a date range it takes 30 seconds! Yes, this date is one with hours, minutes, and seconds in it -- do I need to create an additional field without the time component and reindex all my documents so I can get decent search performance? Or can I tell Solr Please ignore the time and do something in a reasonable timeframe (GRIN) Range queries are slow if you have a large number of unique terms. With dates it is especially a problem because the more precise they are, the more terms you've got in that field. The easy solution is to round off your dates to the minimum precision acceptable for your use-case. You'll need to re-index. -- Regards, Shalin Shekhar Mangar.
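Concretely, assuming a hypothetical timestamp field: round at index time by sending 2009-01-07T00:00:00Z instead of, say, 2009-01-07T14:23:05Z, and round the query boundaries with Solr's date math so both ends fall on day boundaries:

timestamp:[NOW/DAY-7DAYS TO NOW/DAY]

Rounding the query alone does not reduce the number of terms already indexed inside the range; it is the index-time rounding, and hence the re-index, that actually makes the query fast.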