RE: Indexing from Nutch crawl
Hi Ramires,

I have been using Solr 1.4.1. My understanding from the example solrconfig.xml is that JARs will be loaded from the /lib directory. I do not have a /dist directory, as I copied the example directory to use as my Solr home directory; I have therefore commented out these entries in solrconfig.xml.

Can you elaborate on your comment below, please, as I may be missing your point?

Thank you
Lewis

From: ramires [uy...@beriltech.com]
Sent: 18 April 2011 13:40
To: solr-user@lucene.apache.org
Subject: Re: Indexing from Nutch crawl

This is a problem with these files in the Nutch lib directory. You can easily replace them with the ones in the Solr dist directory.

apache-solr-core-1.4.0.jar
apache-solr-solrj-1.4.0.jar

--
View this message in context: http://lucene.472066.n3.nabble.com/Indexing-from-Nutch-crawl-tp2833862p2834270.html
Sent from the Solr - User mailing list archive at Nabble.com.

Email has been scanned for viruses by Altman Technologies' email management service - www.altman.co.uk/emailsystems

Glasgow Caledonian University is a registered Scottish charity, number SC021474

Winner: Times Higher Education’s Widening Participation Initiative of the Year 2009 and Herald Society’s Education Initiative of the Year 2009.
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html

Winner: Times Higher Education’s Outstanding Support for Early Career Researchers of the Year 2010, GCU as a lead with Universities Scotland partners.
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html
RE: Indexing from Nutch crawl
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-04-18 11:27:11,033 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
2011-04-18 11:27:11,869 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2011-04-18 11:27:11
2011-04-18 11:27:11,870 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8080/wombra/data
2011-04-18 11:27:13,048 INFO solr.SolrClean - SolrClean: starting at 2011-04-18 11:27:13
2011-04-18 11:27:13,888 INFO solr.SolrClean - SolrClean: deleting 5 documents
2011-04-18 11:27:13,992 WARN mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Not Found

Not Found

request: http://localhost:8080/wombra/data/update?wt=javabin&version=1
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.nutch.indexer.solr.SolrClean$SolrDeleter.close(SolrClean.java:115)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:473)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)

From: Markus Jelsma [markus.jel...@openindex.io]
Sent: 18 April 2011 11:59
To: solr-user@lucene.apache.org
Cc: McGibbney, Lewis John
Subject: Re: Indexing from Nutch crawl

Can you include hadoop.log output? Likely the other commands fail as well but don't write the exception to stdout.
Indexing from Nutch crawl
Hi list,

I am using the Nutch-1.3 branch, which I checked out today, to crawl a couple of URLs in local mode. I have been using Solr 1.4.1 within my web app, but I am running into some problems during the indexing stages. I have three commands being sent to Solr:

echo "- SolrIndex (Step 4 of $steps) -"
$NUTCH_HOME/bin/nutch solrindex http://localhost:8080/wombra/data crawl/crawldb crawl/linkdb crawl/segments/*
echo "- SolrDedup (Step 5 of $steps) -"
$NUTCH_HOME/bin/nutch solrdedup http://localhost:8080/wombra/data
echo "- SolrClean (Step 6 of $steps) -"
$NUTCH_HOME/bin/nutch solrclean crawl/crawldb http://localhost:8080/wombra/data

The solrindex command is failing with SolrException: Not Found. solrdedup appears to be working fine, and the same could be said for solrclean.

I have been monitoring threads on the Nutch list, but thought I would have a crack at the Solr list for any suggestions on how I can solve the errors I am seeing in my log output.

Thank you
Lewis

Here is my hadoop.log output:

2011-04-18 11:27:05,480 INFO solr.SolrIndexer - SolrIndexer: starting at 2011-04-18 11:27:05
2011-04-18 11:27:05,562 INFO indexer.IndexerMapReduce - IndexerMapReduce: crawldb: crawl/crawldb
2011-04-18 11:27:05,562 INFO indexer.IndexerMapReduce - IndexerMapReduce: linkdb: crawl/linkdb
2011-04-18 11:27:05,562 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20110418111549
2011-04-18 11:27:05,656 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20110418111603
... some more ...
2011-04-18 11:27:09,966 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2011-04-18 11:27:09,966 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: content dest: content
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: site dest: site
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: title dest: title
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: host dest: host
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: segment dest: segment
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: boost dest: boost
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: digest dest: digest
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: url dest: id
2011-04-18 11:27:10,021 INFO solr.SolrMappingReader - source: url dest: url
2011-04-18 11:27:10,394 WARN mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Not Found

Not Found

request: http://localhost:8080/wombra/data/update?wt=javabin&version=1
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
    at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:75)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-04-18 11:27:11,033 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
2011-04-18 11:27:11,869 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2011-04-18 11:27:11
2011-04-18 11:27:11,870 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8080/wombra/data
2011-04-18 11:27:13,048 INFO solr.SolrClean - SolrClean: starting at 2011-04-18 11:27:13
2011-04-18 11:27:13,888 INFO solr.SolrClean - SolrClean: deleting 5 documents
2011-04-18 11:27:13,992 WARN mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Not Found

Not Found

request: http://localhost:8080/wombra/data/update?wt=javabin&version=1
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.nutch.indexer.solr.SolrClean$SolrDeleter.close(SolrClean.java:115)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:473)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJob
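The failing request in the log is the base URL given to solrindex with /update appended. A minimal sketch, assuming the client simply appends handler paths to that base URL, lists the URLs worth probing by hand. The function name solrEndpoints and the second base URL are illustrative assumptions; the facet request log elsewhere in this archive reports webapp=/wombra path=/select, which is the only reason the shorter base is included for comparison.

```javascript
// Build the URLs a Solr client would request for a given base URL.
// Assumption: handler paths such as "/update" are appended to the base URL,
// as the "request:" line in the log above suggests.
function solrEndpoints(baseUrl) {
  const base = baseUrl.replace(/\/+$/, ""); // drop any trailing slashes
  return {
    update: base + "/update",
    select: base + "/select?q=*:*",
  };
}

// The base URL used in the thread, and a hypothetical alternative to compare.
const candidates = [
  "http://localhost:8080/wombra/data",
  "http://localhost:8080/wombra",
];

for (const base of candidates) {
  console.log(solrEndpoints(base).update);
}
```

Requesting each printed URL in a browser or with an HTTP client shows which base, if either, the servlet container actually serves.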
RE: Implementing Facets
Hi Ahmet,

Yes, this is the case. I have changed it to reflect your suggestion; thank you for this. After reloading the app I still get the error. Here is the full stack trace from catalina.out:

INFO: [] Registered new searcher Searcher@8af0b0 main
21-Mar-2011 20:28:53 org.apache.solr.common.SolrException log
SEVERE: Exception during facet counts:org.apache.solr.common.SolrException: undefined field topics
    at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1077)
    at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:226)
    at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283)
    at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166)
    at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:164)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:498)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:562)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:394)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:243)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:188)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:302)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
21-Mar-2011 20:28:53 org.apache.solr.core.SolrCore execute
INFO: [] webapp=/wombra path=/select params={json.wrf=jsonp1300739332983&facet.date.start=1987-02-26T00:00:00.000Z/DAY&facet=true&facet.mincount=1&facet.limit=20&facet.date=date&f.topics.facet.limit=50&json.nl=map&wt=json&q=*:*&_=1300739333613&facet.field=topics&facet.field=organisations&facet.field=exchanges&facet.field=countryCodes&facet.date.gap=%2B1DAY&f.countryCodes.facet.limit=-1&facet.date.end=1987-10-20T00:00:00.000Z/DAY%2B1DAY} hits=21 status=0 QTime=60

From: Ahmet Arslan [iori...@yahoo.com]
Sent: 21 March 2011 20:25
To: solr-user@lucene.apache.org
Subject: Re: Implementing Facets

Could it be missing dot in -1? -1?
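The "undefined field topics" exception is raised when a request names a field that the schema does not define. A minimal sketch of that comparison follows. The function name missingFacetFields is hypothetical, and the schema field list is an assumption: it reuses the destination fields from the Nutch mapping log in the "Indexing from Nutch crawl" thread, which may not match the schema actually deployed here.

```javascript
// Report which requested facet fields are absent from the schema.
function missingFacetFields(requested, schemaFields) {
  const known = new Set(schemaFields);
  return requested.filter((name) => !known.has(name));
}

// facet.field values taken from the request logged above.
const requestedFields = ["topics", "organisations", "exchanges", "countryCodes"];

// Assumption: a schema that only defines the Nutch-mapped fields.
const nutchSchemaFields = [
  "content", "site", "title", "host", "segment",
  "boost", "digest", "tstamp", "id", "url",
];

console.log(missingFacetFields(requestedFields, nutchSchemaFields));
```

Under that assumption all four facet fields are reported as missing, so either the schema needs matching field definitions or the facet configuration needs to name fields that exist.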
Implementing Facets
Hi list,

I am working with an Ajax-Solr GUI, but I am getting the following error from Firebug when launching the web app on Tomcat 7.0.11. The web app uses Solr version 1.4.1.

> HTTP Status 400 - undefined field links
> type: Status report
> message: undefined field links
> description: The request sent by the client was syntactically incorrect (undefined field links).

The facet details, as configured in Ajax-Solr, are as follows:

'facet.field': [ 'topics', 'organisations', 'exchanges', 'countryCodes' ],
'facet.limit': 20,
'facet.mincount': 1,
'f.topics.facet.limit': 50,
'f.countryCodes.facet.limit': -1,
'facet.date': 'date',
'facet.date.start': '1987-02-26T00:00:00.000Z/DAY',
'facet.date.end': '1987-10-20T00:00:00.000Z/DAY+1DAY',
'facet.date.gap': '+1DAY',
'json.nl': 'map'

I tried configuring the above by adding the following snippet to the dismax requestHandler in solrconfig.xml:

name regex topics organisations exchanges countryCodes 20 1 50 -1 date 2000-01-01T00:00:00.000Z/DAY 2011-03-21T00:00:00.000Z/DAY+1DAY +1DAY

But I am still getting the error. I am not clear about how and where to configure the facet details. Can anyone suggest how I can properly implement the facets that I want?

Thank you kindly
Lewis
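To see what the facet settings above look like on the wire, here is a minimal sketch that serialises such a parameter map into a Solr query string. It is not Ajax-Solr's own code; the function name toQueryString is hypothetical, and the facet=true entry is an assumption added because facet parameters are ignored unless faceting is switched on.

```javascript
// Serialise a parameter map into a query string.
// Array values become repeated parameters (facet.field=a&facet.field=b).
function toQueryString(params) {
  const search = new URLSearchParams();
  for (const [name, value] of Object.entries(params)) {
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) {
      search.append(name, String(v));
    }
  }
  return search.toString();
}

const facetParams = {
  "facet": true, // assumption: not listed in the message above
  "facet.field": ["topics", "organisations", "exchanges", "countryCodes"],
  "facet.limit": 20,
  "facet.mincount": 1,
  "f.topics.facet.limit": 50,
  "f.countryCodes.facet.limit": -1,
};

console.log(toQueryString(facetParams));
```

Each name under facet.field must match a field defined in schema.xml; the per-field overrides follow the pattern f.<fieldname>.facet.limit.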
RE: Using Solr 1.4.1 on most recent Tomcat 7.0.11
Hi François,

Thank you for your reply. I had made the simple mistake of including comments before '', and was therefore getting a SAX error. As you have correctly pointed out, it is not essential to include the snippet above in the context file (if using one); however, it might be useful to know that Tomcat 7 now validates XML files by default. In time I will get round to editing the wiki accordingly, to help others avoid this in future.

Thanks for looking into this.
Lewis
___
From: François Schiettecatte [fschietteca...@gmail.com]
Sent: 17 March 2011 13:47
To: solr-user@lucene.apache.org
Subject: Re: Using Solr 1.4.1 on most recent Tomcat 7.0.11

Lewis

My update from Tomcat 7.0.8 to 7.0.11 went with no hitches. I checked my context file and it does not have the XML preamble yours has, specifically: '',

Here is my context file:

---

Hope this helps.

Cheers
François
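The SAX error in this thread ("The processing instruction target matching "[xX][mM][lL]" is not allowed") is what a parser reports when an XML declaration appears anywhere other than the very start of the file, for example after a comment. A minimal sketch of a check for that condition follows; the function name and the sample descriptors are hypothetical, not the files from this thread.

```javascript
// Return true when an XML declaration is present but is not the very first
// thing in the text (a comment, blank line or space before it is enough).
function hasMisplacedXmlDeclaration(text) {
  return text.search(/<\?xml[\s?]/i) > 0;
}

// Hypothetical context descriptors for illustration only.
const commentFirst =
  '<!-- my notes -->\n<?xml version="1.0" encoding="utf-8"?>\n<Context/>';
const declarationFirst =
  '<?xml version="1.0" encoding="utf-8"?>\n<!-- my notes -->\n<Context/>';

console.log(hasMisplacedXmlDeclaration(commentFirst));     // true
console.log(hasMisplacedXmlDeclaration(declarationFirst)); // false
```

Moving the comments below the declaration, or dropping the declaration altogether as in François's file, avoids the error.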
RE: hierarchical faceting, SOLR-792 - confused on config
Hi Erik,

I have been reading about the progression of SOLR-792 into pivot faceting. Can you say where it is committed? Are you referring to trunk? The reason I ask is that I have been using 1.4.1 for some time now and have been thinking of upgrading to trunk... or branch.

Thank you
Lewis

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 16 March 2011 17:36
To: solr-user@lucene.apache.org
Subject: Re: hierarchical faceting, SOLR-792 - confused on config

Sorry, I missed the original mail on this thread.

I put together that hierarchical faceting wiki page a couple of years ago when helping a customer evaluate SOLR-64 vs. SOLR-792 vs. other approaches. Since then, SOLR-792 morphed and is committed as pivot faceting. SOLR-64 spawned a PathTokenizer which is part of Solr now too.

Recently Toke updated that page with some additional info. It's definitely not a "how to" page, and perhaps should get renamed/moved/revamped? Toke?

Erik
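One common way to model path-based hierarchical facets is to index every ancestor prefix of a category path as its own token, which is the general idea behind a path tokenizer. The sketch below only illustrates the shape of those tokens; it is not Solr's tokenizer code, and the function name and category path are hypothetical.

```javascript
// Expand a delimited path into its ancestor prefixes, one token per level.
function pathPrefixes(path, delimiter = "/") {
  const parts = path.split(delimiter).filter((part) => part.length > 0);
  return parts.map((_, i) => parts.slice(0, i + 1).join(delimiter));
}

// Hypothetical category path.
console.log(pathPrefixes("Topics/Legislation/Housing"));
```

Faceting on a field that holds such tokens yields a count for every level of the tree, and filtering on one prefix narrows the results to that branch.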
Using Solr 1.4.1 on most recent Tomcat 7.0.11
Hello list,

Is anyone running Solr (in my case 1.4.1) on the above Tomcat dist? In the past I have been following the guidance at http://wiki.apache.org/solr/SolrTomcat#Installing_Solr_instances_under_Tomcat but, having upgraded from Tomcat 7.0.8 to 7.0.11, I am having problems. E.g.

INFO: Deploying configuration descriptor wombra.xml < This is my context fragment from /home/lewis/Downloads/apache-tomcat-7.0.11/conf/Catalina/localhost
16-Mar-2011 16:57:36 org.apache.tomcat.util.digester.Digester fatalError
SEVERE: Parse Fatal Error at line 4 column 6: The processing instruction target matching "[xX][mM][lL]" is not allowed.
org.xml.sax.SAXParseException: The processing instruction target matching "[xX][mM][lL]" is not allowed.
...
16-Mar-2011 16:57:36 org.apache.catalina.startup.HostConfig deployDescriptor
SEVERE: Error deploying configuration descriptor wombra.xml
org.xml.sax.SAXParseException: The processing instruction target matching "[xX][mM][lL]" is not allowed.
... some more ...

My configuration descriptor is as follows

Preferably I would upload a WAR file, but the configuration I have been using has worked well up until now, so I had no reason to question it. I am unfamiliar with the above errors. Can anyone please point me in the right direction?

Thank you
Lewis
RE: hierarchical faceting, SOLR-792 - confused on config
Hi,

This is also where I am having problems. I have not been able to understand very much of the wiki, and I do not understand how to configure the faceting we are referring to. Although I know very little about this, I can't help but think that the wiki is inaccurate by some way!

Any comments, please?
Lewis

From: kmf [kfole...@gmail.com]
Sent: 23 February 2011 17:10
To: solr-user@lucene.apache.org
Subject: Re: hierarchical faceting, SOLR-792 - confused on config

I'm really confused now. Is this page completely out of date - http://wiki.apache.org/solr/HierarchicalFaceting - as it seems to imply that solr-792 is a form of hierarchical faceting. "There are currently two similar, non-competing, approaches to generating tree/hierarchical facets from Solr: SOLR-64 and SOLR-792"

To achieve hierarchical faceting, is the rule then that you form the hierarchical facets using a transformer in the DIH and do nothing in schema.xml or solrconfig.xml? I seem to recall reading somewhere that creating a copyField is needed. Sorry for the entry level question but, I'm still trying to understand how to configure solr to do hierarchical faceting.

Thanks,
kmf

--
View this message in context: http://lucene.472066.n3.nabble.com/hierarchical-faceting-SOLR-792-confused-on-config-tp2556394p2561445.html
RE: Faceting help
Hi Upayavira,

I use the term constraint to mean the additional options a user can refine a search with under each facet. If we think of them as sub-facets, then maybe that explains it in slightly better terms. I didn't add additional document source types in my original email, but if I knew that there would be xls and doc contained within the Solr index then these would also be added as sub-facets, allowing a user to select them prior to entering a search query.

Can you point me towards documentation or something similar in order to implement the above? I am aware that I have a lot more to learn about faceted search, namely how to properly implement it!

Thank you
Lewis

From: Upayavira [u...@odoko.co.uk]
Sent: 15 March 2011 22:42
To: solr-user@lucene.apache.org
Subject: Re: Faceting help

I'm not sure if I get what you are trying to achieve. What do you mean by "constraint"?

Are you saying that you effectively want to filter the facets that are returned? e.g. for source field, you want to show html/pdf/email, but not, say xls or doc?

Upayavira

> Topics < field
> Legislation < constraint
> Guidance/Policies < constraint
> Customer Service information/complaints procedure < constraint
> financial information < constraint
> etc etc
>
> Source < field
> html < constraint < constraint
> pdf < constraint
> email < constraint
> etc etc
>
> Date < field
> < constraint
>
> Basically I need resources to understand how to implement the above
> instead of the example I currently have.
> Some guidance would be great
> Thank you kindly
>
> Lewis
>
---
Enterprise Search Consultant at Sourcesense UK,
Making Sense of Open Source
Faceting help
Hello list,

I'm trying to use facets via widgets within Ajax-Solr. I have tried the wiki for general help on configuring facets and constraints, and also attended the recent Lucidworks webinar on faceted search. Can anyone please direct me to some reading on how to formally configure facets for searching?

Currently my facets are configured as follows:

'facet.field': [ 'topics', 'organisations', 'exchanges', 'countryCodes' ],
'facet.limit': 20,
'facet.mincount': 1,
'f.topics.facet.limit': 50,
'f.countryCodes.facet.limit': -1,
'facet.date': 'date',
'facet.date.start': '1987-02-26T00:00:00.000Z/DAY',
'facet.date.end': '1987-10-20T00:00:00.000Z/DAY+1DAY',
'facet.date.gap': '+1DAY',
'json.nl': 'map'

However, I wish to change the fields to contain some constraints such as:

Topics < field
Legislation < constraint
Guidance/Policies < constraint
Customer Service information/complaints procedure < constraint
financial information < constraint
etc etc

Source < field
html < constraint < constraint
pdf < constraint
email < constraint
etc etc

Date < field
< constraint

Basically I need resources to understand how to implement the above instead of the example I currently have. Some guidance would be great.

Thank you kindly
Lewis
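In Solr's terms, the "constraints" described above are the distinct values of a facet field, returned with their counts, and selecting one of them amounts to adding a filter query (fq) on that field. A minimal sketch under those assumptions follows. The function names and the response fragment are hypothetical; the response shape assumes wt=json with json.nl=map, as in the configuration above.

```javascript
// Turn one facet field from a Solr JSON response (json.nl=map) into a list
// of constraints, highest count first.
function constraintsFor(response, field) {
  const fields = response.facet_counts?.facet_fields ?? {};
  const counts = fields[field] ?? {};
  return Object.entries(counts)
    .map(([value, count]) => ({ value, count }))
    .sort((a, b) => b.count - a.count);
}

// Build the fq value that restricts results to one constraint.
function filterQuery(field, value) {
  const escaped = value.replace(/(["\\])/g, "\\$1"); // escape quotes and backslashes
  return `${field}:"${escaped}"`;
}

// Hypothetical response fragment for a "source" field.
const sampleResponse = {
  facet_counts: {
    facet_fields: {
      source: { pdf: 7, html: 12, email: 3 },
    },
  },
};

console.log(constraintsFor(sampleResponse, "source"));
console.log(filterQuery("source", "pdf"));
```

The constraint list does not need to be configured in advance: whatever values are indexed in the field (for example xls or doc, once such documents exist) appear in the facet counts automatically.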
RE: Text field not defined in Solr Schema?
Thank you Markus,

I am wondering if anyone can comment on the latter question I posted, regarding TextField or StrField with compression options. I understand the mechanics of adding compressThreshold to the field type definition (1st part of my schema) and adding individual options to the individual field definitions (2nd part of my schema); my question is about what real benefits can be gained in a 'small/medium' Solr use case.

Thank you
Lewis

From: Markus Jelsma [markus.jel...@openindex.io]
Sent: 26 February 2011 13:42
To: solr-user@lucene.apache.org
Cc: McGibbney, Lewis John
Subject: Re: Text field not defined in Solr Schema?

Yes, you need to add the field text of type Text or use content instead of text.
Text field not defined in Solr Schema?
Hello list,

I have recently been working on some JS (Ajax-Solr), and when using Firebug I am alerted to an error within the JS file, as below. It immediately breaks on line 12, stating that 'doc.text' is undefined! Here is the code snippet.

10 AjaxSolr.theme.prototype.snippet = function (doc) {
11   var output = '';
12   if (doc.text.length > 300) {
13     output += doc.dateline + ' ' + doc.text.substring(0, 300);
14     output += '' + doc.text.substring(300);
15     output += ' more';
16   }
17   else {
18     output += doc.dateline + ' ' + doc.text;
19   }
20   return output;
21 };

I have been advised that the problem might stem from my schema not defining a text field; however, as my implementation of Solr is currently geared to index docs from a Nutch web crawl, I am using the Nutch schema. A snippet of the schema is below

... ... ...

Can someone confirm whether I need to add something similar to the following

...

and then perform a fresh crawl and reindex, so that the schema field is recognised by the JS snippet?

Also (I apologise), from my reading on the Solr schema I became intrigued by the options for TextField, namely compressed and compressThreshold. I understand that they are used hand in glove; however, can anyone please explain what benefits compression adds, and what integer value would be appropriate for the latter option?

Any help would be great.
Thank you
Lewis
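The error on line 12 arises because the returned documents carry no text field, so doc.text is undefined and reading .length throws. A minimal sketch of a guarded version follows, assuming the body may instead be available in a content field, as in the Nutch mapping. It is a plain-text variant with the markup omitted; the helper names firstText and snippetText are hypothetical and not part of Ajax-Solr.

```javascript
// Return the first named field that holds a non-empty string, or "".
function firstText(doc, fieldNames) {
  for (const name of fieldNames) {
    const value = doc[name];
    if (typeof value === "string" && value.length > 0) {
      return value;
    }
  }
  return "";
}

// Plain-text variant of the snippet logic: prefer doc.text, fall back to
// doc.content, and truncate long bodies at `limit` characters.
function snippetText(doc, limit = 300) {
  const body = firstText(doc, ["text", "content"]);
  const prefix = doc.dateline ? doc.dateline + " " : "";
  if (body.length > limit) {
    return prefix + body.substring(0, limit) + "...";
  }
  return prefix + body;
}

// A document shaped like a Nutch-indexed result (hypothetical values).
console.log(snippetText({ content: "Indexed by Nutch" }));
```

The guard keeps the widget from throwing on documents that lack either field; whether to add a text field to the schema or to read content instead remains a separate decision.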
Solr Ajax
Hello list,

I'm in the process of trying to implement Ajax within my Solr-backed webapp. I have been reading the Solrj wiki, the tutorial provided via the Google group, and various info from the wiki page https://github.com/evolvingweb/ajax-solr/wiki

I have all the Solrj JAR libraries available in my webapp /lib, but I am unsure what steps to take to configure the Solrj client. What do I need to configure to begin working with Solrj? I am unsure where to go, and finding information on the wiki seems to be a non-trivial task.

Any help would be great.
Thanks
Lewis
RE: Errors when implementing VelocityResponseWriter
Managed to get this working. I swapped my solrconfig.xml for the one provided in the velocity directory, repackaged the WAR file and redeployed it on Tomcat. Although this seems like a ridiculously obvious thing to do, I had somehow overlooked the repackaging step, and that was where the problem was.

Thanks for the help, Erik

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 16 February 2011 08:06
To: solr-user@lucene.apache.org
Subject: Re: Errors when implementing VelocityResponseWriter

Well, you need to specify a path, relative or absolute, that points to the directory where the Velocity JAR file resides. I'm not sure, at this point, exactly what you're missing. But it should be fairly straightforward. Solr startup logs the libraries it loads, so maybe that is helpful info.

1.4.1 - does it support <lib>? (I'm not sure off the top of my head)

Erik
RE: Errors when implementing VelocityResponseWriter
To add to this (which, stupidly, I have not mentioned previously): I am using Tomcat 7.0.8 as my servlet container. I have a sneaking suspicion that this is what is causing the problem but, as per below, I am unsure of a solution.
RE: Errors when implementing VelocityResponseWriter
Hi Erik, thank you for the reply.

I have placed all velocity jar files in my /lib directory. As explained below, I have added the relevant configuration to solrconfig.xml; I am just wondering if the config instructions in the wiki are missing something? Can anyone advise on this?

As you mentioned, my terminal output suggests that the VelocityResponseWriter class is not present and therefore that the velocity jar is not present... however this is not the case.

I have specified a <lib> element in solrconfig.xml. Is this enough, or do I need to use an exact path? I have already tried specifying an exact path and it does not seem to work either.

Thank you
Lewis

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 15 February 2011 06:48
To: solr-user@lucene.apache.org
Subject: Re: Errors when implementing VelocityResponseWriter

Looks like you're missing the Velocity JAR. It needs to be in some Solr visible lib directory. With 1.4.1 you'll need to put it in /lib. In later versions, you can use the <lib> elements in solrconfig.xml to point to other directories.

Erik
Errors when implementing VelocityResponseWriter
Hello List,

I am currently trying to implement the above in Solr 1.4.1. Having moved the velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my webapp /lib directory, and then added queryResponseWriter name="blah" and class="blah" followed by the responseHandler specifics, I am shown the following terminal output. I also added a <lib> element in solrconfig.xml. Can anyone suggest what I have not included in the config that is still required?

Thanks
Lewis

SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.response.VelocityResponseWriter'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
    at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:547)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
    at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
    at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:98)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.solr.response.VelocityResponseWriter
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
    ... 21 more

Glasgow Caledonian University is a registered Scottish charity, number SC021474

Winner: Times Higher Education’s Widening Participation Initiative of the Year 2009 and Herald Society’s Education Initiative of the Year 2009.
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html

Winner: Times Higher Education’s Outstanding Support for Early Career Researchers of the Year 2010, GCU as a lead with Universities Scotland partners.
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html
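The ClassNotFoundException in the trace above says only that no jar visible to Solr's class loader supplied a class with that exact package and name. One way to check this by hand is to look inside the jars in the lib directory. The following is a minimal sketch under stated assumptions: FindClassInJars, entryName and jarsContaining are hypothetical names invented for illustration, not part of Solr, and the directory to scan is whatever lib directory your deployment actually uses.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper (not part of Solr): lists the jars in a directory
// that contain a given class.
public final class FindClassInJars {

    /** Maps a fully qualified class name to the entry path it has inside a jar. */
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns the names of the jars directly inside dir that contain the class. */
    static List<String> jarsContaining(File dir, String className) throws IOException {
        List<String> hits = new ArrayList<String>();
        File[] files = dir.listFiles();
        if (files == null) {
            return hits; // dir does not exist or is not a directory
        }
        String entry = entryName(className);
        for (File f : files) {
            if (!f.isFile() || !f.getName().endsWith(".jar")) {
                continue;
            }
            JarFile jar = new JarFile(f);
            try {
                if (jar.getEntry(entry) != null) {
                    hits.add(f.getName());
                }
            } finally {
                jar.close();
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: FindClassInJars <lib-dir> <class-name>");
            return;
        }
        System.out.println(jarsContaining(new File(args[0]), args[1]));
    }
}
```

Run it against the lib directory with the class name copied from the error message. An empty list would mean that none of the jars there contains the class under that name, which is a different problem from the jars not being loaded at all.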
Fatal error when posting to Solr
Hi list,

I was attempting to check out the VelocityResponseWriter before I progress with customising it for my own usage, and I seem to have opened a can of worms when posting documents to Solr. Using the simple post command I get the following output.

lewis@lewis-01:~/Downloads/apache-solr-1.4.1/example/exampledocs$ java -jar post.jar *.pdf
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file technical_handbook_2010_domestic_section_0_general.pdf
SimplePostTool: FATAL: Solr returned an error: Unexpected_character__code_37_in_prolog_expected___at_rowcol_unknownsource_11

In some projects (e.g. Nutch) I am aware that the distribution does not come with all the JARs and that these have to be downloaded separately; I know this is not the case with Solr though. I have also successfully committed a host of .pdf files to Solr recently, so I know that this is working fine. Checking my Solr logs, nothing seems to be out of place!

Has anyone seen anything similar?

Thanks
Lewis
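A possible reading of the error above, not confirmed anywhere in this thread: character code 37 is '%', and PDF files begin with the bytes "%PDF", so the message is consistent with a raw PDF reaching an XML parser that expected '<' as the first character. The sketch below is a minimal illustration of that check. LeadingByteCheck and classify are hypothetical names, not part of Solr or SimplePostTool.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: reports whether a file looks like XML or like a PDF,
// judging only by its first few bytes.
public final class LeadingByteCheck {

    /** Classifies the leading bytes of a file as "pdf", "xml-like" or "unknown". */
    static String classify(byte[] head) {
        // Leading whitespace is allowed before an XML prolog, so trim it first.
        String start = new String(head, StandardCharsets.US_ASCII).trim();
        if (start.startsWith("%PDF")) {
            return "pdf"; // '%' is character code 37
        }
        if (start.startsWith("<")) {
            return "xml-like";
        }
        return "unknown";
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            byte[] buf = new byte[8];
            int n;
            try (InputStream in = Files.newInputStream(Paths.get(name))) {
                n = Math.max(0, in.read(buf)); // read() returns -1 for an empty file
            }
            System.out.println(name + ": " + classify(Arrays.copyOf(buf, n)));
        }
    }
}
```

If the files being posted classify as "pdf", that would suggest they are being sent to a handler that parses its input as XML rather than to one that extracts text from rich documents.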
RE: Alternative to Solrj
Hi Erik,

This sounds much more like it. I have had a look at the wiki and it sounds like a logical approach to UI customisation. Thank you for this.

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 11 February 2011 14:12
To: solr-user@lucene.apache.org
Subject: Re: Alternative to Solrj

Sounds like you just described the VelocityResponseWriter. On trunk (or 3.x I believe), try out http://localhost:8983/solr/browse and look at what makes that tick.

Erik
FW: Alternative to Solrj
Only a thought, but maybe a better route to a solution would be for someone to explain the process Solr goes through after a query has been passed, and how the results are displayed. Hopefully this would enable me to think more laterally about how I might implement my ideas. My intention is to implement a search box on every results page, which would let users search again from within the results if required. Again, any help would be appreciated.
Alternative to Solrj
Hi list,

I have been looking at an alternative UI config for displaying the results retrieved from Solr after a query has been passed. At this point I am not interested in Solrj, as all I wish to change is the default responseWriter (line 1007 of solrconfig.xml). I've also noticed a snippet of default CSS code included in /conf/xslt/example.xsl, and understand that all response writers are located in $SOLR_HOME/src/java/org/apache/solr/request and that the default is XSLTResponseWriter.java.

Basically I wish to keep the code for the search UI as simple as possible (ideally write a simple JSP and CSS); however, I now find that this configuration is proving slightly more confusing in practice. My thinking is as follows: write my own responseWriter, include my CSS template within it, then specify the responseWriter in solrconfig.xml along with the Java class. Can anyone advise me on this from their own experience?

Thank you
Lewis
RequestHandler code within 1.4.0 dist
Hello list,

I have been searching through the 1.4.0 source for a standard requestHandler plug-in example. I understand that, for my purposes, extending RequestHandlerBase is a starting point; however, I was wondering if there are any examples of plug-ins which I can view, such as those contained within /contrib. My experience of using plug-ins so far relates to those contained within the /contrib folder in Solr, or the /plugins folder in Nutch, but the structure does not seem to be the same in Solr. Can anyone please help?

Thank you
Lewis
RE: DataImportHandler usage with RDF database
Hi Otis... thanks for your thoughts.

> I don't think DIH can read from a triple store today. It can read from a
> RDBMS, RSS/Atom feeds, URLs, mail servers, maybe others...
> Maybe what you should be looking at is the ManifoldCF instead, although I
> don't think it can fetch data from triple stores today either.

OK, well a way I can work around this (for the time being) is to pull data from URLs instead.

>> without sending an index commit to Solr. As far as I can see
>> DataImportHandler currently supports full and delta imports which mean I
>> would be indexing.
>
> I don't follow what you mean by this and how it relates to the first part.

Well, as you mentioned below, I'm talking about a custom SearchComponent that reads some data from somewhere (a URL for the time being) and then uses it at search time for something. I have no need to index this data; I merely require it at search time.

>> So far I have yet to find a requestHandler which is able to read then store
>> data in memory, then use this data elsewhere prior to returning documents
>> via queryResponseWriter.
>
> I think you are talking about a custom SearchComponent that reads some data
> from somewhere (e.g. your triple store) and then uses it at search time for
> something. This sounds doable, although you didn't provide details. For
> example, we (Sematext) have implemented custom SearchComponents for
> e-commerce customers where frequently-changing information about product
> availability was fetched from external stores and applied to search results.

I have web-based files, and the idea is to specify the URLs to the SearchComponent, which can then use the data within them at search time. Did your plug-in adhere to the general requestHandler design? Can you provide any resource from which I can get started with this?

Thank you
Lewis
DataImportHandler usage with RDF database
Hello List,

I am very interested in DataImportHandler. I have data stored in an RDF db and wish to use this data to boost query results via Solr. I wish to keep this data stored in the db, as I have a web app which directly maintains this db. Is it possible to use a DataImportHandler to read RDF data from the db into memory, without sending an index commit to Solr? As far as I can see, DataImportHandler currently supports full and delta imports, which means I would be indexing.

So far I have yet to find a requestHandler which is able to read then store data in memory, then use this data elsewhere prior to returning documents via queryResponseWriter.

Can anyone provide their thoughts/insight?

Thank you
Lewis
RE: value for maxFieldLength
Thank you Erick

Lewis

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 03 February 2011 13:25
To: solr-user@lucene.apache.org
Subject: Re: value for maxFieldLength

This is not really very large; Solr should handle this easily (assuming you've given it enough memory), so I'd go with a large number, say 20M. If you start running out of memory, then you've probably given the JVM too little memory. But Solr should handle this without a burp.

Best
Erick
value for maxFieldLength
Hello list,

I am aware that setting the value of maxFieldLength in solrconfig.xml too high may/will result in out-of-memory errors. I wish to provide content extraction on a number of PDF documents which are large; by large I mean 8-11MB (occasionally more), and I am also not sure how many terms reside in each field when it is indexed. My question is therefore: what is a sensible number to set this value to in order to include the majority/all terms within documents of this size?

Thank you
Lewis
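Because maxFieldLength limits the number of tokens indexed per field rather than the number of bytes, the size of a PDF on disk says little about a suitable value. A rough way to get a feel for the numbers is to count whitespace-separated tokens in the text extracted from a representative document. This is a minimal sketch under stated assumptions: TokenEstimate, roughTokenCount and wouldTruncate are hypothetical names, and a real analyzer will tokenise differently from a plain whitespace split, so treat the count as an estimate only.

```java
// Hypothetical helper: estimates how many tokens a piece of extracted text
// holds, for comparison with a candidate maxFieldLength value.
public final class TokenEstimate {

    /** Counts whitespace-separated tokens; an approximation of analyzer output. */
    static int roughTokenCount(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** True if the estimated token count exceeds the given limit. */
    static boolean wouldTruncate(String text, int maxFieldLength) {
        return roughTokenCount(text) > maxFieldLength;
    }

    public static void main(String[] args) {
        String sample = "Section 0 of the technical handbook covers general matters";
        int limit = 5; // deliberately tiny limit, for illustration only
        System.out.println("estimated tokens: " + roughTokenCount(sample));
        System.out.println("exceeds limit of " + limit + ": " + wouldTruncate(sample, limit));
    }
}
```

If the estimate for the largest documents sits well under the chosen value, the setting should not be what drops terms from the index.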
Next steps in loading plug-in
Hi list,

Having had a thorough look at the wiki over the weekend and done some testing myself, I have some additional questions regarding loading my plug-in into Solr. Taking the 'Old Way' of loading plug-ins, I have JARred up the relevant classes and added the JAR to the web app's WEB-INF/lib dir. I am unsure of the next steps to take, as my plug-in has extension properties (which specify web-based OWL files which I wish to use whenever the plug-in is invoked).

My main question is where I would include these config properties. My initial thoughts are that they would be included within WEB-INF/web.xml, but I am unsure as to how to include them. I have had a good look at web.xml and think that they could be included as 's but this is solely due to my lack of knowledge in this situation.

Thank you
Lewis
Adding plug-in to Solr
Hello list,

I am attempting to port a plug-in to my Solr implementation and would like to discuss best practice for doing so. The plug-in relates specifically to the query submitted through Solr; the idea is to provide some sort of query 'refinement' mechanism relating to a specific domain. Some information on a similar type of plug-in can be found here: http://wiki.apache.org/nutch/OntologyPlugin

My question really relates to which config files I need to consult when adding plug-ins to Solr, and I would like to ask about users' experience with this type of experiment. Any comments would be great.

Lewis
RE: unknown field 'name'
I took a look at schema.xml and you are right, the field names did not match up correctly. It was my own fault. Thank you for your time, sivaprasad and Markus.

From: sivaprasad [sivaprasa...@echidnainc.com]
Sent: 24 November 2010 03:58
To: solr-user@lucene.apache.org
Subject: RE: unknown field 'name'

The field names in the XML and in schema.xml should match.

-----Original Message-----
From: "McGibbney, Lewis John [via Lucene]"
Sent: Tuesday, November 23, 2010 4:01pm
To: "sivaprasad"
Subject: unknown field 'name'

Good Evening List,

I have been working with Nutch and, due to numerous integration advantages, I decided to get to grips with the Solr code base.

Solr dist - 1.4.1
java version 1.6.0_22
Windows Vista Home Premium
Command Prompt to execute commands

I encountered the following problem very early on during the indexing stage, and even though I asked this question (through the wrong list :0|) I have been unable to resolve what it is that's going wrong. My searches to date pick up hits relating to Db problems and are of no use. I have a new dist of Solr and have made no configuration changes to date.

C:\Users\Mcgibbney\Documents\LEWIS\apache-solr-1.4.1\apache-solr-1.4.1\example\exampledocs>java -jar post.jar *.xml
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file hd.xml
SimplePostTool: FATAL: Solr returned an error: ERRORunknown_field_name

Help would be great.
Lewis Mc
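The mismatch described in this thread can be checked before posting. Below is a minimal sketch, not Solr's own validation: it reads the field and dynamicField names declared in a schema.xml and reports any field names in an <add> document that match none of them. The function name and the sample XML are made up for illustration, and dynamic field patterns are matched with a plain glob, which only approximates how Solr resolves them.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase

# Hypothetical, cut-down schema.xml and update document used only for this sketch.
SCHEMA = """<schema name="example" version="1.2">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  </fields>
</schema>"""

DOC = """<add><doc>
  <field name="id">1</field>
  <field name="name">Hard drive</field>
  <field name="cat_s">electronics</field>
</doc></add>"""


def unknown_fields(schema_xml, add_xml):
    """Return the sorted field names used in add_xml but not declared in schema_xml."""
    schema = ET.fromstring(schema_xml)
    # Explicitly declared fields and dynamic field patterns such as "*_s".
    declared = {f.get("name") for f in schema.iter("field")}
    patterns = [f.get("name") for f in schema.iter("dynamicField")]

    # Every field name that appears in the document(s) to be posted.
    used = {f.get("name") for f in ET.fromstring(add_xml).iter("field")}
    return sorted(
        name for name in used
        if name not in declared
        and not any(fnmatchcase(name, p) for p in patterns)
    )


if __name__ == "__main__":
    print(unknown_fields(SCHEMA, DOC))
```

With the sample data, the only name reported is the one the schema does not declare, which is the situation that leads to the "unknown field" error from post.jar.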
unknown field 'name'
Good Evening List,

I have been working with Nutch and, due to numerous integration advantages, I decided to get to grips with the Solr code base.

Solr dist - 1.4.1
java version 1.6.0_22
Windows Vista Home Premium
Command Prompt to execute commands

I encountered the following problem very early on during the indexing stage, and even though I asked this question (through the wrong list :0|) I have been unable to resolve what it is that's going wrong. My searches to date pick up hits relating to Db problems and are of no use. I have a new dist of Solr and have made no configuration changes to date.

C:\Users\Mcgibbney\Documents\LEWIS\apache-solr-1.4.1\apache-solr-1.4.1\example\exampledocs>java -jar post.jar *.xml
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file hd.xml
SimplePostTool: FATAL: Solr returned an error: ERRORunknown_field_name

Help would be great.

Lewis Mc