Re: Getting started with Solr
OK, got it, works now. Maybe you can advise on something more general?

I'm trying to use Solr to analyze HTML data retrieved with Nutch. I want to crawl a list of web pages built according to a certain template and analyze certain fields in their HTML (each identified by a span class and consisting of a number), then output the results as CSV: a list with each website's domain and the sum of the numbers in all the specified fields.

How should I set up the flow? Should I configure Nutch to pull only the relevant fields from each page, then use Solr to add up the integers in those fields and output to a CSV? Or should I use Nutch to pull in everything from the relevant pages, then use Solr to strip out the relevant fields and process them as above?

Can I do the processing strictly in Solr, using the material found at https://cwiki.apache.org/confluence/display/solr/Indexing+and+Basic+Data+Operations ? Or should I use PHP through Solarium, or something along those lines?

Your advice would be appreciated. I don't want to reinvent the bicycle.

Sincerely,
Baruch Kogan
Marketing Manager
Seller Panda http://sellerpanda.com
+972(58)441-3829
baruch.kogan at Skype
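For reference, the extract-and-sum step described in this message is small enough to sketch on its own. The following is only an illustration in plain Python, not a Nutch or Solr configuration, and it does not settle which tool should do the work. The span class `price`, the example URLs, and the function names are all hypothetical; it assumes each matching span holds a single integer.

```python
import csv
import io
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class SpanNumberParser(HTMLParser):
    """Collects integers found inside <span> elements carrying one CSS class."""

    def __init__(self, span_class):
        super().__init__()
        self.span_class = span_class
        self.depth = 0      # > 0 while inside a matching span (tracks nesting)
        self.buffer = []    # text pieces of the current matching span
        self.numbers = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.span_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                text = "".join(self.buffer).strip().replace(",", "")
                self.buffer = []
                try:
                    self.numbers.append(int(text))
                except ValueError:
                    pass  # span did not contain a plain integer; skip it

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def sum_by_domain(pages, span_class):
    """pages: iterable of (url, html) pairs. Returns {domain: sum of numbers}."""
    totals = defaultdict(int)
    for url, html in pages:
        parser = SpanNumberParser(span_class)
        parser.feed(html)
        parser.close()
        totals[urlparse(url).netloc] += sum(parser.numbers)
    return dict(totals)


def to_csv(totals):
    """Render the totals as CSV text with a header row, sorted by domain."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["domain", "total"])
    for domain in sorted(totals):
        writer.writerow([domain, totals[domain]])
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical crawled pages; in practice these would come from the crawl.
    pages = [
        ("http://a.example/p1", '<span class="price">10</span>'),
        ("http://a.example/p2", '<span class="price">7</span>'),
    ]
    print(to_csv(sum_by_domain(pages, "price")))
```

Whether the numbers are extracted at crawl time or after indexing, the aggregation itself stays this simple, so the choice mostly depends on where it is easiest to get at the raw HTML.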
Re: Getting started with Solr
Thanks for bearing with me.

I start Solr with `bin/solr start -e cloud` with 2 nodes. Then I get this:

    Welcome to the SolrCloud example!

    This interactive session will help you launch a SolrCloud cluster on your local workstation.

    To begin, how many Solr nodes would you like to run in your local cluster? (specify 1-4 nodes) [2]
    Ok, let's start up 2 Solr nodes for your example SolrCloud cluster.

    Please enter the port for node1 [8983]
    8983
    Please enter the port for node2 [7574]
    7574
    Cloning Solr home directory /home/ubuntu/crawler/solr/example/cloud/node1 into /home/ubuntu/crawler/solr/example/cloud/node2

    Starting up SolrCloud node1 on port 8983 using command:

    solr start -cloud -s example/cloud/node1/solr -p 8983

I then go to http://localhost:8983/solr/admin/cores and get the following:

    This XML file does not appear to have any style information associated with it. The document tree is shown below.

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">2</int>
      </lst>
      <lst name="initFailures"/>
      <lst name="status">
        <lst name="testCollection_shard1_replica1">
          <str name="name">testCollection_shard1_replica1</str>
          <str name="instanceDir">/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard1_replica1/</str>
          <str name="dataDir">/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard1_replica1/data/</str>
          <str name="config">solrconfig.xml</str>
          <str name="schema">schema.xml</str>
          <date name="startTime">2015-03-01T06:59:12.296Z</date>
          <long name="uptime">46380</long>
          <lst name="index">
            <int name="numDocs">0</int>
            <int name="maxDoc">0</int>
            <int name="deletedDocs">0</int>
            <long name="indexHeapUsageBytes">0</long>
            <long name="version">1</long>
            <int name="segmentCount">0</int>
            <bool name="current">true</bool>
            <bool name="hasDeletions">false</bool>
            <str name="directory">org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard1_replica1/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2a4f8f8b; maxCacheMB=48.0 maxMergeSizeMB=4.0)</str>
            <lst name="userData"/>
            <long name="sizeInBytes">71</long>
            <str name="size">71 bytes</str>
          </lst>
        </lst>
        <lst name="testCollection_shard1_replica2">
          <str name="name">testCollection_shard1_replica2</str>
          <str name="instanceDir">/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard1_replica2/</str>
          <str name="dataDir">/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard1_replica2/data/</str>
          <str name="config">solrconfig.xml</str>
          <str name="schema">schema.xml</str>
          <date name="startTime">2015-03-01T06:59:12.751Z</date>
          <long name="uptime">45926</long>
          <lst name="index">
            <int name="numDocs">0</int>
            <int name="maxDoc">0</int>
            <int name="deletedDocs">0</int>
            <long name="indexHeapUsageBytes">0</long>
            <long name="version">1</long>
            <int name="segmentCount">0</int>
            <bool name="current">true</bool>
            <bool name="hasDeletions">false</bool>
            <str name="directory">org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard1_replica2/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2a4f8f8b; maxCacheMB=48.0 maxMergeSizeMB=4.0)</str>
            <lst name="userData"/>
            <long name="sizeInBytes">71</long>
            <str name="size">71 bytes</str>
          </lst>
        </lst>
        <lst name="testCollection_shard2_replica1">
          <str name="name">testCollection_shard2_replica1</str>
          <str name="instanceDir">/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard2_replica1/</str>
          <str name="dataDir">/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard2_replica1/data/</str>
          <str name="config">solrconfig.xml</str>
          <str name="schema">schema.xml</str>
          <date name="startTime">2015-03-01T06:59:12.596Z</date>
          <long name="uptime">46081</long>
          <lst name="index">
            <int name="numDocs">0</int>
            <int name="maxDoc">0</int>
            <int name="deletedDocs">0</int>
            <long name="indexHeapUsageBytes">0</long>
            <long name="version">1</long>
            <int name="segmentCount">0</int>
            <bool name="current">true</bool>
            <bool name="hasDeletions">false</bool>
            <str name="directory">org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard2_replica1/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2a4f8f8b; maxCacheMB=48.0 maxMergeSizeMB=4.0)</str>
            <lst name="userData"/>
            <long name="sizeInBytes">71</long>
            <str name="size">71 bytes</str>
          </lst>
        </lst>
        <lst name="testCollection_shard2_replica2">
          <str name="name">testCollection_shard2_replica2</str>
          <str name="instanceDir">/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard2_replica2/</str>
          <str name="dataDir">/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard2_replica2/data/</str>
          <str name="config">solrconfig.xml</str>
          <str name="schema">schema.xml</str>
          <date name="startTime">2015-03-01T06:59:12.718Z</date>
          <long name="uptime">45959</long>
          <lst name="index">
            <int name="numDocs">0</int>
            <int name="maxDoc">0</int>
            <int name="deletedDocs">0</int>
            <long name="indexHeapUsageBytes">0</long>
            <long name="version">1</long>
            <int name="segmentCount">0</int>
            <bool name="current">true</bool>
            <bool name="hasDeletions">false</bool>
            <str name="directory">org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@/home/ubuntu/crawler/solr/example/cloud/node1/solr/testCollection_shard2_replica2/data/index
Re: Getting started with Solr
I'm sorry, I'm not following exactly. Somehow you no longer have a gettingstarted collection, but it is not clear how that happened. Could you post the exact steps you used that got you this error? What collections/cores does the Solr admin show you have? What are the results of http://localhost:8983/solr/admin/cores ?

-- Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com
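The check asked for here can also be done programmatically. The sketch below parses an admin/cores status response and lists the collections it implies. It is an illustration only: the function names are made up, fetching the URL is left out, and stripping a `_shardN_replicaM` suffix is an assumption based on the core names that appear in the response posted elsewhere in this thread.

```python
import re
import xml.etree.ElementTree as ET


def core_names(status_xml):
    """Return the core names listed under <lst name="status"> in the response."""
    root = ET.fromstring(status_xml)
    status = root.find("./lst[@name='status']")
    if status is None:
        return []
    # Each direct child <lst> of the status element describes one core.
    return [core.get("name") for core in status.findall("lst")]


def collections_from_cores(names):
    """Assumption: SolrCloud cores are named <collection>_shardN_replicaM."""
    return sorted({re.sub(r"_shard\d+_replica\d+$", "", n) for n in names})


if __name__ == "__main__":
    # Trimmed sample in the shape of the response shown in the thread.
    sample = """<response>
      <lst name="responseHeader"><int name="status">0</int></lst>
      <lst name="initFailures"/>
      <lst name="status">
        <lst name="testCollection_shard1_replica1"/>
        <lst name="testCollection_shard2_replica1"/>
      </lst>
    </response>"""
    found = collections_from_cores(core_names(sample))
    print(found)
    print("gettingstarted" in found)
```

If the target collection is missing from that list, a 404 from `/solr/<collection>/update` is the expected result, which matches the error reported in the thread.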
Re: Getting started with Solr
Oh, I see. I used the start -e cloud command, then ran through a setup with one core and default options for the rest, then tried to post the JSON example again, and got another error:

    buntu@ubuntu-VirtualBox:~/crawler/solr$ bin/post -c gettingstarted example/exampledocs/*.json
    /usr/lib/jvm/java-7-oracle/bin/java -classpath /home/ubuntu/crawler/solr/dist/solr-core-5.0.0.jar -Dauto=yes -Dc=gettingstarted -Ddata=files org.apache.solr.util.SimplePostTool example/exampledocs/books.json
    SimplePostTool version 5.0.0
    Posting files to [base] url http://localhost:8983/solr/gettingstarted/update...
    Entering auto mode. File endings considered are xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
    POSTing file books.json (application/json) to [base]
    SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url: http://localhost:8983/solr/gettingstarted/update
    SimplePostTool: WARNING: Response: <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
    <title>Error 404 Not Found</title>
    </head>
    <body><h2>HTTP ERROR 404</h2>
    <p>Problem accessing /solr/gettingstarted/update. Reason:
    <pre>Not Found</pre></p><hr /><i><small>Powered by Jetty://</small></i><br/>

Sincerely,
Baruch Kogan
Marketing Manager
Seller Panda http://sellerpanda.com
+972(58)441-3829
baruch.kogan at Skype
Re: Getting started with Solr
How did you start Solr? If you started with `bin/solr start -e cloud`, you'll have a gettingstarted collection created automatically; otherwise you'll need to create it yourself with `bin/solr create -c gettingstarted`.

-- Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com
Getting started with Solr
Hi,

I've just installed Solr (I will be controlling it with Solarium and using it to search Nutch queries). I'm working through the starting tutorials described here: https://cwiki.apache.org/confluence/display/solr/Running+Solr

When I try to run `bin/post -c gettingstarted example/exampledocs/*.json`, I get a bunch of errors having to do with there not being a gettingstarted folder in /solr/. Is this normal? Should I create one?

Sincerely,
Baruch Kogan
Marketing Manager
Seller Panda http://sellerpanda.com
+972(58)441-3829
baruch.kogan at Skype
Getting Started with Enterprise Search using Apache Solr
Hi.

Most of the members here are already seasoned search professionals. However, I believe there may also be a few who joined because they want to get started with search, and IMHO (and probably in yours too) Solr is the best way to start.

Therefore I wanted to post a link to a course that I created, Getting Started with Enterprise Search using Apache Solr. For some it might be a good way to start learning. If you are already a search professional you may not benefit greatly, but any feedback you can provide would be great, as I want to create more trainings to help people get started with search.

It is a Pluralsight training, so if you are not a subscriber, just create a trial account and you have 10 days to watch. If you have questions, let me know. You can reach me here or at @xmorera on Twitter.

Here is the course: http://pluralsight.com/training/Courses/TableOfContents/enterprise-search-using-apache-solr

PS: Pluralsight is also a great way to learn, so I really recommend it.

Getting Started with Enterprise Search using Apache Solr (pluralsight.com): Search is one of the most misunderstood functionalities in the IT industry. Even further, Enterprise Search used to be neither for the faint of heart, nor for those with a thin wallet. However, since the introduction of Apache Solr, the name of the game has changed. Don't leave home without it!

-- Xavier Morera
email: xav...@familiamorera.com
CR: +(506) 8849 8866
US: +1 (305) 600 4919
skype: xmorera
newbie getting started with solr
Sorry if this is obvious (because it isn't for me).

I want to build a Solr (4.5.1) + Nutch (1.7.1) environment. I'm doing this on Amazon Linux (I may put Nutch on a separate server eventually). Please let me know if my thinking is sound or off base.

In the example folder are a lot of files and folders, including the war file and start.jar:

    drwxr-xr-x cloud-scripts
    drwxr-xr-x contexts
    drwxr-xr-x etc
    drwxr-xr-x example-DIH
    drwxr-xr-x exampledocs
    drwxr-xr-x example-schemaless
    drwxr-xr-x lib
    drwxr-xr-x logs
    drwxr-xr-x multicore
    -rw-r--r-- README.txt
    drwxr-xr-x resources
    drwxr-xr-x solr
    drwxr-xr-x solr-webapp
    -rw-r--r-- start.jar
    drwxr-xr-x webapps

I am creating a separate folder for the conf and data folders (on another disk) and placing these files in the conf folder:

    schema-solr.xml (from nutch) renamed to schema.solr
    solrconfig.xml

I will use the example folder and start.jar from that location. (Is this okay?)

Where do I set the collection name?

What else do I need to do to get a basic web page indexer built? (I'll work out the crawling later; I just want to be able to manually add some documents and query.) I'm trying to understand Solr first and then will use Nutch.

I have several books and have looked at the tutorial and other web sites. They seem to assume that I know where to begin when creating a new collection and customizing it.

Thanks in advance for your help.

--
Eric Palmer
Web Services
U of Richmond

To report technical issues, obtain technical support or make requests for enhancements please visit http://web.richmond.edu/contact/technical-support.html
Re: newbie getting started with solr
Hi Eric,

Solr configuration can certainly be confusing at first. And for some time after. :P

If you're running start.jar from the example folder (which is fine for testing, and I've known some people to use it for production systems), then the default Solr home is example/solr. This contains solr.xml, which specifies where to find per-core configuration and data. (A core is equivalent to a collection in a simple non-sharded setup.)

For now, the easiest thing would be to use the default core in example/solr/collection1. Copy your solrconfig.xml and schema.xml over the ones in collection1/conf (backing up the originals for reference). Create your data directory wherever you like and symlink it into collection1.

Now when you run

    $ java -jar start.jar

in example/, you should be able to access Solr at http://localhost:8983/solr/ , and add and search for documents.

Hope that helps a bit!

Tom
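The copy, back-up, and symlink steps described in this reply could be scripted roughly as follows. This is a sketch under assumptions, not an official procedure: the function name and the `.orig` backup suffix are made up for illustration, and the link name `data` inside the core directory assumes the default data location for collection1.

```python
import shutil
from pathlib import Path


def install_core_config(core_dir, solrconfig, schema, data_dir):
    """Copy config files into <core_dir>/conf and symlink <core_dir>/data."""
    core_dir, data_dir = Path(core_dir), Path(data_dir)
    conf = core_dir / "conf"
    conf.mkdir(parents=True, exist_ok=True)
    for src, name in ((solrconfig, "solrconfig.xml"), (schema, "schema.xml")):
        target = conf / name
        backup = target.with_name(name + ".orig")  # hypothetical backup name
        if target.exists() and not backup.exists():
            # Keep the shipped file for reference before overwriting it.
            shutil.copy2(target, backup)
        shutil.copy2(src, target)
    data_dir.mkdir(parents=True, exist_ok=True)
    link = core_dir / "data"
    if link.is_symlink():
        link.unlink()  # replace an earlier link
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink; move it first")
    link.symlink_to(data_dir.resolve(), target_is_directory=True)
    return link


if __name__ == "__main__":
    # Paths are examples only; adjust to the actual install and data disk.
    install_core_config(
        "example/solr/collection1",
        "my-conf/solrconfig.xml",
        "my-conf/schema.xml",
        "/mnt/disk2/solr-data",
    )
```

Refusing to touch an existing real `data` directory is a deliberate choice here, so that an index already in place is never silently replaced by a link.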
Re: newbie getting started with solr
Tried my book? It should explain that. You can see the collections with examples on GitHub: https://github.com/arafalov/solr-indexing-book/tree/master/published

Start from collection1.

Regards,
Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
Re: Re: Unable to getting started with SOLR
I suggest you start from here: http://wiki.apache.org/solr/HowToCompileSolr
Re: Re: Unable to getting started with SOLR
If you're using the default Jetty container, there's no log unless you set it up; the content is echoed to the screen.

About a zillion people have downloaded this and started it running without issue, so you need to give us the exact steps you followed.

If you checked the code out from SVN, you need to build it: go into solrhome/solr and execute

    ant example dist

The dist bit isn't strictly necessary, but it builds the jars that you link to if you try to develop custom plugins etc.

Best,
Erick
Re: Re: Unable to getting started with SOLR
I have the same issue. Can anyone tell me if they found a solution?

--
View this message in context: http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p4089761.html
Sent from the Solr - User mailing list archive at Nabble.com.
Getting started with solr 4.2 and cassandra
Hello,

I am evaluating Solr 4.2 and ElasticSearch (I am new to both) for a search API where the data sits in Cassandra.

Getting started with ElasticSearch is pretty straightforward, and within a day I was able to write an ES river (http://www.elasticsearch.org/guide/reference/river/) which pulls data from Cassandra and indexes it in ES.

Now I am trying to implement something similar with Solr and compare the two.

Getting started with the Solr example (http://lucene.apache.org/solr/4_2_0/tutorial.html) was pretty easy, and an example Solr instance works. But the example folder contains a whole bunch of stuff which I am not sure I need: http://pastebin.com/Gv660mRT . I am sure I don't need 53 directories and 527 files.

So my questions are:
1. How can I get a bare-bones Solr app up and running with a minimal set of configuration? (I will build on it when needed, taking reference from /example.)
2. What is the best practice for running Solr in production? Is an approach like this jetty+nginx one recommended: http://sacharya.com/nginx-proxy-to-jetty-for-java-apps/ ?

Once I am done setting up a simple Solr instance:
3. What is the general practice for importing data into Solr? For now, I am writing a Python script which will read data in bulk from Cassandra and throw it at Solr.

--
Thanks,
-Utkarsh
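The bulk-import script mentioned in question 3 usually comes down to batching rows and posting each batch as JSON. The sketch below shows only that part and builds the HTTP requests without sending them. It rests on assumptions: the helper names, the core name `collection1`, and the document fields are hypothetical, the Cassandra read is replaced by a plain iterable, and it assumes the stock `/update` handler accepts a JSON array of documents when the content type is application/json.

```python
import json
from itertools import islice
from urllib.request import Request


def batches(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def update_request(solr_url, core, docs, commit=False):
    """Build (but do not send) a JSON update request for one batch of docs."""
    url = f"{solr_url.rstrip('/')}/{core}/update"
    if commit:
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Stand-in for rows read from Cassandra; field names are made up.
    rows = ({"id": str(i), "name": f"item {i}"} for i in range(5))
    for chunk in batches(rows, 2):
        req = update_request("http://localhost:8983/solr", "collection1", chunk)
        # urllib.request.urlopen(req) would send the batch to a running Solr.
        print(req.get_method(), req.full_url, len(chunk), "docs")
```

Committing once at the end of the import, rather than with every batch, is generally cheaper, which is why `commit` defaults to off in this sketch.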
Re: Getting started with solr 4.2 and cassandra
You might want to check out DataStax Enterprise, which actually integrates Cassandra and Solr. You keep the data in Cassandra, but as data is added, updated, and deleted, the Solr index is automatically updated in parallel. You can add and update data and query using either the Cassandra API or the Solr API.

See: http://www.datastax.com/what-we-offer/products-services/datastax-enterprise

-- Jack Krupansky
Re: Getting started with solr 4.2 and cassandra
Thanks for the reply. DSE is one of the options, and I am looking into that too. But before diving into Solr + Cassandra integration (which comes out of the box with DSE), I am just trying to set up a Solr instance on my local machine without the bloat that the example Solr instance comes with. Any suggestions about that?

Thanks,
-Utkarsh
Re: Getting started with solr 4.2 and cassandra
The Solr example really is rather simple. Download, unzip, run, add data, query. It's really that simple. Make sure you are looking at the Solr tutorial: http://lucene.apache.org/solr/4_2_0/tutorial.html Download from here: http://lucene.apache.org/solr/tutorial.html -- Jack Krupansky
Re: Getting started with solr 4.2 and cassandra
Hi, Solr doesn't have anything like ES River. DIH (DataImportHandler) feels like the closest thing in Solr, though it's not quite the same thing. DIH pulls in data like a typical River does, but most people have external indexers that push data into Solr through one of its client libraries, such as SolrJ. Otis -- Solr ElasticSearch Support http://sematext.com/
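The "external indexer that pushes data" approach can be sketched in Python, since that is what the original poster was already writing. This is only a rough sketch under assumptions: it uses the standard library rather than a Solr client, it assumes a core named collection1 at http://localhost:8983/solr that accepts a JSON array of documents on /update, and the id and name fields are placeholders for whatever your schema defines. The rows list stands in for data read from Cassandra.

```python
import json
import urllib.request


def build_update_request(docs, base_url="http://localhost:8983/solr/collection1"):
    """Build (but do not send) a JSON update request for one batch of documents."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/update?commit=true",  # assumed core name and handler path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def batches(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


if __name__ == "__main__":
    # Placeholder rows; a real indexer would read these from Cassandra.
    rows = [{"id": str(i), "name": "item %d" % i} for i in range(5)]
    for batch in batches(rows, 2):
        request = build_update_request(batch)
        print(request.full_url, len(batch))
        # urllib.request.urlopen(request)  # uncomment once Solr is running
```

Sending documents in batches rather than one at a time, and committing less often than on every batch, is usually kinder to Solr; the commit=true on every request here is only to keep the sketch short.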
Re: Getting started with indexing a database
Hi Mike, Can you try removing <field column="doc_id" name="DOC_ID"/> from the nested entities? Just keep it in the top-level entity. Regards, Rakesh Varna
Re: Getting started with indexing a database
I'm not going to be much help here since DIH is a mystery to me; I usually go with a SolrJ program when DIH gets beyond simple cases. But have you seen http://wiki.apache.org/solr/DataImportHandler#interactive ? It's a tool that helps you see what's going on with your query. Best, Erick
Re: Getting started with indexing a database
On Tue, Jan 10, 2012 at 7:09 AM, Mike O'Leary tmole...@uw.edu wrote: [...] My data-config.xml file looks like this:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/bioscope" user="db_user" password=""/>
  <document name="bioscope">
    <entity name="docs" pk="doc_id"
            query="SELECT doc_id, type FROM bioscope.docs"
            deltaQuery="SELECT doc_id FROM bioscope.docs where last_modified > '${dataimporter.last_index_time}'">
      <field column="doc_id" name="DOC_ID"/>
      <field column="type" name="DOC_TYPE"/>

Your SELECT above does not include the field type

      <entity name="codes" pk="id"
              query="SELECT id, origin, type, code FROM bioscope.codes WHERE doc_id='${docs.doc_id}'"

This should be: WHERE id=='${docs.doc_id}' as 'id' is what you are selecting in this entity. Same issue for the second nested entity, i.e., replace doc_id= with id=

Regards, Gora
Getting started with indexing a database
I am trying to index the contents of a database for the first time, and I am only getting the primary key of the table represented by the top level entity in my data-config.xml file to be indexed. The database I am starting with has three tables:

* The table called docs has columns called doc_id, type and last_modified. The primary key is doc_id.
* The table called codes has columns called id, doc_id, origin, type, code and last_modified. The primary key is id. doc_id is a foreign key to the doc_id column in the docs table.
* The table called texts has columns called id, doc_id, origin, type, text and last_modified. The primary key is id. doc_id is a foreign key to the doc_id column in the docs table.

My data-config.xml file looks like this:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/bioscope" user="db_user" password=""/>
  <document name="bioscope">
    <entity name="docs" pk="doc_id"
            query="SELECT doc_id, type FROM bioscope.docs"
            deltaQuery="SELECT doc_id FROM bioscope.docs where last_modified > '${dataimporter.last_index_time}'">
      <field column="doc_id" name="DOC_ID"/>
      <field column="type" name="DOC_TYPE"/>
      <entity name="codes" pk="id"
              query="SELECT id, origin, type, code FROM bioscope.codes WHERE doc_id='${docs.doc_id}'"
              deltaQuery="SELECT doc_id FROM bioscope.codes WHERE last_modified > '${dataimporter.last_index_time}'"
              parentDeltaQuery="SELECT doc_id from bioscope.docs WHERE doc_id='${codes.doc_id}'">
        <field column="id" name="CODE_ID"/>
        <field column="doc_id" name="DOC_ID"/>
        <field column="origin" name="CODE_ORIGIN"/>
        <field column="type" name="CODE_TYPE"/>
        <field column="code" name="CODE_VALUE"/>
      </entity>
      <entity name="notes" pk="id"
              query="SELECT id, origin, type, text FROM bioscope.texts WHERE doc_id='${docs.doc_id}'"
              deltaQuery="SELECT doc_id FROM bioscope.texts WHERE last_modified > '${dataimporter.last_index_time}'"
              parentDeltaQuery="SELECT doc_id from bioscope.docs WHERE doc_id='${texts.doc_id}'">
        <field column="id" name="NOTE_ID"/>
        <field column="doc_id" name="DOC_ID"/>
        <field column="origin" name="NOTE_ORIGIN"/>
        <field column="type" name="NOTE_TYPE"/>
        <field column="text" name="NOTE_TEXT"/>
      </entity>
    </entity>
  </document>
</dataConfig>

I added these lines to the schema.xml file:

<field name="DOC_ID" type="string" indexed="true" omitNorms="true" stored="true"/>
<field name="DOC_TYPE" type="string" indexed="true" omitNorms="true" stored="true"/>
<field name="CODE_ID" type="string" indexed="true" omitNorms="true" stored="true"/>
<field name="CODE_ORIGIN" type="string" indexed="true" omitNorms="true" stored="true"/>
<field name="CODE_TYPE" type="string" indexed="true" omitNorms="true" stored="true"/>
<field name="CODE_VALUE" type="string" indexed="true" omitNorms="true" stored="true"/>
<field name="NOTE_ID" type="string" indexed="true" omitNorms="true" stored="true"/>
<field name="NOTE_ORIGIN" type="string" indexed="true" omitNorms="true" stored="true"/>
<field name="NOTE_TYPE" type="string" indexed="true" omitNorms="true" stored="true"/>
<field name="NOTE_TEXT" type="text_ws" indexed="true" omitNorms="true" stored="true"/>
...
<uniqueKey>DOC_ID</uniqueKey>
<defaultSearchField>NOTE_TEXT</defaultSearchField>

When I run the full-import operation, only the DOC_ID values are written to the index. When I run a program that dumps the index contents as an xml string, the output looks like this:

<?xml version="1.0" ?>
<documents>
  <document>
    <field name="DOC_ID" value="97634811"> </field>
  </document>
  <document>
    <field name="DOC_ID" value="97634910"> </field>
  </document>
  ...
</documents>

Since this is new to me, I am sure that I have simply left something out or specified something the wrong way, but I haven't been able to spot what I have been doing wrong when I have gone over the configuration files that I am using. Can anyone help me figure out why the other database contents are not being indexed? Thanks, Mike
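For readers new to nested entities, the configuration above describes a parent query whose rows each drive one sub-query per child entity, with ${docs.doc_id} filled in from the current parent row. The following is a rough Python sketch of that idea only, not of DIH itself: it uses an in-memory SQLite database instead of MySQL, keeps just the columns needed for the illustration, and the sample rows (the type value, the codes and the note text) are made up.

```python
import sqlite3


def build_documents(conn):
    """Run one child query per parent row, collecting child values per document."""
    docs = []
    parents = conn.execute("SELECT doc_id, type FROM docs ORDER BY doc_id").fetchall()
    for doc_id, doc_type in parents:
        doc = {"DOC_ID": doc_id, "DOC_TYPE": doc_type, "CODE_VALUE": [], "NOTE_TEXT": []}
        # Equivalent of: ... FROM codes WHERE doc_id='${docs.doc_id}'
        for (code,) in conn.execute(
            "SELECT code FROM codes WHERE doc_id = ? ORDER BY id", (doc_id,)
        ):
            doc["CODE_VALUE"].append(code)
        # Equivalent of: ... FROM texts WHERE doc_id='${docs.doc_id}'
        for (text,) in conn.execute(
            "SELECT text FROM texts WHERE doc_id = ? ORDER BY id", (doc_id,)
        ):
            doc["NOTE_TEXT"].append(text)
        docs.append(doc)
    return docs


def sample_db():
    """Create a tiny stand-in for the bioscope database (sample data is invented)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE docs  (doc_id TEXT PRIMARY KEY, type TEXT);
        CREATE TABLE codes (id INTEGER PRIMARY KEY, doc_id TEXT, code TEXT);
        CREATE TABLE texts (id INTEGER PRIMARY KEY, doc_id TEXT, text TEXT);
        INSERT INTO docs  VALUES ('97634811', 'report'), ('97634910', 'report');
        INSERT INTO codes VALUES (1, '97634811', 'A1'), (2, '97634811', 'B2');
        INSERT INTO texts VALUES (1, '97634910', 'sample note');
        """
    )
    return conn


if __name__ == "__main__":
    for doc in build_documents(sample_db()):
        print(doc)
```

One thing the sketch makes visible is that a parent row can pick up several child values, so the corresponding Solr fields would normally need multiValued="true" in schema.xml.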
Unable to getting started with SOLR
Hi all, Sorry for any inconvenience, but I need help with the following. I want to work with Solr, so I downloaded it and started following the instructions in the tutorial at http://lucene.apache.org/solr/tutorial.html to run some examples first. But when I tried to check whether Solr was running by opening http://localhost:8983/solr/admin/ in the web browser, I got the following message. I would be thankful if someone could suggest a solution. Message: Unable to connect. Firefox can't establish a connection to the server at localhost:8983. The site could be temporarily unavailable or too busy. Try again in a few moments. If you are unable to load any pages, check your computer's network connection. If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web. With Regds: Divakar -- View this message in context: http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p3497276.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Unable to getting started with SOLR
Did you start the server ( *java -jar start.jar* )? Was it successful? Have you checked the logs?
Re: Unable to getting started with SOLR
Sounds strange. Did you do java -jar start.jar on the console? On 10.11.2011 18:19, dsy99 wrote: Yes, I executed the start.jar in the example folder, but I am not getting any message after that. I checked the logs also; they are empty.
Re: Re: Unable to getting started with SOLR
Try replacing localhost with your domain or ip address and make sure the port is open. Use the ps command to see if java is running.
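A quick way to tell "nothing is listening" apart from a browser or proxy problem is to try a plain TCP connection to the port. This is a minimal sketch using only the Python standard library; the host and port defaults are the ones from the tutorial URL in this thread, and the function name is just an illustration.

```python
import socket


def port_is_open(host="localhost", port=8983, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    print("Solr port reachable:", port_is_open())
```

If this reports False while java -jar start.jar appears to be running, the process most likely failed during startup or is bound to a different port, which is worth checking in the console output.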
RE: Getting started with Velocity
Thanks. Is there any way to change what fields browse uses / asks for? I've tried changing the code, and I'm clearly missing something. I either get the same fields it was displaying before (and no search results) or I get something that doesn't work at all.
Getting started with Velocity
I'm a Solr novice, so I hope I'm missing something obvious. When I run a search in the Admin view, everything works fine. When I do the same search in http://localhost:8983/solr/browse , I invariably get 0 results found. What am I missing? Are these not supposed to be searching the same index? Thanks, Chip
Re: Getting started with Velocity
By default, browse is using the following config:

<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <!-- VelocityResponseWriter settings -->
    <str name="wt">velocity</str>
    <str name="v.template">browse</str>
    <str name="v.layout">layout</str>
    <str name="title">Solritas</str>
    <str name="defType">edismax</str>
    <str name="q.alt">*:*</str>
    <str name="rows">10</str>
    <str name="fl">*,score</str>
    <str name="mlt.qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 </str>
    <str name="mlt.fl">text,features,name,sku,id,manu,cat</str>
    <int name="mlt.count">3</int>
    <str name="qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 </str>
    <str name="facet">on</str>
    <str name="facet.field">cat</str>
    <str name="facet.field">manu_exact</str>
    <str name="facet.query">ipod</str>
    <str name="facet.query">GB</str>
    <str name="facet.mincount">1</str>
    <str name="facet.pivot">cat,inStock</str>
    <str name="facet.range">price</str>
    <int name="f.price.facet.range.start">0</int>
    <int name="f.price.facet.range.end">600</int>
    <int name="f.price.facet.range.gap">50</int>
    <str name="f.price.facet.range.other">after</str>
    <str name="facet.range">manufacturedate_dt</str>
    <str name="f.manufacturedate_dt.facet.range.start">NOW/YEAR-10YEARS</str>
    <str name="f.manufacturedate_dt.facet.range.end">NOW</str>
    <str name="f.manufacturedate_dt.facet.range.gap">+1YEAR</str>
    <str name="f.manufacturedate_dt.facet.range.other">before</str>
    <str name="f.manufacturedate_dt.facet.range.other">after</str>
    <!-- Highlighting defaults -->
    <str name="hl">on</str>
    <str name="hl.fl">text features name</str>
    <str name="f.name.hl.fragsize">0</str>
    <str name="f.name.hl.alternateField">name</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
  <!-- <str name="url-scheme">httpx</str> -->
</requestHandler>

while the normal search is using the following:

<requestHandler name="search" class="solr.SearchHandler" default="true">
  <!-- default values for query parameters can be specified, these will be overridden by parameters in the request -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
  </lst>
</requestHandler>

Just make sure the fields that the /browse handler refers to are also defined in your documents; otherwise change it to not use dismax. :-)
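Because values under defaults are overridden by parameters in the request, one low-risk way to test this explanation is to call /browse with a qf that names fields from your own schema. The helper below is a small sketch that only builds such a URL; the field names title and content in the usage note are hypothetical placeholders, and the Velocity templates themselves may still expect the example field names when rendering results.

```python
from urllib.parse import urlencode


def browse_url(query, fields, base="http://localhost:8983/solr/browse"):
    """Build a /browse URL whose qf lists fields that exist in your own schema."""
    params = {"q": query, "qf": " ".join(fields)}
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # "title" and "content" are placeholders; substitute your indexed fields.
    print(browse_url("iraq", ["title", "content"]))
```

If the override returns hits where the plain /browse URL returned none, the qf default in solrconfig.xml is the thing to edit.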
getting started
Hello, I am new to Solr and am in the beginning planning stage of a large project and could use some advice so as not to make a huge design blunder that I will regret down the road. Currently I have about 10 MySQL databases that store information about different archival collections. For example, we have data and metadata about a political poster collection, a television program, documents and photographs of and about a famous author, etc. My job is to work with the staff archivists to come up with a standard metadata template so the 10 databases can be consolidated into one. Currently the info in these databases is accessed through 10 different sets of PHP pages that were written a long time ago for PHP 4. My plan is to write a new Java application that will handle both public display of the info as well as an administrative interface so that staff members can add or edit the records. I have decided to use Solr as the search mechanism for this project. Because the info in each of our 10 collections is slightly different (e.g., a record about a poster does not contain duration information, but a record about a TV show does) I was thinking it would be good to separate each collection's index into a separate Solr core so that commits coming from one collection do not bog down the other unrelated collections. One reservation I have is that eventually we would like to be able to type in Iraq and find records across all of the collections at once instead of having to search each collection separately. Although I don't know anything about it at this stage, I did Google sharding after reading someone's recent post on this list and it sounds like that may be a potential answer to my question. Does anyone have any advice on how I should initially set up Solr for my situation? I am slowly making my way through the wiki and RTFMing, but I wanted to see what the experts have to say because at this point I don't really know where to start. Thank you very much, Mari
Re: getting started
On 6/16/2011 4:41 PM, Mari Masuda wrote: One reservation I have is that eventually we would like to be able to type in Iraq and find records across all of the collections at once instead of having to search each collection separately. Although I don't know anything about it at this stage, I did Google sharding after reading someone's recent post on this list and it sounds like that may be a potential answer to my question.

So this kind of stuff can be tricky, but with that eventual requirement I would NOT put these in separate cores. Sharding isn't (IMO, if someone disagrees, they will hopefully say so!) a good answer to searching across entirely different 'schemas', or to avoiding frequent-commit issues -- sharding is really just for scaling/performance when your index gets very very large. (Which it doesn't sound like yours will be, but you can deal with that as a separate issue if it becomes so.)

If you're going to want to search across all the collections, put them all in the same core. Either in the exact same indexed fields, or using certain common indexed fields -- those common ones are the ones you'll be able to search across all collections on. It's okay if some collections have unique indexed fields too -- documents in the core that don't belong to that collection just won't have any terms in an indexed field that is only used by a certain collection, no problem. (Then you can distribute this single core into shards if you need to for performance reasons related to number of documents/size of index.)

You're right to be thinking about the fact that very frequent commits can be a performance issue in Solr. But separating into different cores is going to create more problems for yourself (if you want to be able to search across all collections), in an attempt to solve that one. (Among other things, not every Solr feature works in a distributed/sharded environment; it's just a more complicated and somewhat less mature setup for Solr.)

The way I deal with the frequent-commit issue is by NOT doing frequent commits to my production Solr. Instead, I use Solr replication to have a 'master' Solr index that I do commits to whenever I want, and a 'slave' Solr index that serves the production searches, and which only replicates from master periodically -- not so often as to amount to too-frequent commits. That seems to be a somewhat common solution, if that use pattern works for you.

There are also some near real time features in more recent versions of Solr that I'm not very familiar with (not sure if any are included in the current latest release, or if they are all still only in the repo). My sense is that they too only work for certain use patterns; they aren't magic bullets for committing whatever you want as often as you want to Solr. In general Solr isn't so great at very frequent major changes to the index. Depending on exactly what sort of use pattern you are predicting/planning for your commits, maybe people can give you advice on how (or if) to do it.

But I personally don't think your idea of splitting your collections (that you'll eventually want to search across in a single search) into shards is a good solution to frequent-commit issues. You'd be complicating your setup and causing other problems for yourself, and not really even entirely addressing the too-frequent-commit issue with that setup.
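The "one core, some common fields plus collection-specific fields" layout can be illustrated with a small mapping function. This is only a sketch of one possible convention: the common field names (id, collection, title, description) and the collection-prefixed naming for the remaining fields are assumptions made for the example, not something Solr requires.

```python
COMMON_FIELDS = ("id", "collection", "title", "description")


def to_solr_doc(collection, record):
    """Map one source record to a document destined for a single shared core."""
    doc = {
        # Prefix the id so records from different databases cannot collide.
        "id": "%s-%s" % (collection, record["id"]),
        "collection": collection,
        "title": record.get("title", ""),
        "description": record.get("description", ""),
    }
    # Anything that is not a common field becomes a collection-specific field.
    for key, value in record.items():
        if key not in COMMON_FIELDS:
            doc["%s_%s" % (collection, key)] = value
    return doc


if __name__ == "__main__":
    print(to_solr_doc("tv", {"id": 7, "title": "Evening News", "duration": 30}))
    print(to_solr_doc("posters", {"id": 7, "title": "Election poster"}))
```

With a layout like this, a search over title and description reaches every collection, while a filter query on the collection field (for example fq=collection:tv) narrows it back to one.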
Re: getting started
Hi Mari, it depends ...
* How many records are stored in your MySQL databases?
* How often will updates occur?
* How many db records / index documents are changed per update?
I would suggest starting with a single Solr core. That way, you can concentrate on the basics and do not need to deal with more advanced things like sharding. In case you encounter performance issues later on, you can switch to a multi-core setup. -Sascha
Re: Getting started with writing parser
On Tue, Jan 25, 2011 at 10:05 AM, Dinesh mdineshkuma...@karunya.edu.in wrote: http://pastebin.com/CkxrEh6h this is my sample log [...] And, which portions of the log text do you want to preserve? Does it go into Solr as a single error message, or do you want to separate out parts of it. Regards, Gora
Re: Getting started with writing parser
i want to take the month, time, DHCPMESSAGE, from_mac, gateway_ip, net_ADDR - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2327738.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Getting started with writing parser
On Tue, Jan 25, 2011 at 11:44 AM, Dinesh mdineshkuma...@karunya.edu.in wrote: i don't even know whether the regex expression that i'm using for my log is correct or no.. If it is the same try.xml that you posted earlier, it is very likely not going to work. You seem to have just cut and pasted entries from the Hathi Trust blog, without understanding how they work. Could you take a fresh look at http://wiki.apache.org/solr/DataImportHandler and explain in words the following: * What is your directory structure for storing the log files? * What parts of the log file do you want to keep (you have already explained this in another message)? * How would the above translate into: - A Solr schema - Setting up (a) a data source, (b) processor(s), and (c) transformers. i very much worried i couldn't proceed in my project already 1/3 rd of the timing is over.. please help.. this is just the first stage.. after this i have ti setup up all the log to be redirected to SYSLOG and from there i'll send it to SOLR server.. then i have to analyse all the data's that i obtained from DNS, DHCP, WIFI, SWITCES.. and i have to prepare a user based report on his actions.. please help me cause the day's i have keeps reducing.. my project leader is questioning me a lot.. pls.. [...] Well, I am sorry, but at least I strongly feel that we should not be doing your work for you, and especially not if it is a student project, as seems to be the case. If you can address the above points one by one (stay on this thread, please), people should be able to help you. However, it is up to you to get to understand Solr well enough. Regards, Gora
Re: Getting started with writing parser
No, I actually changed the directory to my own, where I stored the log files: /home/exam/apa..solr/example/exampledocs I specified it in a Solr schema. I created a DataImportHandler for that in try.xml, and in it I changed the file name to sample.txt. The new try.xml is at http://pastebin.com/pfVVA7Hs and I changed the log to one word per line, thinking there might be an error in my regex. Now I'm completely stuck. - DINESHKUMAR . M
Re: Getting started with writing parser
On Tue, Jan 25, 2011 at 3:46 PM, Dinesh mdineshkuma...@karunya.edu.in wrote: no i actually changed the directory to mine where i stored the log files.. it is /home/exam/apa..solr/example/exampledocs i specified it in a solr schema.. i created an DataImportHandler for that in try.xml.. then in that i changed that file name to sample.txt that new try.xml is http://pastebin.com/pfVVA7Hs [...] Let us take this one part at a time. In your inner nested entity, entity name=tryli... what do you expect the attribute url=${hathifile.fileAbsolutePath} to resolve to? Regards, Gora
Re: Getting started with writing parser
my solrconfig.xml http://pastebin.com/XDg0L4di my schema.xml http://pastebin.com/3Vqvr3C0 my try.xml http://pastebin.com/YWsB37ZW - DINESHKUMAR . M
Re: Getting started with writing parser
On Mon, Jan 24, 2011 at 2:28 PM, Dinesh mdineshkuma...@karunya.edu.in wrote: my solrconfig.xml http://pastebin.com/XDg0L4di my schema.xml http://pastebin.com/3Vqvr3C0 my try.xml http://pastebin.com/YWsB37ZW [...] OK, thanks for the above. You also need to: * Give us a sample of your log files (for crying out loud, this has got to be the fifth time that I have asked you for this). * Tell us what happens when you run with the above configuration. From a cursory look at try.xml, you have not really understood how it works, or how to configure it for your needs. Regards, Gora
Re: Getting started with writing parser
http://pastebin.com/CkxrEh6h this is my sample log - DINESHKUMAR . M
Re: Getting started with writing parser
I don't even know whether the regex that I'm using for my log is correct or not. I'm very worried that I can't proceed with my project; already a third of the time is over. Please help. This is just the first stage. After this I have to set up all the logs to be redirected to syslog, and from there I'll send them to the Solr server. Then I have to analyse all the data that I obtained from DNS, DHCP, WiFi and switches, and prepare a per-user report of each user's actions. Please help me, because the days I have left keep shrinking and my project leader is questioning me a lot. - DINESHKUMAR . M
Re: Getting started with writing parser
i tried editing the schema file and indexing my own log.. the error that i got is root@karunya-desktop:/home/karunya/apache-solr-1.4.1/example/exampledocs# java -jar post.jar sample.txt SimplePostTool: version 1.2 SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported SimplePostTool: POSTing files to http://localhost:8983/solr/update.. SimplePostTool: POSTing file sample.txt SimplePostTool: FATAL: Solr returned an error: Severe_errors_in_solr_configuration__Check_your_log_files_for_more_detailed_information_on_what_may_be_wrong__If_you_want_solr_to_continue_after_configuration_errors_changeabortOnConfigurationErrorfalseabortOnConfigurationError__in_null___orgapachesolrcommonSolrException_Unknown_fieldtype_text_specified_on_field_month__at_orgapachesolrschemaIndexSchemareadSchemaIndexSchemajava477__at_orgapachesolrschemaIndexSchemainitIndexSchemajava95__at_orgapachesolrcoreSolrCoreinitSolrCorejava520__at_orgapachesolrcoreCoreContainer$InitializerinitializeCoreContainerjava137__at_orgapachesolrservletSolrDispatchFilterinitSolrDispatchFilterjava83__at_orgmortbayjettyservletFilterHolderdoStartFilterHolderjava99__at_orgmortbaycomponentAbstractLifeCyclestartAbstractLifeCyclejava40__at_orgmortbayjettyservletServletHandlerinitializeServletHandlerjava594__at_orgmortbayjettyservletContextstartContextContextjava139__at_orgmortbayjettywebappWebAppContextstartContextWebAppContextjava1218__at_orgmortbayjettyhandlerContextHandlerdoStartContextHandlerjava500__at_orgmortbayjettywebappWebAppContextdoStartWebAppContextjava448__at_orgmortbaycomponentAbstractLifeCyclestartAbstractLifeCyclejava40__at_orgmortbayjettyhandlerHandlerCollectiondoStartHandlerCollectionjava147__at_orgmortbayjettyhandlerContextHandlerCollectiondoStartContextHandlerCollectionjava161__at_orgmortbaycomponentAbstractLifeCyclestartAbstractLifeCyclejava40__at_orgmortbayjettyhandlerHandlerCollectiondoStartHandlerCollectionjava147__at_orgmortbaycomponentAb
stractLifeCyclestartAbstractLifeCyclejava40__at_orgmortbayjettyhandlerHandlerWrapperdoStartHandlerWrapperjava117__at_orgmortbayjettyServerdoStartServerjava210__at_orgmortbaycomponentAbstractLifeCyclestartAbstractLifeCyclejava40__at_orgmortbayxmlXmlConfigurationmain please help me solve this - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2317421.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Getting started with writing parser
On Mon, Jan 24, 2011 at 10:47 AM, Dinesh mdineshkuma...@karunya.edu.in wrote: i tried editing the schema file and indexing my own log.. the error that i got is root@karunya-desktop:/home/karunya/apache-solr-1.4.1/example/exampledocs# java -jar post.jar sample.txt SimplePostTool: version 1.2 SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported SimplePostTool: POSTing files to http://localhost:8983/solr/update.. SimplePostTool: POSTing file sample.txt SimplePostTool: FATAL: Solr returned an error: [...] Most likely, you are trying to send a plain text file to Solr, instead of the XML that it is expecting. Please see http://lucene.apache.org/solr/tutorial.html#Indexing+Data for an example of how to index XML files to Solr via a POST. That tutorial references files in example/exampledocs/ in your Solr source code directory that can serve as examples, e.g., example/exampledocs/solr.xml. You can first check that you can get the built-in Solr examples running, by following the instructions from the beginning of http://lucene.apache.org/solr/tutorial.html. Once that is done, and if you describe the format of your log files and what data you want to retain from them, people can help you further. Regards, Gora
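As a rough illustration of the point above (Solr's update handler expects an XML add message rather than raw text), here is a minimal Python sketch that wraps already-parsed records in the `<add><doc><field name="...">` layout used by the files in example/exampledocs/. The field names and the sample record are hypothetical; whatever names are used have to be declared in schema.xml, and the sketch only builds the payload, it does not post it.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Serialize a list of {field_name: value} dicts as a Solr <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            # ElementTree escapes &, < and > in text, so raw log text is safe here.
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical record: these field names are assumptions, not taken from the thread.
sample = [{"id": "log-1", "month": "Jan", "message": "DHCPACK <ok>"}]
xml_payload = build_add_xml(sample)
print(xml_payload)
```

The resulting string could be saved to a .xml file and sent with post.jar in the same way as the bundled examples.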
Re: Getting started with writing parser
I tried those examples. Is it compulsory to convert it into XML? How does Solr index CSV? Should I post the entire schema that I wrote and the text file that I tried to index? - DINESHKUMAR . M
Re: Getting started with writing parser
On Mon, Jan 24, 2011 at 11:18 AM, Dinesh mdineshkuma...@karunya.edu.in wrote: i tried those examples.. is it compuslory that i should make it into XML, how does it index CSV.. You will have to convert either into XML, or CSV, but neither of those should be too difficult. should i post my entire schema that i made it myself and the text file that i tried to index.. Post the schema. How big is the text file? If it is more than, say 50 lines, put it up on the web somewhere, and post a link to it. If you are going to do that for the text file, also do it for the schema, and post links to both. Regards, Gora
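To make the "convert to CSV" option concrete, here is a small Python sketch that turns log lines into CSV with a header row. The log format is an assumption: the real sample log is only linked from this thread, so the syslog-style DHCP line, the regular expression, and the column names below are all hypothetical and would need adjusting to the actual files.

```python
import csv
import io
import re

# Hypothetical syslog-style DHCP line; adjust the pattern to the real log format.
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+dhcpd:\s+(?P<message>DHCP\w+)\s+(?P<rest>.*)$"
)
FIELDS = ["month", "day", "time", "host", "message", "rest"]


def logs_to_csv(lines):
    """Return CSV text with one row per log line that matches LINE_RE."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not fit the assumed format are skipped
            writer.writerow(match.groupdict())
    return out.getvalue()


sample_lines = [
    "Jan 25 10:05:01 gw1 dhcpd: DHCPACK on 10.0.0.5 to 00:11:22:33:44:55 via eth0",
    "not a dhcp line",
]
csv_text = logs_to_csv(sample_lines)
print(csv_text)
```

The header row names the columns, so each column name would need a matching field in schema.xml before the file is indexed.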
Re: Getting started with writing parser
I did all the configuration correctly. Previously I had missed a configuration file; after adding it I'm getting a new error: Unknown FieldType: 'string' used in QueryElevationComponent. I found that it is defined in solrconfig.xml. I didn't change any line in that file, so I don't know why I am getting the error. - DINESHKUMAR . M
Re: Getting started with writing parser
On Mon, Jan 24, 2011 at 11:54 AM, Dinesh mdineshkuma...@karunya.edu.in wrote: i did all the configurations correctly.. previously i missed a configuration file Sorry, what are you trying to configure now? The built-in Solr example, or the setup for your log files? Did you get the built-in Solr example to work? How were things working earlier that you were getting Solr running, but facing an error on POST. Please proceed systematically, and do not jump back and forth between steps. after adding it i'm getting a new error called Unknown FieldType: 'string' used in QueryElevationComponent i found it was defined in solrconfig.xml [...] Please make your schema.xml, and solrconfig.xml available on the web somewhere, say on http://pastebin.com/ . Regards, Gora P.S. I will not be in network connectivity from now till late tonight, but others might be able to help in the meantime.
Getting started with writing parser
How do I write a parser program that will convert log files into XML?
Re: Getting started with writing parser
On Tue, Jan 18, 2011 at 11:59 AM, Dinesh mdineshkuma...@karunya.edu.in wrote: how to write a parser program that will convert log files into XML.. [...] There is no point in starting multiple threads on this issue, hoping that someone will somehow solve your problem. You have been given the following: * Links that should help you get started, including an example of someone indexing Solr's own logs. * Some ideas on how to proceed. * Requests to try the above suggestions out, and ask specific questions when you run into issues. * A suggestion to contact a local expert in Solr. * Multiple requests for a sample of your log files. Please show some signs that you have tried the above suggestions. Otherwise, I am afraid that it will be difficult, if not impossible, for people on this list to help you out. Regards, Gora
getting started - books/in dept material
I really don't want to understand the code that is IN Solr/Lucene. So I'm looking for books on USING Solr/Lucene and configuring it plus making good queries. Any suggestions for current material? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php
RE: getting started - books/in dept material
Did you miss the wiki? http://wiki.apache.org/solr/SolrResources -Original message- From: Dennis Gearon gear...@sbcglobal.net Sent: Mon 06-09-2010 22:05 To: solr-user@lucene.apache.org; Subject: getting started - books/in dept material I really don't want to understand the code that is IN Solr/Lucene. So I'm looking for books on USING Solr/Lucene and configuring it plus making good queries. Any suggestions for current material? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php
RE: getting started - books/in dept material
Not sure there's enough info there?(NOT, LOL!) ;-) Thanks very much, had missed that. Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Mon, 9/6/10, Markus Jelsma markus.jel...@buyways.nl wrote: From: Markus Jelsma markus.jel...@buyways.nl Subject: RE: getting started - books/in dept material To: solr-user@lucene.apache.org Date: Monday, September 6, 2010, 2:51 PM Did you miss the wiki? http://wiki.apache.org/solr/SolrResources -Original message- From: Dennis Gearon gear...@sbcglobal.net Sent: Mon 06-09-2010 22:05 To: solr-user@lucene.apache.org; Subject: getting started - books/in dept material I really don't want to understand the code that is IN Solr/Lucene. So I'm looking for books on USING Solr/Lucene and configuring it plus making good queries. Any suggestions for current material? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php
Getting started with DIH
I would like to start using DIH to index some RSS feeds and mail folders. To get started I tried the RSS example from the wiki, but as it is, Solr complains about the missing id field. After some experimenting I found two ways to fill the id:
- <copyField source="link" dest="id"/> in schema.xml. This works but isn't very flexible. Perhaps I have other types of records with a real id or a multivalued link field. Then this solution would break.
- Changing the id field to type uuid. Again, I would like to keep real ids where I have them and not a random UUID.
What didn't work, but looks like potentially the best solution, is to fill the id in my data-config by using the link twice:
<field column="link" xpath="/RDF/item/link" />
<field column="id" xpath="/RDF/item/link" />
This would be a definition just for this single data source, but I don't get any docs (also no error message). No trace of any inserts whatsoever. Is it possible to fill the id that way?
Another question, regarding MailEntityProcessor. I found this example:
<document> <entity processor="MailEntityProcessor" user="someb...@gmail.com" password="something" host="imap.gmail.com" protocol="imaps" folders="x,y,z"/> </document>
But what is the dataSource (the enclosing tag to document)? That is, what would a minimal but complete data-config.xml look like to index mails from an IMAP server? And finally, is it possible to combine the definitions for several RSS feeds and mail accounts into one data-config? Or do I need a separate config file and request handler for each of them? -Michael
Re: Getting started with DIH
You have an example on using mail dih in solr distro []s, Lucas Frare Teixeira .·. - lucas...@gmail.com - lucastex.com.br - blog.lucastex.com - twitter.com/lucastex On Sun, Nov 8, 2009 at 1:56 PM, Michael Lackhoff mich...@lackhoff.dewrote: I would like to start using DIH to index some RSS-Feeds and mail folders To get started I tried the RSS example from the wiki but as it is Solr complains about the missing id field. After some experimenting I found out two ways to fill the id: - copyField source=link dest=id/ in schema.xml This works but isn't very flexible. Perhaps I have other types of records with a real id or a multivalued link-field. Then this solution would break. - Changing the id field to type uuid Again I would like to keep real ids where I have them and not a random UUID. What didn't work but looks like the potentially best solution is to fill the id in my data-config by using the link twice: field column=link xpath=/RDF/item/link / field column=id xpath=/RDF/item/link / This would be a definition just for this single data source but I don't get any docs (also no error message). No trace of any inserts whatsoever. Is it possible to fill the id that way? Another question regarding MailEntityProcessor I found this example: document entity processor=MailEntityProcessor user=someb...@gmail.com password=something host=imap.gmail.com protocol=imaps folders = x,y,z/ /document But what is the dataSource (the enclosing tag to document)? That is, how would a minimal but complete data-config.xml look like to index mails from an IMAP server? And finally, is it possible to combine the definitions for several RSS-Feeds and Mail-accounts into one data-config? Or do I need a separate config file and request handler for each of them? -Michael
Re: Getting started with DIH
On 08.11.2009 17:03 Lucas F. A. Teixeira wrote: You have an example on using mail dih in solr distro <blush>Don't know where my eyes were. Thanks!</blush> While I was at it I looked at the schema.xml for the rss example, and it uses link as uniqueKey, which is of course good if you only have rss items, but not so good if you also plan to add other data sources. So I am still interested in a good solution for my id problem: What didn't work but looks like the potentially best solution is to fill the id in my data-config by using the link twice: <field column="link" xpath="/RDF/item/link" /> <field column="id" xpath="/RDF/item/link" /> This would be a definition just for this single data source but I don't get any docs (also no error message). No trace of any inserts whatsoever. Is it possible to fill the id that way? and this one: And finally, is it possible to combine the definitions for several RSS-Feeds and Mail-accounts into one data-config? Or do I need a separate config file and request handler for each of them? Thanks -Michael
Re: Getting started with DIH
If I'm not wrong, you can have several entities in one document, but just one datasource configured. []sm Lucas Frare Teixeira .·. - lucas...@gmail.com - lucastex.com.br - blog.lucastex.com - twitter.com/lucastex On Sun, Nov 8, 2009 at 3:36 PM, Michael Lackhoff mich...@lackhoff.dewrote: On 08.11.2009 17:03 Lucas F. A. Teixeira wrote: You have an example on using mail dih in solr distro blushDon't know where my eyes were. Thanks!/blush When I was at it I looked at the schema.xml for the rss example and it uses link as UniqueKey, which is of course good, if you only have rss items but not so good if you also plan to add other data sources. So I am still interested in a good solution for my id problem: What didn't work but looks like the potentially best solution is to fill the id in my data-config by using the link twice: field column=link xpath=/RDF/item/link / field column=id xpath=/RDF/item/link / This would be a definition just for this single data source but I don't get any docs (also no error message). No trace of any inserts whatsoever. Is it possible to fill the id that way? and this one: And finally, is it possible to combine the definitions for several RSS-Feeds and Mail-accounts into one data-config? Or do I need a separate config file and request handler for each of them? Thanks -Michael
Re: Getting started with DIH
On 08.11.2009 16:56 Michael Lackhoff wrote: What didn't work but looks like the potentially best solution is to fill the id in my data-config by using the link twice: <field column="link" xpath="/RDF/item/link" /> <field column="id" xpath="/RDF/item/link" /> This would be a definition just for this single data source but I don't get any docs (also no error message). No trace of any inserts whatsoever. Is it possible to fill the id that way?
Found the answer in the list archive: use TemplateTransformer:
<field column="link" xpath="/RDF/item/link" />
<field column="id" template="${slashdot.link}" />
Only a minor, cosmetic problem: there are brackets around the id value (like [http://somelink/]). For an id this doesn't really matter, but I would like to understand what is going on here. In the wiki I found only this info: "The rules for the template are same as the templates in 'query', 'url' etc", but I couldn't find any info about those either. Is this documented somewhere? -Michael
Re: Getting started with DIH
The brackets probably come from it being transformed as an array. Try saying multiValued="false" on your field specifications. Erik On Nov 9, 2009, at 12:34 AM, Michael Lackhoff wrote: On 08.11.2009 16:56 Michael Lackhoff wrote: What didn't work but looks like the potentially best solution is to fill the id in my data-config by using the link twice: <field column="link" xpath="/RDF/item/link" /> <field column="id" xpath="/RDF/item/link" /> This would be a definition just for this single data source but I don't get any docs (also no error message). No trace of any inserts whatsoever. Is it possible to fill the id that way? Found the answer in the list archive: use TemplateTransformer: <field column="link" xpath="/RDF/item/link" /> <field column="id" template="${slashdot.link}" /> Only minor and cosmetic problem: there are brackets around the id field (like [http://somelink/]). For an id this doesn't really matter but I would like to understand what is going on here. In the wiki I found only this info: The rules for the template are same as the templates in 'query', 'url' etc but I couldn't find any info about those either. Is this documented somewhere? -Michael
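The bracket effect described above can be imitated outside Solr. The Python sketch below is not DIH's implementation; it is a hypothetical stand-in that resolves `${name}` placeholders and renders list values the way Java prints a list, which is one plausible reading of "transformed as an array". The names `resolve` and `java_style` are invented for this illustration.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def java_style(value):
    """Render a value roughly as Java's List.toString() would: [a, b]."""
    if isinstance(value, list):
        return "[" + ", ".join(str(v) for v in value) + "]"
    return str(value)


def resolve(template, variables):
    """Replace each ${name} in the template with the matching variable."""
    return PLACEHOLDER.sub(lambda m: java_style(variables[m.group(1)]), template)


# A multi-valued field arrives as a list, so the brackets end up in the id.
multi = resolve("${slashdot.link}", {"slashdot.link": ["http://somelink/"]})
# A single-valued field is a plain string and is inserted unchanged.
single = resolve("${slashdot.link}", {"slashdot.link": "http://somelink/"})
print(multi)
print(single)
```

Under that assumption, declaring the source field as single-valued removes the list wrapper before the template is filled in.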
Re: Getting started with DIH
On 09.11.2009 06:54 Erik Hatcher wrote: The brackets probably come from it being transformed as an array. Try saying multiValued="false" on your field specifications. Indeed. Thanks Erik, that was it. My first steps with DIH showed me what a powerful tool this is, but although the DIH wiki page might well be the longest in the whole wiki, there are so many mysteries left for the uninitiated. Is there any other documentation I might have missed? Thanks -Michael
Re: Getting started with DIH
On Mon, Nov 9, 2009 at 12:43 PM, Michael Lackhoff mich...@lackhoff.de wrote: On 09.11.2009 06:54 Erik Hatcher wrote: The brackets probably come from it being transformed as an array. Try saying multiValued="false" on your field specifications. Indeed. Thanks Erik that was it. My first steps with DIH showed me what a powerful tool this is but although the DIH wiki page might well be the longest in the whole wiki there are so many mysteries left for the uninitiated. Is there any other documentation I might have missed? There is an FAQ page, and that is it: http://wiki.apache.org/solr/DataImportHandlerFaq It just started off as a single page, the features piled up, and the page just got bigger. We are thinking of cutting it down into smaller, more manageable pages. Thanks -Michael -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Getting started with DIH
On 09.11.2009 08:20 Noble Paul നോബിള് नोब्ळ् wrote: It just started off as a single page and the features just got piled up and the page just bigger. we are thinking of cutting it down to smaller more manageable pages Oh, I like it the way it is as one page, so that the browser's full-text search can help. It is just that the features and power seem to grow even faster than the wiki page ;-) E.g. I couldn't find out how to add a second RSS feed. I tried with a second entity parallel to the slashdot one but got an exception: java.io.IOException: FULL, whatever that means, so I must be doing something wrong but couldn't find a hint. -Michael
Re: Getting started with DIH
The tried and tested strategy is to post the question in this mailing list w/ your data-config.xml. On Mon, Nov 9, 2009 at 1:08 PM, Michael Lackhoff mich...@lackhoff.de wrote: On 09.11.2009 08:20 Noble Paul നോബിള് नोब्ळ् wrote: It just started of as a single page and the features just got piled up and the page just bigger. we are thinking of cutting it down to smaller more manageable pages Oh, I like it the way it is as one page, so that the browser full text search can help. It is just that the features and power seem to grow even faster than the wike page ;-) E.g. I couldn't find a way how to add a second rss feed. I tried with a second entity parallel to the slashdot one but got an exception: java.io.IOException: FULL whatever that means, so I must be doing something wrong but couldn't find a hint. -Michael -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: help getting started with spell check dictionary
I'm guessing it is because you have your spell checker mapped to the spellchecker request handler, but you are asking the standard request handler to build the spell checker. Unless you've modified the standard request handler, it is not spell-check aware. Try http://localhost:8983/solr/select?qt=spellchecker&q=cancr&... Otherwise, what you most likely want is to add the SpellCheckComponent to your standard handler. See the comments in the example solrconfig.xml for more on this. -Grant On Aug 5, 2009, at 8:52 AM, Ian Connor wrote: Hi, I have downloaded a dictionary in plain text format from http://icon.shef.ac.uk/Moby/mwords.html and added it to my /mnt directory. When I tried to add: <lst name="dictionary"> <str name="name">external</str> <str name="type">org.apache.solr.spelling.FileBasedSpellChecker</str> <str name="sourceLocation">/mnt/dictionary.txt</str> <str name="fieldType">text</str> </lst> within the <requestHandler name="spellchecker" class="solr.SpellCheckerRequestHandler" startup="lazy"> block, I thought it would be as easy as running a query like: http://localhost:8983/solr/select/?q=cancr&spellcheck=true&spellcheck.build=true to get it to work. Can anyone tell me what steps I am missing here? Thanks for any help here. I was trying to get the idea from the example here: https://issues.apache.org/jira/browse/SOLR-572 after reading through http://wiki.apache.org/solr/SpellCheckComponent -- Regards, Ian Connor -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
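The difference between the two requests discussed above comes down to the `qt` parameter, which selects the request handler. A small Python sketch of assembling such URLs follows; it uses only the host, path and parameter names that appear in this thread, and `build_select_url` is a hypothetical helper name. Which parameters a given handler actually honours depends on how it is configured in solrconfig.xml, so this only shows how the query string is put together.

```python
from urllib.parse import urlencode

# Base URL as used in the thread; a different host or port is equally possible.
BASE = "http://localhost:8983/solr/select"


def build_select_url(params):
    """Join query parameters with '&' and percent-encode them where needed."""
    return BASE + "?" + urlencode(params)


# The original request: no qt, so the standard handler answers it.
standard = build_select_url(
    {"q": "cancr", "spellcheck": "true", "spellcheck.build": "true"}
)
# The same query routed to the handler registered under the name "spellchecker".
routed = build_select_url({"qt": "spellchecker", "q": "cancr"})
print(standard)
print(routed)
```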
help getting started with spell check dictionary
Hi, I have downloaded a dictionary in plain text format from http://icon.shef.ac.uk/Moby/mwords.html and added it to my /mnt directory. When I tried to add:
<lst name="dictionary"> <str name="name">external</str> <str name="type">org.apache.solr.spelling.FileBasedSpellChecker</str> <str name="sourceLocation">/mnt/dictionary.txt</str> <str name="fieldType">text</str> </lst>
within the <requestHandler name="spellchecker" class="solr.SpellCheckerRequestHandler" startup="lazy"> block, I thought it would be as easy as running a query like: http://localhost:8983/solr/select/?q=cancr&spellcheck=true&spellcheck.build=true to get it to work. Can anyone tell me what steps I am missing here? Thanks for any help here. I was trying to get the idea from the example here: https://issues.apache.org/jira/browse/SOLR-572 after reading through http://wiki.apache.org/solr/SpellCheckComponent -- Regards, Ian Connor
Re: getting started
: http://lucene.apache.org/solr/tutorial.html#Getting+Started : : link - lucene QueryParser syntax fixed in svn, ... the site should update in about 30 minutes. thanks for pointing this out. -Hoss
getting started
Hi Some of the getting started link dont work. Can you please enable it?
Re: getting started
Which links? Please be as specific as possible. Erick On Wed, Mar 25, 2009 at 1:20 PM, nga pham nga.p...@gmail.com wrote: Hi Some of the getting started link dont work. Can you please enable it?
Re: getting started
Oops, my mistake. Sorry for the trouble. On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson erickerick...@gmail.com wrote: Which links? Please be as specific as possible. Erick On Wed, Mar 25, 2009 at 1:20 PM, nga pham nga.p...@gmail.com wrote: Hi Some of the getting started link dont work. Can you please enable it?
Re: getting started
http://lucene.apache.org/solr/tutorial.html#Getting+Started link - lucene QueryParser syntax is not working On Wed, Mar 25, 2009 at 10:48 AM, nga pham nga.p...@gmail.com wrote: Oops my mistake. Sorry for the trouble On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson erickerick...@gmail.com wrote: Which links? Please be as specific as possible. Erick On Wed, Mar 25, 2009 at 1:20 PM, nga pham nga.p...@gmail.com wrote: Hi Some of the getting started link dont work. Can you please enable it?
Re: getting started
OK, now I'll turn it over to the folks who actually maintain that site <G>. Meanwhile, here's the link to the 2.4.1 query syntax. http://lucene.apache.org/java/2_4_1/queryparsersyntax.html Best Erick On Wed, Mar 25, 2009 at 2:00 PM, nga pham nga.p...@gmail.com wrote: http://lucene.apache.org/solr/tutorial.html#Getting+Started link - lucene QueryParser syntax is not working On Wed, Mar 25, 2009 at 10:48 AM, nga pham nga.p...@gmail.com wrote: Oops my mistake. Sorry for the trouble On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson erickerick...@gmail.com wrote: Which links? Please be as specific as possible. Erick On Wed, Mar 25, 2009 at 1:20 PM, nga pham nga.p...@gmail.com wrote: Hi Some of the getting started link dont work. Can you please enable it?
Getting started with Solr
Hi, I'm very new to search engines in general. I've previously used the Zend_Search_Lucene PHP class to try out Lucene, and though it works, it's not what I'm looking for performance-wise. I recently installed Solr on a freshly installed Ubuntu (Hardy Heron) machine. I currently have about 207k docs, and I'll be getting about 100k more each month from now on, which is why I decided to throw myself into something real for once. As I'm only starting to learn today, I was wondering about two main things.

I'm using Jetty as the Java container, and PHP5 to handle the search requests from an agent. If I start Solr using java -jar start.jar in the example directory, everything works fine. I even managed to populate the index with the example data as documented in the tutorials. How can I set up Solr to run as a service, so I don't need to keep an SSH connection open? Sorry for being stupid here, btw.

I'm working toward a multilingual search. Say a company (doc) exists in Poland: what schema design should I read up on or work on so that I can write Poland/Polen/Polska (Poland in different languages) and still hit the same results? I have the data from geonames.org for this, but I can't really grasp how I should set up schema.xml. The easiest solution would be to populate each document with every possible hit word, but this would give me a bunch of duplicates.

Yours, Martin Iwanowski
Re: Getting started with Solr
> How can I set up Solr to run as a service, so I don't need to have an SSH connection open? Sorry for being stupid here btw.

This is largely independent of Solr; you have to look at how to do it for the OS you are running on. On Ubuntu, you could simply launch Solr with nohup to keep it from stopping when you log off, or look into writing an init.d/rc startup script that launches Solr (just google it).

> I'm working to have a multilingual search. So a company (doc) exists in say Poland, what design of schema should I read/work on to be able to write Poland/Polen/Polska (Poland in different languages) and still hit the same results. I have the data from geonames.org for this, but I can't really grasp how I should be working the schema.xml. The easiest solution would be to populate each document with each possible hit word, but this would give me a bunch of duplicates.

I'm not sure I understand you completely, but one option might be to index each language into a separate field and search over those fields separately or together as needed. Another option, if there is a lot of overlap, might be to use something like a synonym-type analyzer: put the tokens that differ between languages at the same position in the index. Of course, this immediately gets difficult if one language has two tokens for a word and another has one. This could get tricky quickly, depending on what queries you need to support, how they should work, etc.

- Mark
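
For the nohup suggestion above, here is a minimal sketch of what that could look like. The helper name start_detached and the file names solr.log and solr.pid are made up for illustration; only nohup and the java -jar start.jar command come from the thread. It is a stopgap rather than a proper service: nothing restarts Solr after a reboot, which is where an init.d script would come in.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr): run a command detached from the
# terminal so that it keeps running after the SSH session is closed.
#   $1 = log file, $2 = PID file, remaining arguments = command to run
start_detached() {
    log_file=$1
    pid_file=$2
    shift 2
    # nohup makes the command ignore the hangup signal sent at logout.
    # stdin, stdout and stderr are redirected so nothing stays attached
    # to the terminal.
    nohup "$@" > "$log_file" 2>&1 < /dev/null &
    # Remember the PID of the background job so it can be stopped later.
    echo $! > "$pid_file"
}

# Assumed usage, run from Solr's example directory as in the tutorial:
#   start_detached solr.log solr.pid java -jar start.jar
# To stop it later:
#   kill "$(cat solr.pid)"
```

Writing the PID to a file is a design choice that makes stopping the process straightforward without searching the process list; an init.d script would typically do the same thing in its start and stop actions.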