RE: solr-xslt question
Yes, that was it! Thanks.

Bent

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Thursday, December 22, 2011 11:12 AM
To: solr-user@lucene.apache.org
Subject: Re: solr-xslt question

You're probably hitting the default limit on a field. This is set in solrconfig.xml, in the maxFieldLength element. The first thing I'd try is upping that to, say, 1000, reindexing, and seeing if that fixes your problem. This is the number of *tokens*, not characters. Roughly the number of words...

Searching for the common word is probably a complete red herring.

Best
Erick

On Wed, Dec 21, 2011 at 4:36 PM, Bent Jensen bentjen...@yahoo.com wrote:

Being new to XML/XSLT/Solr, I am hoping someone can explain or help me with the following.

I am using Apache Solr 3.4.0. I have a PHP page for submitting the search and displaying the result in HTML. I indexed a 1.5 MB PDF document (400 pages). Using the admin interface with a *:* query, everything is returned. I then tried using highlighting in the query, and modified the XSL file to return the highlighting. It works fine for text at the beginning of the document. I can also query with a phrase between double quotes and it returns the exact match.

When searching content beyond approximately the first 100 pages, I see this behavior: I must include common words in a phrase to get a result returned. For example, if I search using the word handymen, which appears in only one place towards the end of the document, nothing is returned. But if I add a common word that appears in the sentence where handymen is, e.g. 'handymen that', then both are returned in the highlighting, including many other occurrences of 'that'. If I query with "handymen that", nothing is returned.

thanks
Ben
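
For readers following this thread, here is a rough sketch of where the maxFieldLength element mentioned in the reply lives. It assumes the Solr 3.x layout of solrconfig.xml, where the element sits under indexDefaults (and may be repeated under mainIndex, which would take precedence). The number shown is an arbitrary placeholder, not a value taken from the thread.

```xml
<!-- solrconfig.xml fragment (Solr 3.x layout assumed; other settings omitted) -->
<config>
  <indexDefaults>
    <!-- Maximum number of tokens indexed per field; tokens past this
         limit are silently dropped at index time.
         100000 is an illustrative placeholder only. -->
    <maxFieldLength>100000</maxFieldLength>
  </indexDefaults>
</config>
```

Check the value already present in your own file first: raising the limit only helps if the new number is larger than the existing one. Because the limit is applied at index time, the document has to be reindexed after the change, as the reply notes.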
solr-xslt question
Being new to XML/XSLT/Solr, I am hoping someone can explain or help me with the following.

I am using Apache Solr 3.4.0. I have a PHP page for submitting the search and displaying the result in HTML. I indexed a 1.5 MB PDF document (400 pages). Using the admin interface with a *:* query, everything is returned. I then tried using highlighting in the query, and modified the XSL file to return the highlighting. It works fine for text at the beginning of the document. I can also query with a phrase between double quotes and it returns the exact match.

When searching content beyond approximately the first 100 pages, I see this behavior: I must include common words in a phrase to get a result returned. For example, if I search using the word handymen, which appears in only one place towards the end of the document, nothing is returned. But if I add a common word that appears in the sentence where handymen is, e.g. 'handymen that', then both are returned in the highlighting, including many other occurrences of 'that'. If I query with "handymen that", nothing is returned.

thanks
Ben
reposting highlighting questions
I am new to Solr/XML/XSLT, and am trying to figure out how to display search query fields highlighted in HTML. I can enable the highlighting in the query, and I think I get the correct XML response back (see below: I search using 'Contents' and the highlighting is shown with <strong> and </strong>). However, I cannot figure out what to add to the XSLT file to transform it into HTML. I think it is a question of defining the appropriate XPath(?), but I am stuck. Can someone point me in the right direction?

Thanks in advance!

Here is the result I get back:

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">20</int>
    <lst name="params">
      <str name="explainOther" />
      <str name="indent">on</str>
      <str name="hl.simple.pre">'&lt;strong&gt;'</str>
      <str name="hl.fl">*</str>
      <str name="wt" />
      <str name="hl">on</str>
      <str name="rows">10</str>
      <str name="version">2.2</str>
      <str name="fl" />
      <str name="start">0</str>
      <str name="q">contents</str>
      <str name="hl.simple.post">'&lt;/strong&gt;'</str>
      <str name="qt" />
      <str name="fq" />
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <arr name="content">
        <str>Start with the Table of Contents. See if you can find the topic that you are interested in. Look through the section to see if there is a resource that can help you. If you find one, you may want to attach a Post-it tab so you can find the page later. Write down all of the information that you need to find out more information about the resource: agency name, name of contact person, telephone number, email and website addresses. If you were unable to find a resource that will help you in this resource guide, a good first step would be to call your local Independent Living Center. They will have a good idea of what is available in your area. A second step would be to call or email us at the Rehabilitation Research Center. We have a ROBOT resource specialist who may be able to assist.</str>
      </arr>
      <arr name="doclink">
        <str>robot.pdf#page=11</str>
      </arr>
      <str name="heading1">CHAPTER 1: How to Use This Resource Guide</str>
      <str name="id">1-1</str>
    </doc>
  </result>
  <lst name="highlighting">
    <lst name="1-1">
      <arr name="content">
        <str>Start with the Table of '&lt;strong&gt;'Contents'&lt;/strong&gt;'. See if you can find the topic that you are interested in. Look</str>
      </arr>
    </lst>
  </lst>
</response>
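
One possible way to approach the XPath question is sketched below. It is a minimal XSLT 1.0 stylesheet, not the stylesheet from the Solr example, and it assumes the response shape shown in this message: each doc carries an id field, and the highlighting section holds one lst per hit whose name attribute equals that id. The field names id, heading1 and content come from the response above; the HTML layout and the class name "hit" are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Entry point: render every hit in the result list. -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="response/result/doc"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="doc">
    <!-- The highlighting section is keyed by the same value as the id field. -->
    <xsl:variable name="id" select="str[@name='id']"/>
    <div class="hit">
      <h3><xsl:value-of select="str[@name='heading1']"/></h3>
      <!-- One paragraph per highlighted snippet of the content field. -->
      <xsl:for-each select="/response/lst[@name='highlighting']/lst[@name=$id]/arr[@name='content']/str">
        <!-- The snippet holds the pre/post markers as escaped text, so output
             escaping is disabled to let them through as real tags. -->
        <p><xsl:value-of select="." disable-output-escaping="yes"/></p>
      </xsl:for-each>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

Two caveats. Support for disable-output-escaping is optional in XSLT 1.0, so whether the markers come through as tags depends on the processor and on how the result is serialized. Also, the single quotes typed around the hl.simple.pre and hl.simple.post values are part of those values, so they would show up literally around each highlighted term; setting the parameters to the bare tags should avoid that.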
highlighting questions
I am trying to figure out how to display search query fields highlighted in HTML. I can enable the highlighting in the query, and I think I get the correct response back (see below: I search using 'Contents' and the highlighting is shown with <strong> and </strong>). However, I can't figure out what to add to the XSLT file to display it in HTML. I think it is a question of defining the appropriate XPath(?), but I am stuck. Can someone point me in the right direction?

Thanks in advance!

Here is the result I get back:

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">20</int>
    <lst name="params">
      <str name="explainOther" />
      <str name="indent">on</str>
      <str name="hl.simple.pre">'&lt;strong&gt;'</str>
      <str name="hl.fl">*</str>
      <str name="wt" />
      <str name="hl">on</str>
      <str name="rows">10</str>
      <str name="version">2.2</str>
      <str name="fl" />
      <str name="start">0</str>
      <str name="q">contents</str>
      <str name="hl.simple.post">'&lt;/strong&gt;'</str>
      <str name="qt" />
      <str name="fq" />
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <arr name="content">
        <str>Start with the Table of Contents. See if you can find the topic that you are interested in. Look through the section to see if there is a resource that can help you. If you find one, you may want to attach a Post-it tab so you can find the page later. Write down all of the information that you need to find out more information about the resource: agency name, name of contact person, telephone number, email and website addresses. If you were unable to find a resource that will help you in this resource guide, a good first step would be to call your local Independent Living Center. They will have a good idea of what is available in your area. A second step would be to call or email us at the Rehabilitation Research Center. We have a ROBOT resource specialist who may be able to assist. You can reach Lois Roberts, the “Back On Track …To Success” Mentoring Program Assistant, at 408-793-6426 or email her at lois.robe...@hhs.sccgov.org</str>
      </arr>
      <arr name="doclink">
        <str>robot.pdf#page=11</str>
      </arr>
      <str name="heading1">CHAPTER 1: How to Use This Resource Guide</str>
      <str name="id">1-1</str>
    </doc>
  </result>
  <lst name="highlighting">
    <lst name="1-1">
      <arr name="content">
        <str>Start with the Table of '&lt;strong&gt;'Contents'&lt;/strong&gt;'. See if you can find the topic that you are interested in. Look</str>
      </arr>
    </lst>
  </lst>
</response>
how to transform a URL (newbie question)
I am a beginner with Solr and need to ask the following: using the apache-solr example, how can I display a URL in the XML document as an active link in HTML? Do I need to add some special transform to the example.xslt file?

thanks
Ben
RE: how to transform a URL (newbie question)
Erik,

OK, I will look at that. Basically, what I am trying to do is to index a document with lots of URLs. I also index the URL and give it a field type. I don't know much about Solr yet, but thought maybe I could transform the URL into an active link, i.e. 'a href'. I tried putting the href into the XML document, but it just prints out as text in HTML. I also could not find any XSLT transform or schema.

thanks
Ben

-----Original Message-----
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: Sunday, November 20, 2011 9:05 AM
To: solr-user@lucene.apache.org
Subject: Re: how to transform a URL (newbie question)

Ben,

Not quite sure how to interpret what you're asking here. Are you speaking of the /browse view? If so, you can tweak the templates under conf/velocity to make links out of things. But generally, it's the end application that would take the results from Solr and render links as appropriate.

Erik

On Nov 20, 2011, at 11:53, Bent Jensen wrote:

I am a beginner with Solr and need to ask the following: using the apache-solr example, how can I display a URL in the XML document as an active link in HTML? Do I need to add some special transform to the example.xslt file?

thanks
Ben
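
If the rendering is done with an XSLT stylesheet rather than in the end application, a link can be built from a stored URL field along the following lines. This is only a sketch under assumptions: the field name doclink is borrowed from the later highlighting thread and may not match the schema discussed here, and the surrounding HTML is invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>

  <xsl:template match="/">
    <html>
      <body>
        <xsl:for-each select="response/result/doc">
          <p>
            <!-- Handles both a multi-valued field (arr/str) and a
                 single-valued one (str); "doclink" is an assumed name. -->
            <xsl:for-each select="arr[@name='doclink']/str | str[@name='doclink']">
              <!-- The {.} attribute value template copies the URL text
                   into href, producing a clickable link. -->
              <a href="{.}"><xsl:value-of select="."/></a>
            </xsl:for-each>
          </p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The idea is that the index stores only the bare URL, and the anchor element is created at render time. Markup stored inside the field value would be escaped in Solr's XML response, which would explain it appearing as plain text.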
RE: question from a beginner
Yes, that certainly crossed my mind, but I have no idea how to do that. Would I need to pick a unique keyword from every paragraph and use that for the index?

-----Original Message-----
From: Michael Sokolov [mailto:soko...@ifactory.com]
Sent: Monday, October 31, 2011 5:20 AM
To: solr-user@lucene.apache.org
Cc: Phil Scadden
Subject: Re: question from a beginner

You might also consider indexing each paragraph as a separate document if the documents are very large.

-Mike

On 10/30/2011 11:51 PM, Phil Scadden wrote:

Look up highlighting.
http://wiki.apache.org/solr/HighlightingParameters
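
As a rough illustration of the paragraph-per-document suggestion: no keyword needs to be picked, since each paragraph only needs its own unique id. The sketch below uses Solr's XML update message format. The field names (id, heading1, doclink, content) and the first document's values are taken from the response posted in the later highlighting thread; the second document's id is hypothetical, and all four fields are assumed to be defined in schema.xml.

```xml
<!-- XML update message: one Solr document per paragraph or section -->
<add>
  <doc>
    <field name="id">1-1</field>
    <field name="heading1">CHAPTER 1: How to Use This Resource Guide</field>
    <field name="doclink">robot.pdf#page=11</field>
    <field name="content">Start with the Table of Contents. See if you can find the topic that you are interested in.</field>
  </doc>
  <doc>
    <!-- Hypothetical id for the next paragraph of the same chapter -->
    <field name="id">1-2</field>
    <field name="heading1">CHAPTER 1: How to Use This Resource Guide</field>
    <field name="doclink">robot.pdf#page=11</field>
    <field name="content">If you were unable to find a resource that will help you in this resource guide, a good first step would be to call your local Independent Living Center.</field>
  </doc>
</add>
```

With the text split this way, a search returns one hit per matching paragraph instead of one hit per file, and each hit can carry its own link back into the source document.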
question from a beginner
Not sure if this is appropriate for this list, but I will try anyway and hope to get a few pointers.

I am trying to help a Rehabilitation Research Center set up a document search on their website (as a volunteer). They have a Word document with a lot of information about resources and contact places for rehab patients. So, for example, if searching on Santa Clara, I would like to display all sections/paragraphs where Santa Clara occurs in the document.

I am trying to use the Solr example, and have indexed the document. However, when I do a search on Santa Clara, I only get fields back such as author, content_type, id, etc. Also, it only shows one search result per file, even though the file has multiple occurrences of the search word. If I use q=*:*, then all the document text is returned.

I hope you can point me in the right direction.

thanks in advance
Ben