solr multicore problem on SLES 11
Hello,

I have a problem with Solr and multicores on SLES 11 SP 2. I have 3 cores, each with more than 20 segments. When I try to start tomcat6, it cannot start the CoreContainer:

Caused by: java.lang.OutOfMemoryError: Map failed at sun.nio.ch.FileChannelImpl.map0(Native Method)

I have read a lot about this problem, but I have not found a solution. The strange thing is: it works fine under openSuSE 12.x, tomcat6, openjdk. But the virtual machine with SLES 11 SP 2, tomcat6, openjdk crashes. Both tomcat/java configurations are the same. Does anybody have an idea how to solve this problem? I have another SLES machine with 5 cores, but each has only 1 segment (a very small index), and this machine runs fine.

Greetings

Jochen

--
Dr. rer. nat. Jochen Lienhard
Dezernat EDV
Albert-Ludwigs-Universität Freiburg
Universitätsbibliothek
Rempartstr. 10-16 | Postfach 1629
79098 Freiburg | 79016 Freiburg
Telefon: +49 761 203-3908
E-Mail: lienh...@ub.uni-freiburg.de
Internet: www.ub.uni-freiburg.de
Re: solr multicore problem on SLES 11
The first thing I would check is the virtual memory limit (ulimit -v; check this for the operating system user that runs Tomcat/Solr). It should be set to unlimited, but as far as I remember that is not the default setting on SLES 11. Since 3.1, Solr maps the index files into virtual memory, so if your index files are larger than the allowed virtual memory, it may fail.

Regards,
André

From: Jochen Lienhard [lienh...@ub.uni-freiburg.de]
Sent: Monday, September 17, 2012 09:17
To: solr-user@lucene.apache.org
Subject: solr multicore problem on SLES 11
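For reference, a minimal way to check and raise the limit for the user running Tomcat (the account name and init-script location are assumptions; adjust for your SLES setup):

su - tomcat -s /bin/sh -c 'ulimit -v'    # print the current virtual memory limit

# in the Tomcat init script (or /etc/profile.local), before the JVM starts:
ulimit -v unlimited

If ulimit reports a finite number smaller than the combined size of your index files, mmap can fail with exactly this OutOfMemoryError.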
Re: solr multicore problem on SLES 11
Great. Thanks. That solves my problem.

Greetings

Jochen

André Widhani wrote:

> The first thing I would check is the virtual memory limit (ulimit -v; check this for the operating system user that runs Tomcat/Solr).

--
Dr. rer. nat. Jochen Lienhard
Dezernat EDV
Albert-Ludwigs-Universität Freiburg
Universitätsbibliothek
Rempartstr. 10-16 | Postfach 1629
79098 Freiburg | 79016 Freiburg
Telefon: +49 761 203-3908
E-Mail: lienh...@ub.uni-freiburg.de
Internet: www.ub.uni-freiburg.de
Re: Question about Fuzzy search in Solr
Hello!

Is this what you are looking for: https://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/queryparsersyntax.html#Fuzzy%20Searches ?

--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

> Hi, I need to know how we can implement fuzzy searches using Solr. Can someone provide any links to any relevant documentation?
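For the 3.x parser that page describes, fuzzy search is just the tilde operator on a single term (the field name here is assumed):

q=name:roam~       default minimum similarity of 0.5
q=name:roam~0.8    require a higher similarity (0.0-1.0)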
Re: Only exact match searches working
Thank you for the reply. I have done a bit of reading and it says I can also use this one:

<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="30"/>

This is what I will use, I think, as it weeds out words like "at" and "I" as a bonus.
Taking a full text, then truncate and duplicate with stopwords
I've hit a bit of a wall and would appreciate some guidance. I want to index a large block of text, like such:

I don't want to store this as it is in Solr; I want to instead have two versions of it. One as a truncated form, and one as a keyword form.

*Truncated Form:*

*Keyword Form (using stopwords to remove common words):*

How should I be doing this? Purely with index analyzers?
Re: Taking a full text, then truncate and duplicate with stopwords
> I don't want to store this as it is in Solr, I want to instead have two versions of it. One as a truncated form, and one as a keyword form.
> *Truncated Form:*

If the truncated form means the first N characters, then copyField can be used: http://wiki.apache.org/solr/SchemaXml#Copy_Fields

> *Keyword Form (using stopwords to remove common words):*

Are you going to use this keyword form for searching or for displaying purposes?
Re: Question about Fuzzy search in Solr
Thanks. Is any extra configuration needed on the Solr side to make this work? Any additional text files like synonyms.txt, any additional fields, or any changes in schema.xml or solrconfig.xml?

On Mon, Sep 17, 2012 at 4:45 PM, Rafał Kuć <r@solr.pl> wrote:

> Is this what you are looking for: https://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/queryparsersyntax.html#Fuzzy%20Searches ?

--
Thanks and Regards
Rahul A. Warawdekar
Re: 1.3 to 3.6 migration
Hi Jack,

Thanks. Even though I have set useCompoundFile to true in the indexConfig section for the 3.6 version, it still seems to create normal index files. Attached is the solrconfig.xml. Please let me know if anything is wrong.

Regards
Sujatha

On Sat, Sep 15, 2012 at 9:43 PM, Jack Krupansky <j...@basetechnology.com> wrote:

Correcting myself, for #4: Solr doesn't analyze string fields such as the unique key field, but... a transformer or other logic, say in DIH, that constructs the document key values might behave differently between Solr 1.3 and 3.6. Maybe there was a bug in 1.3 that caused distinct keys to map to the same value (causing documents to be discarded), but now in 3.6 the mapping is correct and distinct (and more documents are correctly indexed).

-- Jack Krupansky

-----Original Message----- From: Jack Krupansky Sent: Saturday, September 15, 2012 10:34 AM To: solr-user@lucene.apache.org Subject: Re: 1.3 to 3.6 migration

Try some queries in both the old and the new and identify some documents that appear in one and not the other. Then examine a couple of those docs in detail, one field at a time, and see if anything is suspicious. Take each field value and enter it into the Solr Admin Analysis page to see how Solr 3.6 analyzes the field value compared to 1.3. Four likely scenarios:

1. The additional docs were not present when you indexed with 1.3.
2. Your indexing tool (DIH, or whatever) may have discarded the docs in 1.3 due to some issue that has now been resolved.
3. Solr 1.3 got an error on those documents but your indexing process continued despite the error, while Solr 3.6 may not have hit those errors, possibly because it is more flexible and has more features now.
4. Your key values analyze differently in Solr 3.6, so that the keys of the extra documents mapped to other existing keys in Solr 1.3, causing the extra documents to overwrite existing documents in Solr 1.3.

-- Jack Krupansky

-----Original Message----- From: Sujatha Arun Sent: Saturday, September 15, 2012 2:39 AM To: solr-user@lucene.apache.org Subject: Re: 1.3 to 3.6 migration

Can you please elaborate?

Regards Sujatha

On Sat, Sep 15, 2012 at 1:34 AM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:

Hi,

Maybe your indexer is different/modified/buggy?

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html

On Fri, Sep 14, 2012 at 3:23 PM, Sujatha Arun <suja.a...@gmail.com> wrote:

Hi,

Just migrated to 3.6.1 from the 1.3 version, with the following observations (indexed content using the same source):

                                  1.3       3.6.1
Number of documents indexed       11505     13937
Index time (full index)           170ms     171ms
Index size                        23 MB     31 MB
Query time (first time) for *:*   44 ms     187 ms

Also, the *:* query is not cached in the query result cache in 3.6.1; is this expected?

Some points: even though I used the same data source, the number of documents indexed seems to be higher in 3.6.1 (not sure why?). All the other params, including index size and query time, seem to be higher instead of lower in 3.6.1, and queries are not getting cached in 3.6.1. Attached are the schemas - any pointers?

Regards Sujatha
[attached solrconfig.xml, excerpt; Apache license header omitted]

<config>
  <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>
  <luceneMatchVersion>LUCENE_36</luceneMatchVersion>
  <!-- The DirectoryFactory to use for indexes.
       solr.StandardDirectoryFactory, the default, is filesystem based.
       solr.RAMDirectoryFactory is memory based, not persistent, and doesn't work with replication. -->
  <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
  <indexConfig>
    <!-- Values here affect all index writers and act as a default unless overridden. -->
    <useCompoundFile>true</useCompoundFile>
    <mergeFactor>4</mergeFactor>
Re: Question about Fuzzy search in Solr
Hello!

There is no need to include any changes or additional components to have fuzzy search working in Solr.

--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

> Thanks. Is any extra configuration needed on the Solr side to make this work? Any additional text files like synonyms.txt, any additional fields, or any changes in schema.xml or solrconfig.xml?
Re: Question about Fuzzy search in Solr
Got it. Thanks Rafał!

On Mon, Sep 17, 2012 at 6:37 PM, Rafał Kuć <r@solr.pl> wrote:

> There is no need to include any changes or additional components to have fuzzy search working in Solr.

--
Thanks and Regards
Rahul A. Warawdekar
Stats field with decimal values
Hello everyone,

When I'm using the stats=true&stats.field=product_price parameters, Solr returns the following structure:

<lst name="stats">
  <lst name="stats_fields">
    <lst name="produto_preco">
      <double name="min">1.0</double>
      <double name="max">1.0</double>
      <long name="count">7</long>
      <long name="missing">0</long>
      <double name="sum">7.0</double>
      <double name="sumOfSquares">7.0</double>
      <double name="mean">1.0</double>
      <double name="stddev">0.0</double>
    </lst>
  </lst>
</lst>

What I'm looking at is these two:

<double name="min">1.0</double>
<double name="max">1.0</double>

Is it possible for them to be returned with two decimal places? Like this:

<double name="min">1.00</double>
<double name="max">1.00</double>

Thanks!
Re: Taking a full text, then truncate and duplicate with stopwords
Purely for searching. The truncated form is just to show to the user as a preview, and the keyword form is for the keyword searching.
Re: Stats field with decimal values
Could you clue us in as to why this is important to you? I mean, any modern programming language should be capable of parsing 1.0 if it can parse 1.00.

-- Jack Krupansky

-----Original Message----- From: Gustav Sent: Monday, September 17, 2012 9:19 AM To: solr-user@lucene.apache.org Subject: Stats field with decimal values
Re: Indexing PDF-Files using Solr Cell
Add fmap.content=your-stored-field to the URL. Or, if your schema doesn't already have a content field, add one that is stored and it will automatically be used.

-- Jack Krupansky

-----Original Message----- From: Alexander Troost Sent: Monday, September 17, 2012 1:12 AM To: solr-user@lucene.apache.org Subject: Re: Indexing PDF-Files using Solr Cell

Thank you for your response. I'm writing my bachelor's thesis about Solr and my company doesn't want me to use a beta version. I don't want to be annoying, but how do I direct the content to a stored field and so on... in the URL I use for the HTTP POST? In a config file?

2012/9/17 Jack Krupansky <j...@basetechnology.com>

Be sure to direct the content to a stored field (such as "content") which you can add to your fl field list to return. Then use a copyField to copy that stored field to the text field for searching. Again, this is all simplified in Solr 4.0-BETA.

-- Jack Krupansky

-----Original Message----- From: Alexander Troost Sent: Sunday, September 16, 2012 11:59 PM To: solr-user@lucene.apache.org Subject: Re: Indexing PDF-Files using Solr Cell

Hi, first of all: thank you for that quick response! But I am not sure if I am doing this right. From my point of view the command now has to look like:

curl "http://localhost:8983/solr/update/extract?literal.id=doc11&literal.filename=markus&fmap.content=text&commit=true" -F myfile=@markus.pdf

When I am searching now for text in the PDF, I am getting the result:

<result name="response" numFound="1" start="0">
  <doc>
    <str name="author">A28240</str>
    <arr name="content_type"><str>application/pdf</str></arr>
    <str name="id">doc11</str>
    <date name="last_modified">2012-09-17T03:49:39Z</date>
  </doc>
</result>

SORRY for being such a newbie, and sorry for my bad English. It's 6 AM here and I spent the whole night at the computer :-)

Greetz A

2012/9/17 Jack Krupansky <j...@basetechnology.com>

The content will be sent to the "content" field, which you can redirect using the fmap.content=some-field request parameter. You need to explicitly set the file name field yourself, using the literal.your-file-name-field=file-name request parameter. Also, if using Solr 4.0-BETA, you can simply use the SimplePostTool (post.jar) to send documents to SolrCell, which will automatically take care of these extra steps.

-- Jack Krupansky

-----Original Message----- From: Alexander Troost Sent: Sunday, September 16, 2012 10:16 PM To: solr-user@lucene.apache.org Subject: Indexing PDF-Files using Solr Cell

Hello *,

I've got a problem indexing and searching PDF files. It seems like Solr doesn't index the name of the file. In return I only get:

<result name="response" numFound="1" start="0">
  <doc>
    <str name="author">A28240</str>
    <arr name="content_type"><str>application/pdf</str></arr>
    <str name="id">doc5</str>
    <date name="last_modified">2012-09-17T01:45:39Z</date>
  </doc>
</result>

It finds the right document, but no content or title is displayed in the XML response. Where do I configure that? I index my documents (right now) via curl, e.g.:

curl "http://localhost:8983/solr/update/extract?literal.id=doc7&commit=true" -F myfile=@xyz.pdf

Where is my mistake?

Greeting Alex
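A minimal sketch of what this looks like end to end (the field name "content", the text_general type, and the file are assumptions; the field must be stored to show up in results):

<!-- schema.xml -->
<field name="content" type="text_general" indexed="true" stored="true" multiValued="true"/>
<copyField source="content" dest="text"/>

curl "http://localhost:8983/solr/update/extract?literal.id=doc11&fmap.content=content&commit=true" -F myfile=@markus.pdf

A query with fl=id,content should then return the extracted body text.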
Re: Taking a full text, then truncate and duplicate with stopwords
In an attempt to answer my own question: is this a good solution? Before, I was thinking of importing my fulltext description once, then sorting it into two separate fields in Solr, one truncated, one keyword. How about instead actually importing my fulltext description twice? Then I can import it first into truncated_description and then again into keyword_description.
Re: Only exact match searches working
That will match internal substrings in addition to prefix strings. EdgeNGram does only prefix substrings, which is generally what people want. So, NGramFilter would match "England" when the query is "land" or "gland", "gla", etc. Use the Solr Admin Analysis UI to enter text and see how the filter analyzes it, to make sure it is what you expect.

-- Jack Krupansky

-----Original Message----- From: Spadez Sent: Monday, September 17, 2012 7:16 AM To: solr-user@lucene.apache.org Subject: Re: Only exact match searches working
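To illustrate the difference (token lists abridged; minGramSize=3 and maxGramSize=30 as in the earlier message), indexing the word "England" produces roughly:

NGramFilterFactory:     Eng ngl gla lan and Engl ngla glan land ... England
EdgeNGramFilterFactory: Eng Engl Engla Englan England

Only EdgeNGram restricts matches to prefixes of the original term.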
Re: Question about Fuzzy search in Solr
That doc is out of date for 4.0. See the 4.0 Javadoc on FuzzyQuery for updated info. The tilde's right operand is now an integer edit distance (the number of times to insert a char, delete a char, change a char, or transpose two adjacent chars to map an index term to the query term), and it is limited to 2. Be aware that if you use fuzzy query in 3.6/3.6.1 or earlier, it will change when you go to 4.0.

-- Jack Krupansky

-----Original Message----- From: Rafał Kuć Sent: Monday, September 17, 2012 7:15 AM To: solr-user@lucene.apache.org Subject: Re: Question about Fuzzy search in Solr
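In query syntax terms (the field name is an assumption):

q=name:label~0.8   3.x and earlier: fractional minimum similarity, 0.0-1.0
q=name:label~1     4.0: match terms at most one edit away from "label"
q=name:label~2     4.0: the maximum supported edit distance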
Re: Question about Fuzzy search in Solr
Thanks Jack. We are using Solr 3.4.

On Mon, Sep 17, 2012 at 8:18 PM, Jack Krupansky <j...@basetechnology.com> wrote:

> That doc is out of date for 4.0. See the 4.0 Javadoc on FuzzyQuery for updated info.

--
Thanks and Regards
Rahul A. Warawdekar
Re: Taking a full text, then truncate and duplicate with stopwords
--- On Mon, 9/17/12, Spadez <james_will...@hotmail.com> wrote:

> How about instead actually importing my fulltext description twice? Then I can import it first into truncated_description and then again into keyword_description.

Have you used copyField?

<copyField source="keyword_description" dest="truncated_description" maxChars="3000"/>

<field name="truncated_description" indexed="false" stored="true"/>
<field name="keyword_description" indexed="true" stored="false"/>
Installing Tomcat as the user solr?
Can I have some clarification about installing Tomcat as the user solr? See http://wiki.apache.org/solr/SolrTomcat#Installing_Tomcat_6, second paragraph, which states: "Create the solr user. As solr, extract the Tomcat 6.0 download into /opt/tomcat6."

Does this user need a home dir? (I'm guessing no.) Should it have its own private group? If so, is that group a system group with GID < 500? What about a login shell (again, I'm guessing not necessary)?

The documentation doesn't go on to say that you should switch to the solr user account when installing Solr. Sorry if that sounds like a dumb question, but there is no explanation about why Tomcat needs to be installed as solr rather than tomcat or root. Thanks.
Apache solr for Oracle DB
Hi,

I am planning to use Apache Solr for an Oracle DB-based search in our project (in the future we may use some other DB). It's going to be a customer-facing product, and we are using the Spring MVC framework. Could anybody help me with how I can integrate Apache Solr with my project, or suggest a good document?

Thanks
Regards
Vijay
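Since the question is about getting Oracle data into Solr, one common route is the DataImportHandler; a minimal sketch (the driver, connection string, table, and column names are all assumptions):

<!-- data-config.xml, referenced by a /dataimport request handler in solrconfig.xml -->
<dataConfig>
  <dataSource driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//dbhost:1521/ORCL"
              user="solr_reader" password="..."/>
  <document>
    <entity name="item" query="SELECT id, title, description FROM items">
      <field column="ID" name="id"/>
      <field column="TITLE" name="title"/>
      <field column="DESCRIPTION" name="description"/>
    </entity>
  </document>
</dataConfig>

On the application side, SolrJ is the standard Java client and fits a Spring MVC service layer without special integration.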
Re: Installing Tomcat as the user solr?
I probably wouldn't suggest running Tomcat as root because of the principle of least privilege, but aside from that, it's sort of immaterial what you call the account, particularly if you already have a 'tomcat' daemon account set up.

Michael Della Bitta

Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’t a Game

On Mon, Sep 17, 2012 at 11:13 AM, Ken Clarke <k_cla...@perlprogrammer.net> wrote:

> Can I have some clarification about installing Tomcat as the user solr?
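A minimal sketch of creating such a daemon account (the flags are assumptions; check useradd(8) on your SLES release):

groupadd -r solr                                        # system group
useradd -r -g solr -d /opt/tomcat6 -s /bin/false solr   # no login shell
chown -R solr:solr /opt/tomcat6

The account only needs to own what Tomcat writes to (logs, work, temp, and the Solr data directory); it does not need a real home directory or a login shell.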
Re: Taking a full text, then truncate and duplicate with stopwords
Thank you for the reply. The trouble is, I want the truncated description to still have the keywords. If I pass it to keyword_description and remove words like "and", "i", "then", "if", etc., and then copy it across to truncated_description, my truncated description will not be a sentence, it will only be keywords.

*How I want my truncated text to be:*

Several men are in the locker room of a golf club. A cell phone on a bench rings and a man engages the hands-free speaker function and begins to talk. Everyone else...

*How it would be under your scenario:*

Several men locker room golf club cell phone bench rings man engages hands-free speaker function begins talk Everyone else
Re: Only exact match searches working
Ok. I can still define gram size too?

<filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="30"/>
Re: Taking a full text, then truncate and duplicate with stopwords
> The trouble is, I want the truncated description to still have the keywords.

copyField copies the raw text; it has nothing to do with analysis.
RE: Solr 4.0 - Join performance
Hi David,

I see that you committed the work for SOLR-3304 to the 4.x tree, which is great news, thanks. I'm not fully familiar with the process; does that mean it's currently available in the nightly builds?

Eric.

> Date: Wed, 29 Aug 2012 08:44:14 -0700
> From: dsmi...@mitre.org
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 4.0 - Join performance
>
> The solr.GeoHashFieldType is useless; I'd like to see it deprecated, then removed. You'll need to go with unreleased code and apply patches, or wait till Solr 4.
>
> ~ David
>
> On Aug 29, 2012, at 10:53 AM, Eric Khoury [via Lucene] wrote:
>
> Awesome, thanks David. In the meantime, could I potentially use geohash, or something similar? Geohash looks like it supports separate lon or lat range queries, which would help, but it's not a multivalued field, which I need.
>
> Date: Wed, 29 Aug 2012 07:20:42 -0700
> From: [hidden email]
> Subject: Re: Solr 4.0 - Join performance
>
> Solr 4 is certainly the goal. There's a bit of a setback at the moment until some of the Lucene spatial API is re-thought. I'm working heavily on such things this week.
>
> ~ David
>
> On Aug 28, 2012, at 6:22 PM, Eric Khoury [via Lucene] wrote:
>
> David, Solr support for this will come in SOLR-3304, I suppose? http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 Any idea if this is going to make it into Solr 4.0? Thanks, Eric.
>
> Date: Wed, 15 Aug 2012 07:07:21 -0700
> From: [hidden email]
> Subject: RE: Solr 4.0 - Join performance
>
> You would index rectangles of 0 height but that have a left edge 'x' of the start time and a right edge 'x' of your end time. You can index a variable number of these per Solr document and then query by either a point or another rectangle to find documents which intersect your query shape. It can't do a completely-within based query, just intersection for now. I really look forward to seeing this wrapped up in some sort of RangeFieldType so that users don't have to think in spatial terms.
>
> -----
> Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
Re: Taking a full text, then truncate and duplicate with stopwords
Maybe I don't understand, but if you are copying the keyword description field and then truncating it, then the truncated form will only have keywords too. That isn't what I want. I want the truncated form to have words like "a", "the", "it", etc. that would have been removed when added to keyword_description.

<copyField source="keyword_description" dest="truncated_description" maxChars="3000"/>
Re: Only exact match searches working
> Ok. I can still define gram size too?
> <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="30"/>

Yes you can. http://lucene.apache.org/solr/api-3_6_1/org/apache/solr/analysis/EdgeNGramFilterFactory.html
Re: Taking a full text, then truncate and duplicate with stopwords
--- On Mon, 9/17/12, Spadez <james_will...@hotmail.com> wrote:

> Maybe I don't understand, but if you are copying the keyword description field and then truncating it, then the truncated form will only have keywords too.

If you add a document:

<add>
  <doc>
    <field name="keyword_description">Several men are in the locker room of a golf club. A cell phone on a bench rings and a man engages the hands-free speaker function and begins to talk. Everyone else in the room stops to listen. The man hangs up. The other men in the locker room are looking at him in astonishment. Then he smiles and asks: Anyone know whose phone this is???!!!</field>
  </doc>
</add>

you will see that truncated_description will have the joining words (a, the, etc.).
Re: Taking a full text, then truncate and duplicate with stopwords
The only catch here is that copyField might truncate in the middle of a word, yielding an improper term.

-- Jack Krupansky

-----Original Message----- From: Ahmet Arslan Sent: Monday, September 17, 2012 11:54 AM To: solr-user@lucene.apache.org Subject: Re: Taking a full text, then truncate and duplicate with stopwords

> copyField copies the raw text; it has nothing to do with analysis.
Re: Taking a full text, then truncate and duplicate with stopwords
I'm really confused here. I have a document which is, say, 4000 words long. I want to get this put into two fields in Solr without having to save the original document in its entirety within Solr.

When I import my fulltext (4000-word) document into Solr, I was going to put it straight into keyword_document, which uses stopwords to remove words like "and", "it", "this". Now I only have 3000 words, for example. Then if I copy it into truncate_document, even though I can reduce it down to say 100 words, it is lacking words like "and", "it" and "this" because it has been copied from keyword_document.

I want the following scenario:

truncate_document to have 100 words, including words like "and", "it" and "this"
keyword_document to have only the stop words removed

And finally, only have the fulltext document, full length and with all stop words, exist in my SQL database.
Re: Taking a full text, then truncate and duplicate with stopwords
> Then if I copy it into truncate_document, even though I can reduce it down to say 100 words, it is lacking words like "and", "it" and "this" because it has been copied from keyword_document.

That's not true. The copy operation is performed before analysis (stopword removal, lowercasing, etc.). It will copy the raw text of the keyword_document field; it has nothing to do with the analysis of the source field.
Re: Taking a full text, then truncate and duplicate with stopwords
You said "it has been copied from the keyword_document [field]", but the reality is that Solr is not copying from the indexed value of the field, but from the source value for the field. The idea is that multiple fields can be based on the same source value even if they analyze and index the value in different ways.

-- Jack Krupansky

-----Original Message----- From: Spadez Sent: Monday, September 17, 2012 12:29 PM To: solr-user@lucene.apache.org Subject: Re: Taking a full text, then truncate and duplicate with stopwords
Re: Solr 4.0 - Join performance
Yes, absolutely. Since 4.0 hasn't been released, anything with a fix version of 4.0 basically implies trunk as well. Also notice my comment "Committed to trunk & 4x", which is explicit.

~ David

On Sep 17, 2012, at 12:02 PM, Eric Khoury [via Lucene] wrote:

> Hi David, I see that you committed the work for SOLR-3304 to the 4.x tree, which is great news, thanks. I'm not fully familiar with the process; does that mean it's currently available in the nightly builds? Eric.

-----
Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
Re: Stats field with decimal values
Well, my client is asking if it is possible; I'm just providing the search engine to him, not working directly with the application. I don't know exactly what language he is programming in.
Re: Solr Clustering
Sorry for the late response. To be strict, here is what I want:

* I get documents all the time. Let's assume those are news items (it's a rather similar thing).
* Every time I get a new batch of news, I should add them to the Solr index and get cluster information for each document, then store this information in the DB (so I know each document's cluster).
* I can't wait for a cluster-definition service/program to run from time to time; it should define clusters on the fly.
* I want to be able to get clusters only for some period of time (for example, I want to search for clusters only among documents that were loaded one month ago).
* I will have tens of thousands of new documents every day and an overall base of several million.

I'm reading Mahout in Action now, but maybe you can point me to what I need.

--- Original message ---
From: Chandan Tamrakar <chandan.tamra...@nepasoft.com>
To: solr-user@lucene.apache.org
Date: September 4, 2012, 12:30:56
Subject: Re: Solr Clustering

Yes, there is a Solr component if you want to cluster Solr documents; check the following link: http://wiki.apache.org/solr/ClusteringComponent

Carrot2 might be good if you want to cluster a few thousand documents, for example clustering the search results when a user searches Solr. Mahout is much more scalable, and you probably need Hadoop for that.

thanks
chandan

On Tue, Sep 4, 2012 at 2:10 PM, Denis Kuzmenok <forward...@ukr.net> wrote:

> Hi, all. I know there are Carrot2 and Mahout for clustering. I want to implement this: I fetch documents and want to group them into clusters when they are added to the index (I want to filter similar documents, for example for 1 week). I need these documents quickly, so I can't rely on some postponed calculations. Each document should have an assigned cluster id (group similar documents into clusters and assign each document its cluster id). It's something similar to news aggregators like Google News. I don't need to search for clusters with documents older than 1 week (for example). Each document will have its unique id and be saved into the DB, but Solr will have a cluster id field also. Is it possible to implement this with Solr/Carrot2/Mahout?

--
Chandan Tamrakar
Re: Selective field level security
Hi,

Solr doesn't have any built-in mechanism for document/field level security; basically it's delegated to the container to provide security, but this of course won't apply to specific documents and/or fields. There are a lot of ways to skin this cat, some bits of which have been covered by your message.

What can be the trickiest thing about this isn't so much adding indexed fields etc., but rather how you plan to determine who the 'searching user' actually is. This task can seem not too bad at first, then all sorts of worms start streaming out of the can (e.g. how to avoid spoofing/identity theft). Once your app is confident it has a bona fide user, you then need a way to map the user to a set of fields/docs/permissions etc. that he/she can/can't look at. There are plenty of approaches, mainly driven by:

* where your original data lives (outside of Solr? does it still exist? etc.)
* whether there is an external ACL mechanism you can use (e.g. file system permissions)
* how you manage users (e.g. internal employees? public website account holders? anyone?)

Two Jiras of note might help you in your quest:

SOLR-1872 (a good approach if you don't have access to the original data at search time)
SOLR-1895 (uses ManifoldCF; good if you have access to the original data and use its permissions, e.g. file system ACLs)

HTH,
Peter

On Mon, Sep 17, 2012 at 7:44 PM, Nalini Kartha <nalinikar...@gmail.com> wrote:

Hi,

We're trying to push some security-related info into the index which will control which users can search certain fields, and we're wondering what the best way to accomplish this is. Some records that are being indexed and searched can have certain fields marked as private. When a field is marked as private, some querying users should not see/search on it, whereas some super users can. Here are the solutions we're considering:

- Index a separate boolean value into a new _INTERNAL field to indicate whether the corresponding field value is marked private, and include a filter in the query when the searching user is not a super user.

For example, consider that a record can contain 3 fields, field[123], where field1 and field2 can be marked as private but field3 cannot. Record A has only field1 marked as private; record B has both field1 and field2 marked as private. When we index these records, here's what we'd end up with in the index:

Record A - field1:something, field1_INTERNAL:1, field2:something, field2_INTERNAL:0, field3:something
Record B - field1:something, field1_INTERNAL:1, field2:something, field2_INTERNAL:1, field3:something

If the searching user is NOT a super user then the query (let's say it's 'hidden security') needs to look like this:

((field3:hidden) OR (field1:hidden AND field1_INTERNAL:0) OR (field2:hidden AND field2_INTERNAL:0)) AND ((field3:security) OR (field1:security AND field1_INTERNAL:0) OR (field2:security AND field2_INTERNAL:0))

Manipulating the query this way seems painful and error prone, so we're wondering if Solr provides anything out of the box that would help with this?

- Index the private values themselves into a separate _INTERNAL field and then determine which fields to query depending on the visibility of the searching user.
So using the example from above, here's what the indexed records would look like:

Record A - field1_INTERNAL:something, field2:something, field3:something
Record B - field1_INTERNAL:something, field2_INTERNAL:something, field3:something

If the searching user is NOT a super user then the query just needs to go against the regular fields, whereas if the searching user IS a super user, the query needs to go against BOTH the regular and INTERNAL fields. The issue with this solution is that since the number of docs that include the INTERNAL fields is going to be much smaller, we're wondering if relevancy would be messed up when we're querying both regular and internal fields for super users?

Thoughts?

Thanks,
Nalini
RE: Selective field level security
Hi Nalini,

We had similar requirements, and this is how we did it (using your example):

Record A:
Field1_All: something
Field1_Private: something
Field2_All: ''
Field2_Private: something private
Field3_All: ''
Field3_Private: something very private
Fields_All: something
Fields_Private: something something private something very private

Basically, we're just using a lot of copy fields and dynamic fields. Instead of storing a type, we just change the column name. So for someone who had access to private fields, we would perform our search in the private column fields:

(fields_private:something)

Or if you want a specific field:

(field1_private:something) OR (field2_private:something) OR (field3_private:something)

Likewise, if someone didn't have access to the private fields, we would only search in the _All fields. We also created a super field so that we don't have to search each individual field: we use copyFields to copy all private fields into the super field and just search that.

I hope this helps.
Swati

-----Original Message-----
From: Nalini Kartha [mailto:nalinikar...@gmail.com]
Sent: Monday, September 17, 2012 2:45 PM
To: solr-user@lucene.apache.org
Subject: Selective field level security
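A minimal schema sketch of the naming scheme described above (the text_general type and exact attribute choices are assumptions):

<dynamicField name="*_all" type="text_general" indexed="true" stored="false"/>
<dynamicField name="*_private" type="text_general" indexed="true" stored="false"/>
<field name="fields_all" type="text_general" indexed="true" stored="false" multiValued="true"/>
<field name="fields_private" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="*_all" dest="fields_all"/>
<copyField source="*_private" dest="fields_private"/>

At query time the application then picks the field set by privilege, e.g. with edismax: qf=fields_all for normal users, and qf=fields_all fields_private for super users.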
FilterCache Memory consumption high
I've looked through documentation and postings and expect that a single filter cache entry should be approximately maxDoc/8 bytes. Our frequently updated index (replication every 3 minutes) has maxDoc ~= 23 million, so I'm figuring 3 MB per entry. With the cache size set to 512, I expect something like 1.5 GB of RAM, but with the server in steady state after half an hour, it is 7 GB larger than without the cache. I can understand maybe a 2x difference, given the warming searcher, but 4x I don't understand. I do have maxWarmingSearchers = 2, but have never seen 2 searchers simultaneously being warmed. Ideas, anybody?
Help with slow Solr Cloud query
Hi,

I've got a setup as follows:

- 13 cores
- 2 servers
- running Solr 4.0 Beta with numShards=1 and an embedded ZooKeeper

I'm trying to figure out why some complex queries are running so slowly in this setup versus quickly in standalone mode. Given a query like /select?q=(some complex query), it runs fast and gets faster (caches) when only running one server:

1. ?fl=*&q=(complex query)&wt=json&rows=24 (QTime 3)

When I issue the same query to the cluster and watch the logs, it looks like it's actually performing the query 3 times, like so:

1. ?q=(complex query)&distrib=false&wt=javabin&rows=24&version=2&NOW=1347911018556&shard.url=(server1)|(server2)&fl=id,score&df=text&start=0&isShard=true&fsv=true (QTime 2)
2. ?ids=(ids from query 1)&distrib=false&wt=javabin&rows=24&version=2&df=text&fl=*&shard.url=(server1)|(server2)&NOW=1347911018556&start=0&q=(complex query)&isShard=true (QTime 4)
3. ?fl=*&q=(complex query)&wt=json&rows=24 (QTime 459)

Why is it performing #3? It already has everything it needs in #2, and #3 seems to be really slow even when warmed and cached. As stated above, this query is fast when running on a single server that is warmed and cached. Since my query is complex, I could understand some slowness if I was attempting this across multiple shards, but since there's only one shard, shouldn't it just pick one server and query it?

Thanks!
Jim
RE: Stats field with decimal values
You can use an XSLT response writer to transform your values to a different precision: http://wiki.apache.org/solr/XsltResponseWriter

It would most likely be better for your client to just do it on his end, though; he is probably parsing the response anyway.

-----Original Message-----
From: Gustav [mailto:xbihy...@sharklasers.com]
Sent: Monday, September 17, 2012 1:10 PM
To: solr-user@lucene.apache.org
Subject: Re: Stats field with decimal values
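A rough sketch of such a transform (untested; invoked with wt=xslt&tr=stats.xsl, the stylesheet name is an assumption), using standard XSLT 1.0 format-number():

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- render min/max doubles with two decimal places -->
  <xsl:template match="double[@name='min' or @name='max']">
    <double name="{@name}"><xsl:value-of select="format-number(., '0.00')"/></double>
  </xsl:template>
  <!-- identity template: copy everything else through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>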
Re: FilterCache Memory consumption high
On Mon, Sep 17, 2012 at 3:44 PM, Mike Schultz mike.schu...@gmail.com wrote: So I'm figuring 3MB per entry. With CacheSize=512 I expect something like 1.5GB of RAM, but with the server in steady state after 1/2 hour, it is 7GB larger than without the cache. Heap size and memory use aren't quite the same thing. Try running jconsole (it comes with every JDK), attaching to the process, and then make it run multiple garbage collections to see what the heap shrinks down to. -Yonik http://lucidworks.com
Re: Taking a full text, then truncate and duplicate with stopwords
Ah, ok this is news to me and makes a lot more sense. If I can just run this back past you to make sure I understand: if I move my fulltext document from my SQL database to keyword_document, it will contain the original fulltext in the source, but the index will have the stopword filter, lowercase filter etc. applied. Then by copying this to truncated_document, the original source is being moved?

*This is my definition for keyword_description, using the stopwords.txt*

<fieldType name="keyword_description" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="30"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

*Then this to do the copying across. Is there somewhere specific to put this within the schema.xml?*

<copyField source="keyword_description" dest="truncated_description" maxChars="3000"/>

*Then do I need to have definitions for the truncated description in the same way that I did for keyword_description?*

<fieldType name="truncated_description" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="30"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Jack Krupansky-2 wrote: You said it has been copied from the keyword_document [field], but the reality is that Solr is not copying from the indexed value of the field, but from the source value for the field. The idea is that multiple fields can be based on the same source value even if they analyze and index the value in different ways. -- Jack Krupansky -Original Message- From: Spadez Sent: Monday, September 17, 2012 12:29 PM To: solr-user@.apache Subject: Re: Taking a full text, then truncate and duplicate with stopwords I'm really confused here. I have a document which is, say, 4000 words long. I want to get this put into two fields in Solr without having to save the original document in its entirety within Solr. When I import my fulltext (4000 word) document to Solr I was going to put it straight into keyword_document, which uses stopwords to remove words like "and", "it", "this". Now I only have 3000 words, for example. Then if I do the copy command to move it into truncate_document, then even though I can reduce it down to say 100 words, it is lacking words like "and", "it" and "this" because it has been copied from the keyword_document. I want the following scenario: truncate_document to have 100 words, including words like "and", "it" and "this"; keyword_document to have only stop words removed; and finally, only have the fulltext document, full length and with all stop words, exist in my SQL database. -- View this message in context: http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269p4008380.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Taking a full text, then truncate and duplicate with stopwords
You're getting the hang of it. No particular location for copyField, just not within fields or types. Putting them after your fields makes sense. See the Solr example schema. -- Jack Krupansky -Original Message- From: Spadez Sent: Monday, September 17, 2012 4:47 PM To: solr-user@lucene.apache.org Subject: Re: Taking a full text, then truncate and duplicate with stopwords
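Putting the pieces together, the field definitions and copyField might look like the sketch below; the stored flags are assumptions based on the stated goal of keeping the full text only in the SQL database (and note that maxChars truncates to the first 3000 characters, not words):

<field name="keyword_description" type="keyword_description" indexed="true" stored="false"/>
<field name="truncated_description" type="truncated_description" indexed="true" stored="true"/>

<copyField source="keyword_description" dest="truncated_description" maxChars="3000"/>

Because copyField copies the original source value (before analysis), truncated_description will still contain words like "and", "it" and "this" even though keyword_description's index does not.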
Re: Installing Tomcat as the user solr?
Ok, I'll try running as tomcat. The wiki has a problem with the Tomcat startup script. It looks like it's supposed to be a link which allows us to download a shell script, but when I click it, I get the error message "You are not allowed to do AttachFile on this page. Login and try again." The link I'm talking about is 1 line above http://wiki.apache.org/solr/SolrTomcat?action=AttachFile&do=view&target=tomcat6#Building_Solr - Original Message - From: Michael Della Bitta michael.della.bi...@appinions.com To: solr-user@lucene.apache.org Sent: Monday, September 17, 2012 12:32 PM Subject: Re: Installing Tomcat as the user solr? I probably wouldn't suggest running Tomcat as root because of the principle of least privilege, but aside from that, it's sort of immaterial what you call the account, particularly if you already have a 'tomcat' daemon account set up. Michael Della Bitta Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017 www.appinions.com Where Influence Isn't a Game On Mon, Sep 17, 2012 at 11:13 AM, Ken Clarke k_cla...@perlprogrammer.net wrote: Can I have some clarification about installing Tomcat as the user solr? See http://wiki.apache.org/solr/SolrTomcat#Installing_Tomcat_6, second paragraph, which states "Create the solr user. As solr, extract the Tomcat 6.0 download into /opt/tomcat6." Does this user need a home dir? (I'm guessing no.) Should it have its own private group? If so, is that group a system group with GID 500? What about a login shell (again, I'm guessing not necessary)? The documentation doesn't go on to say that you should switch to the solr user account when installing Solr. Sorry if that sounds like a dumb question, but there is no explanation of why Tomcat needs to be installed as solr rather than tomcat or root. Thanks.
broken links in solr wiki
Hi group, On this wiki page the two links below are broken, as they also are on lucidworks' version; can someone point me at the correct locations please? I googled around and came up with possible good links. Thanks, Robi http://wiki.apache.org/solr/LanguageAnalysis#Other_Tips http://lucidworks.lucidimagination.com/display/solr/Language+Analysis solr.KeywordMarkerFilterFactory "A sample Solr protwords.txt with comments (http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/conf/protwords.txt) can be found in the Source Repository." Is this it? http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/collection1/conf/protwords.txt solr.StemmerOverrideFilterFactory "A sample stemdict.txt with comments (http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/conf/stemdict.txt) can be found in the Source Repository." Is this it? https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/conf/stemdict.txt?p=1227271 (needs the ?p= parameter???)
Re: broken links in solr wiki
Hi Robert, Anyone can edit the wiki, you just need to create a user. Regarding the URLs: http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/collection1/conf/stemdict.txt http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/collection1/conf/protwords.txt --- On Tue, 9/18/12, Petersen, Robert rober...@buy.com wrote: From: Petersen, Robert rober...@buy.com Subject: broken links in solr wiki To: solr-user@lucene.apache.org Date: Tuesday, September 18, 2012, 2:58 AM
Personalized Boosting
Hello All, I have a requirement, or a pre-requirement, for our search application. Basically the engine will be on a website with plenty of users and more than 20 different fields, including location. So basically, the question is this: is it possible to let users define their position in the search results when location is queried? Let's say that I am UserA and my default ranking for a search on Moscow is 258. By clicking a button, something like "Boost Me!", I would like to see UserA as the first user when the search is done with the Moscow query. Is this possible? I have some ideas (like adding the person's location to their location field something like 10 times, so it will score the highest, and so on) but I am not sure if the requirement is hard or easy to implement, or whether it will require a plugin rather than config changes... anyone have any ideas? - Zeki ama calismiyor... Calissa yapar... -- View this message in context: http://lucene.472066.n3.nabble.com/Personalized-Boosting-tp4008495.html Sent from the Solr - User mailing list archive at Nabble.com.
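One out-of-the-box feature worth a look is the QueryElevationComponent, which pins chosen documents to the top of the results for a given query text. A sketch of elevate.xml, where the document id is a made-up placeholder for UserA's record:

<elevate>
  <query text="Moscow">
    <doc id="userA-doc-id"/>
  </query>
</elevate>

The elevation file is static per query text, so a "Boost Me!" button would have to maintain it; a more dynamic alternative is adding a boost query for the clicked user at request time, e.g. bq=id:userA-doc-id^100 with the dismax/edismax parsers.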
In multi-core, special dataDir is not used?
Hi, I am using Solr 3.6.1. I created a new core whatever3 dynamically, and I see solr.xml updated as:

<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    ...
    <core name="whatever3" instanceDir="C:/lucene/Solr_3.6/apache-solr-3.6.1/example/mysolr" dataDir="C:/lucene/Solr_3.6/apache-solr-3.6.1/example/mysolr/data/whatever3"/>
  </cores>
</solr>

But when I update data via http://localhost:8080/solr/whatever3/update?commit=true, the data did not go to the newly specified dataDir (I can see that core whatever3 is apparently used, from the log). The only way to make it work is NOT to define dataDir in solrconfig.xml. Is this by design, or did I miss something? Thanks very much for any help, Lisheng
Re: In multi-core, special dataDir is not used?
: But when I update data via http://localhost:8080/solr/whatever3/update?commit=true, the data : did not go to the newly specified dataDir (I can see that core whatever3 is apparently used, from the log)? : : The only way to make it work is NOT to define dataDir in solrconfig.xml. Is this by design, or did I : miss something? I can't reproduce the problem you are seeing -- can you please provide more details: 1) what does your full solr.xml file look like after creating the new core? 2) what does the CoreAdminHandler (ie: probably /admin/cores - depends on your solr.xml file) return after you create the new core? 3) what does the <dataDir> section in the solrconfig.xml you are using when creating the whatever3 core look like? -Hoss
Re: In multi-core, special dataDir is not used?
: I can't reproduce the problem you are seeing -- can you please provide : more details. Correction: I can reproduce this. This was in fact some odd behavior in the 1.x and 3.x lines that has been changed for 4.x in SOLR-1897. If you had no dataDir in your solrconfig.xml, or if you had a *blank* <dataDir></dataDir>, then prior to 4.x the dataDir option specified when CREATEing a core would override the default -- but if you had any real path specified, then it would trump anything specified at runtime. The workaround, I believe (but I haven't tested exhaustively), for 3.4-3.6.1 is not to specify a hardcoded dataDir in your solrconfig.xml, but instead to specify a property with a default value for the dataDir, and then use that property when issuing the CREATE command, ie...

<dataDir>${yourPropertyName:/some/default/path}</dataDir>

?action=CREATE&name=yourCoreName&instanceDir=yourCoreDir&property.yourPropertyName=/override/path

-Hoss
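Applied to the whatever3 core from the original post, that would look something like the following (the property name dataDirProp is arbitrary): in solrconfig.xml,

<dataDir>${dataDirProp:data}</dataDir>

and then:

http://localhost:8080/solr/admin/cores?action=CREATE&name=whatever3&instanceDir=C:/lucene/Solr_3.6/apache-solr-3.6.1/example/mysolr&property.dataDirProp=C:/lucene/Solr_3.6/apache-solr-3.6.1/example/mysolr/data/whatever3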
RE: In multi-core, special dataDir is not used?
Thanks very much for your quick guidance, which is very helpful! Lisheng -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Monday, September 17, 2012 6:30 PM To: solr-user@lucene.apache.org Subject: Re: In multi-core, special dataDir is not used?
Re: Selective field level security
There is another option: pairing multi-valued roles and fields. Multi-valued fields support in-order return: the values are returned in the same order you added them. This means that you can have two fields with matched pairs of values. Secure data often has a many-to-many relationship, where any user can see some documents and some documents are visible to more than one user. In this case, the above multi-valued array trick might help. You would have to repeat data for every role. - Original Message - | From: Swati Swoboda sswob...@igloosoftware.com | To: solr-user@lucene.apache.org | Sent: Monday, September 17, 2012 12:33:48 PM | Subject: RE: Selective field level security
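A sketch of the matched-pair idea: two parallel multi-valued fields where position i in one field corresponds to position i in the other (the field names and types here are illustrative):

<field name="acl_role" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="acl_value" type="text_general" indexed="true" stored="true" multiValued="true"/>

A document would then carry pairs like:

<doc>
  <field name="acl_role">all</field>
  <field name="acl_value">something</field>
  <field name="acl_role">private</field>
  <field name="acl_value">something private</field>
</doc>

Note that Solr does not join the two fields by position at query time; the in-order guarantee applies to the stored values returned with a document, so the pairing is reconstructed client-side.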
RE: highlighting of text field in Japanese
I am using the following definitions and query, and want to highlight the title and body elements of HTML documents.

FieldType definitions:
=
<fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.GosenTokenizerFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_ja_bigram" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>
=

Field definitions:
=
<field name="title" type="text_ja" indexed="true" stored="true" multiValued="true"/>
<field name="body" type="text_ja" indexed="true" stored="true" multiValued="true"/>
<field name="title_bigram" type="text_ja_bigram" indexed="true" stored="true" multiValued="true"/>
<field name="body_bigram" type="text_ja_bigram" indexed="true" stored="true" multiValued="true"/>
<copyField source="title" dest="title_bigram"/>
<copyField source="body" dest="body_bigram"/>
=

Query:
=
q=foo&defType=edismax&qf=title^2+title_bigram^2+body+body_bigram
=

If I set the hl.fl of the highlight component to title and body, the bigram (CJKTokenizer) matches cannot be highlighted. Regards, Qiao HU -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Saturday, September 15, 2012 10:06 PM To: solr-user@lucene.apache.org Subject: Re: highlighting of text field in Japanese I'm not quite sure I follow (and I know nothing about how the highlighter works with Japanese). But you don't highlight fieldTypes, you highlight individual fields, and it's just a comma (or space) separated list. You can set these either on the URL or in the solrconfig.xml file for your particular request handler; see the /browse handler for an example. If that doesn't help, show us the field definitions and what you've tried. Best, Erick On Fri, Sep 14, 2012 at 2:18 AM, chau...@sunmoretec.co.jp wrote: Hi, I am very new to Solr. I am using edismax, combining two fieldTypes (CJKTokenizer and GosenTokenizer) to query Japanese text, but I do not know how to set the hl.fl of the highlight component for the two fieldTypes with the same contents. Could you offer me some advice please? Regards, Qiao HU
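Since the bigram fields are stored (see the field definitions above), a first thing to try might be simply listing them in hl.fl and restricting highlighting to fields that actually matched; a sketch of such a query:

q=foo&defType=edismax&qf=title^2+title_bigram^2+body+body_bigram&hl=true&hl.fl=title,body,title_bigram,body_bigram&hl.requireFieldMatch=true

hl.requireFieldMatch=true keeps a field from being highlighted unless the query matched that specific field, which matters here because the same source text is indexed into both the morphological and the bigram fields.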
Re: Logging from data-config.xml
I have the same error. Can you guide me on how to solve it? My id: bhavesh.jogi...@gmail.com -- View this message in context: http://lucene.472066.n3.nabble.com/Logging-from-data-config-xml-tp3956009p4008540.html Sent from the Solr - User mailing list archive at Nabble.com.