Re: deploy a brand new index in solrcloud
I've thought about setting up replication in SolrCloud: http://www.searchworkings.org/forum/-/message_boards/view_message/339527#_19_message_339527

What I don't know is whether, while replication is in progress, the replicas (those that are not the master in replication) can keep handling puts via the transaction log.
Re: How to do custom sorting in Solr?
Hi All,

I have an index which contains a catalog of products and categories, with Solr 4.0 from trunk.

Data is organized like this:

Category: Books

  Sub Category: Programming
  Products:
    Product # 1, Price: Regular, Sort Order: 1
    Product # 2, Price: Markdown, Sort Order: 2
    Product # 3, Price: Regular, Sort Order: 3
    Product # 4, Price: Regular, Sort Order: 4
    ...
    Product # 100, Price: Regular, Sort Order: 100

  Sub Category: Fiction
  Products:
    Product # 1, Price: Markdown, Sort Order: 1
    Product # 2, Price: Regular, Sort Order: 2
    Product # 3, Price: Regular, Sort Order: 3
    Product # 4, Price: Markdown, Sort Order: 4
    ...
    Product # 70, Price: Regular, Sort Order: 70

I want to query Solr and sort these products within each sub-category in such a way that products which are on markdown are at the bottom of the document list, and the other products, which are at regular price, are sorted as per their sort order in their sub-category.

Expected results are:

Category: Books

  Sub Category: Programming
  Products:
    Product # 1, Price: Regular, Sort Order: 1
    Product # 2, Price: Markdown, Sort Order: 101
    Product # 3, Price: Regular, Sort Order: 3
    Product # 4, Price: Regular, Sort Order: 4
    ...
    Product # 100, Price: Regular, Sort Order: 100

  Sub Category: Fiction
  Products:
    Product # 1, Price: Markdown, Sort Order: 71
    Product # 2, Price: Regular, Sort Order: 2
    Product # 3, Price: Regular, Sort Order: 3
    Product # 4, Price: Markdown, Sort Order: 71
    ...
    Product # 70, Price: Regular, Sort Order: 70

My query is like this:

  q=*:*&fq=category:Books

What are the options to implement custom sorting, and how do I do it?

- Define a custom function query?
- Define a custom comparator? Or,
- Define a custom collector?

Please let me know the best way to go about it, and any pointers on customizing Solr 4.

Thanks
Saroj
x most similar documents
Hi there,

I have a Solr server running that contains tweets. My schema.xml contains the following fields:

  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true" />
    <field name="tweet" type="text_general" indexed="true" stored="true" termVectors="true"/>
    <field name="hashtags" type="text_general" indexed="true" stored="true" termVectors="true"/>
  </fields>

My problem is actually quite simple: somewhere in my GUI the user types text, and I want to retrieve the tweets that are most similar to it. Therefore, I tried the MoreLikeThis functionality. My problem is that currently MLT finds additional tweets for every tweet found by the select handler. I'm not sure, however, whether the select handler finds the best-fitting tweet or just returns the first match.

Currently, I am using the following query:

  http://localhost:8983/solr/select/?q=tweet:heaven&mlt=true&mlt.fl=tweet,hashtags&wt=json&indent=true

Am I missing something critical? Eventually, I just want to retrieve x tweets with the most similar text, sorted by their similarity (cosine of termVectors). Is MoreLikeThis the way to go?

Thanks in advance!
Re: How to do custom sorting in Solr?
Skimming this, two options come to mind:

1. Simply apply primary, secondary, etc. sorts. Something like

   sort=subcategory asc,markdown_or_regular desc,sort_order asc

2. You could also use grouping to arrange things in groups and sort within those groups. This has the advantage of returning some members of each of the top N groups in the result set, which makes it easier to get some of each group rather than having to analyze the whole list.

But your example is somewhat contradictory. You say "products which are on markdown, are at the bottom of the documents list", but in your examples, products on markdown are intermingled.

Best
Erick

On Sun, Jun 10, 2012 at 3:36 AM, roz dev rozde...@gmail.com wrote:
> Hi All
> I have an index which contains a Catalog of Products and Categories, with Solr 4.0 from trunk
> [...]
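The first option in Erick's reply, a plain multi-field sort, can be sketched as a request URL. This is only an illustration: the field names `subcategory`, `markdown_or_regular`, and `sort_order` are the placeholders used in the reply, the host and handler path are assumptions, and `desc` on the price-type field only puts regular items first if its values happen to sort that way.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust to your own Solr host and core.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_sorted_query(category: str) -> str:
    """Build a select URL that sorts by sub-category, price type, then sort order."""
    params = [
        ("q", "*:*"),
        ("fq", f"category:{category}"),
        # Placeholder field names taken from the reply, not from a real schema.
        ("sort", "subcategory asc,markdown_or_regular desc,sort_order asc"),
    ]
    return f"{SOLR_SELECT}?{urlencode(params)}"


print(build_sorted_query("Books"))
```

Passing the parameters through `urlencode` also avoids the missing `&` separators that are easy to introduce when concatenating the query string by hand.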
Re: x most similar documents
Yes, it sounds like MLT is the way to go, but sometimes you have to get creative in figuring out how to set the numerous parameters. And sometimes you have to use the MLT request handler rather than /select with the MLT component.

You might also encounter issues related to the shortness of the text of tweets. Some of the MLT parameters might be optimized for much larger texts.

Can you give us an example of a (very brief) tweet that your query finds, the tweet(s) that MLT returns, and what other tweet(s) you would have expected?

MLT will use the first search result from the original query.

-- Jack Krupansky

-----Original Message-----
From: Benjamin Murauer
Sent: Sunday, June 10, 2012 7:32 AM
To: solr-user@lucene.apache.org
Subject: x most similar documents

Hi there, i have a solr server running containing tweets.
[...]
Re: x most similar documents
Oops, I said "MLT will use the first search result from the original query", but that is for the MLT handler. For the MLT component you get a separate set of documents for each document in the results of the original query.

-- Jack Krupansky

-----Original Message-----
From: Jack Krupansky
Sent: Sunday, June 10, 2012 1:25 PM
To: solr-user@lucene.apache.org
Subject: Re: x most similar documents

Yes, it sounds like MLT is the way to go, but sometimes you have to get creative in figuring out how to set the numerous parameters.
[...]
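The distinction Jack draws (the MLT handler seeds from the first match of the query, while the MLT component returns a separate list per result) suggests trying the handler directly. A minimal sketch of such a request follows; it assumes an MLT request handler has been registered at `/mlt` in solrconfig.xml, and the lowered `mlt.mintf` and `mlt.mindf` values are only guesses for short texts such as tweets, not recommended settings.

```python
from urllib.parse import urlencode

# Assumption: a MoreLikeThis request handler is configured at this path.
MLT_HANDLER = "http://localhost:8983/solr/mlt"


def build_mlt_query(seed_query: str, rows: int = 10) -> str:
    """Build an MLT handler URL; the first match of seed_query is the seed document."""
    params = [
        ("q", seed_query),
        ("mlt.fl", "tweet,hashtags"),  # fields from the schema in the question
        ("mlt.mintf", "1"),            # guess: tweets rarely repeat a term
        ("mlt.mindf", "1"),            # guess: allow rare terms to count
        ("rows", str(rows)),           # the "x" most similar documents
        ("wt", "json"),
        ("indent", "true"),
    ]
    return f"{MLT_HANDLER}?{urlencode(params)}"


print(build_mlt_query("tweet:heaven", rows=5))
```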
Building a heat map from geo data in index
I had a request from a customer that I have not seen much like before, so I figured I'd pose the question here.

I've been asked whether it is possible to build a heat map from the results of a query. I can imagine a process to do this through some post-processing, but that sounds very expensive for large or distributed indices, so I was wondering whether, with all of the new geospatial support being added to Lucene/Solr, there is a way to do geospatial faceting.

What I am imagining is a bounding box being defined and that box being broken into an N by N matrix, with each cell returning a count so that a heat map could be constructed.

Any other thoughts on this would be greatly appreciated; right now I am really just fishing for ideas.
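One way to picture the N by N idea is to generate one range query per grid cell and send each as a `facet.query`, so that Solr returns a count per cell. The sketch below only builds the query strings. It assumes separate numeric fields named `lat` and `lon` (hypothetical names), and it is not a statement about what the new geospatial modules offer; N*N facet queries may well be too slow for a large N.

```python
def grid_facet_queries(min_lat, min_lon, max_lat, max_lon, n,
                       lat_field="lat", lon_field="lon"):
    """Return one range query per cell of an n-by-n grid over the bounding box.

    Ranges are inclusive on both ends, so a point lying exactly on a shared
    cell edge would be counted in both neighbouring cells.
    """
    lat_step = (max_lat - min_lat) / n
    lon_step = (max_lon - min_lon) / n
    queries = []
    for row in range(n):
        for col in range(n):
            lat_lo = min_lat + row * lat_step
            lon_lo = min_lon + col * lon_step
            queries.append(
                f"{lat_field}:[{lat_lo} TO {lat_lo + lat_step}] AND "
                f"{lon_field}:[{lon_lo} TO {lon_lo + lon_step}]"
            )
    return queries


for q in grid_facet_queries(0, 0, 10, 10, 2):
    print(q)
```

Each returned string would go into its own `facet.query` parameter alongside `facet=true`, and the counts would come back keyed by the query text.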
Re: How to do custom sorting in Solr?
Thanks Erick for your quick feedback.

When products are assigned to a category or sub-category, they can be in any order, and the price type can be regular or markdown. So regular and markdown products are intermingled as per their assignment, but I want to sort them in such a way that all the products which are on markdown end up at the bottom of the list.

I can use these multiple sorts, but I realize that they are costly in terms of heap used, as they use FieldCache. I have an index with 2M docs, and the docs are pretty big. So I don't want to use them unless there is no other option.

I am wondering if I can define a custom function query which would work like this:

- check if the product is on markdown
- if yes, then change its sort order field to be the max value in the given sub-category, say 99
- else, use the sort order of the product in the sub-category

I have been looking at existing function queries but do not have a good handle on how to make one of my own.

Another option could be to use a custom sort comparator, but I am not sure about the way it works.

Any thoughts?

-Saroj

On Sun, Jun 10, 2012 at 5:02 AM, Erick Erickson erickerick...@gmail.com wrote:
> Skimming this, two options come to mind:
> [...]
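The function-query idea in this message (markdown items take a large placeholder rank, everything else keeps its own sort order) boils down to a sort key. The sketch below expresses that key in plain Python over in-memory records so the intended ordering can be checked; it is not Solr code, and the `Product` type and its field names are hypothetical. Note that the placeholder has to exceed every real sort order in the sub-category, so the "say 99" from the message would be too small for a sub-category with 100 products.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    subcategory: str
    on_markdown: bool
    sort_order: int


# Placeholder rank for markdown items; must be larger than any real sort order.
MARKDOWN_RANK = 10_000


def effective_sort_key(p: Product):
    """Markdown items sink to the bottom of their sub-category."""
    rank = MARKDOWN_RANK if p.on_markdown else p.sort_order
    # The original sort order breaks ties among markdown items.
    return (p.subcategory, rank, p.sort_order)


fiction = [
    Product("Product # 1", "Fiction", True, 1),
    Product("Product # 2", "Fiction", False, 2),
    Product("Product # 3", "Fiction", False, 3),
    Product("Product # 4", "Fiction", True, 4),
]
for p in sorted(fiction, key=effective_sort_key):
    print(p.name, "markdown" if p.on_markdown else "regular")
```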
Re: How to do custom sorting in Solr?
2M docs is actually pretty small. Sorting is sensitive to the number of _unique_ values in the sort fields, not necessarily the number of documents.

And sorting only works on fields with a single value (i.e. the field can't have more than one token after analysis). So for each field you're only talking about 2M values at the very maximum, assuming that the field in question has a unique value per document, which I doubt very much given your problem description.

So with a corpus that size, I'd just try it.

Best
Erick

On Sun, Jun 10, 2012 at 7:12 PM, roz dev rozde...@gmail.com wrote:
> Thanks Erick for your quick feedback
> [...]
Re: How to do custom sorting in Solr?
Yes, these documents have lots of unique values, as the same product could be assigned to lots of other categories, and in a different sort order in each.

We did some evaluation of heap usage and found that, with the kind of queries we generate, heap usage was going up to 24-26 GB. I could trace it to the fact that FieldCache is creating an array of 2M entries for each of the sort fields.

Since the same products are mapped to multiple categories, we incur significant memory overhead. Therefore, any solution where memory consumption can be reduced is a good one for me.

In fact, we have situations where the same product is mapped to more than one sub-category in the same category, like:

Books
  -- Programming
     - Java in a nutshell
  -- Sale (40% off)
     - Java in a nutshell

So another thought in my mind is to somehow use a second-pass collector to group books appropriately in the Programming and Sale categories, with the right sort order.

But I have no clue about that piece :(

-Saroj

On Sun, Jun 10, 2012 at 4:30 PM, Erick Erickson erickerick...@gmail.com wrote:
> 2M docs is actually pretty small.
> [...]
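The observation that FieldCache allocates one array entry per document for each sort field gives a quick back-of-the-envelope estimate. The sketch below assumes 4 bytes per entry (an int-valued sort field) and treats the number of sort fields as a free parameter; string sort fields also hold term data, which this ignores, so read the result as a rough lower bound rather than a measurement.

```python
def fieldcache_estimate_bytes(num_docs: int, num_sort_fields: int,
                              bytes_per_entry: int = 4) -> int:
    """Rough size of the per-document arrays: one entry per doc per sort field."""
    return num_docs * num_sort_fields * bytes_per_entry


# 2M docs as in the thread; the field counts are made-up illustrations.
for fields in (3, 100, 1000):
    size = fieldcache_estimate_bytes(2_000_000, fields)
    print(f"{fields:>5} sort fields: {size / (1024 ** 2):.1f} MiB")
```

The estimate grows linearly with the number of distinct sort fields, so a design that uses one sort-order field per category would scale very differently from one that uses a handful of shared fields.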
Issues with whitespace tokenization in QueryParser
According to https://issues.apache.org/jira/browse/LUCENE-2605, the Lucene QueryParser tokenizes on whitespace before giving any text to the Analyzer. This makes it impossible to use multi-term synonyms, because the SynonymFilter only receives one word at a time.

A resolution to this would really help with my current project. My client sells clothing and accessories online. They have plenty of examples of compound words, e.g. "rain coat", but some of these compound words are really tripping them up. A prime example is that a search for "dress shoes" returns a list of dresses and random shoes (not necessarily dress shoes). I wish I were able to map compound words to single tokens via synonyms (e.g. dress shoes = dress_shoes), but with this whitespace tokenization issue, it's impossible.

Has anything happened with this bug recently? For a short time I've got a client that would be willing to pay for this issue to be fixed, if it's not too much of a rabbit hole. Anyone care to catch me up on what this might entail?

LinkedIn: http://www.linkedin.com/pub/john-berryman/13/b17/864
Twitter: http://twitter.com/#!/jnbrymn
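The effect described here can be reproduced with a toy model. The sketch below is not Lucene code: `analyze` is a made-up stand-in for an analyzer with a two-word synonym rule, and the two `parse_*` functions differ only in whether the query is split on whitespace before the analyzer sees it.

```python
# Toy synonym table: two-word phrases collapsed into a single token.
SYNONYMS = {("dress", "shoes"): "dress_shoes", ("rain", "coat"): "rain_coat"}


def analyze(text: str) -> list:
    """Toy analyzer: lowercase, tokenize, then merge known two-word phrases."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in SYNONYMS:
            out.append(SYNONYMS[pair])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def parse_whole(query: str) -> list:
    """The analyzer sees the full query text, so the phrase rule can fire."""
    return analyze(query)


def parse_whitespace_first(query: str) -> list:
    """The query is split first; the analyzer only ever sees one word at a time."""
    result = []
    for chunk in query.split():
        result.extend(analyze(chunk))
    return result


print(parse_whole("dress shoes"))             # ['dress_shoes']
print(parse_whitespace_first("dress shoes"))  # ['dress', 'shoes']
```

In the second case the synonym rule never sees both words together, which matches the report that the search returns dresses and unrelated shoes.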
Re: What would cause: SEVERE: java.lang.ClassCastException: com.company.MyCustomTokenizerFactory cannot be cast to org.apache.solr.analysis.TokenizerFactory
Jack,

Thanks - this was indeed the issue. I still don't understand exactly why (the same local-nexus-hosted Solr jars were the ones being duplicated on the classpath: included in my custom -with-dependencies jars as well as in the Solr war, which was built, distributed, and hosted from the same nexus repo used to host my jars), but shading solr from my -with-dependencies jars fixed the issue.

(If anybody could point me to reading on why this happened, it would be appreciated. The classes on the classpath would be duplicated but identical; in my naive understanding of the classloader this should have still just worked.)

Thanks again,
Aaron

On Sat, Jun 9, 2012 at 2:40 PM, Jack Krupansky j...@basetechnology.com wrote:
> Make sure there are no stray jars/classes in your jar, especially any that might contain BaseTokenizerFactory or TokenizerFactory. I notice that your jar name says -with-dependencies, raising a little suspicion. The exception is as if your class was referring to a BaseTokenizerFactory, which implements TokenizerFactory, coming from your jar (or a contained jar) rather than getting resolved to Solr 3.6's own BaseTokenizerFactory and TokenizerFactory.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Aaron Daubman
> Sent: Saturday, June 09, 2012 12:03 AM
> To: solr-user@lucene.apache.org
> Subject: What would cause: SEVERE: java.lang.ClassCastException: com.company.MyCustomTokenizerFactory cannot be cast to org.apache.solr.analysis.TokenizerFactory
>
> Greetings,
>
> I am in the process of updating custom code and schema from Solr 1.4 to 3.6.0 and have run into the following issue with our two custom Tokenizer and Token Filter components.
>
> I've been banging my head against this one for far too long, especially since it must be something obvious I'm missing.
>
> I have custom Tokenizer and Token Filter components along with corresponding factories. The code for all looks very similar to the Tokenizer and TokenFilter (and Factory) code that is standard with 3.6.0 (and I have also read through http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters).
>
> I have ensured my custom code is on the classpath; it is in ENSolrComponents-1.0-SNAPSHOT-jar-with-dependencies.jar:
>
> ---output snip---
> Jun 8, 2012 10:41:00 PM org.apache.solr.core.CoreContainer load
> INFO: loading shared library: /opt/test_artists_solr/jetty-solr/lib/en
> Jun 8, 2012 10:41:00 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader
> INFO: Adding 'file:/opt/test_artists_solr/jetty-solr/lib/en/ENSolrComponents-1.0-SNAPSHOT-jar-with-dependencies.jar' to classloader
> Jun 8, 2012 10:41:00 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader
> INFO: Adding 'file:/opt/test_artists_solr/jetty-solr/lib/en/ENUtil-1.0-SNAPSHOT-jar-with-dependencies.jar' to classloader
> Jun 8, 2012 10:41:00 PM org.apache.solr.core.CoreContainer create
> ---snip---
>
> After successfully parsing the schema and creating many fields, etc., the following is logged:
>
> ---snip---
> Jun 8, 2012 10:41:00 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created : com.company.MyCustomTokenizerFactory
> Jun 8, 2012 10:41:00 PM org.apache.solr.common.SolrException log
> SEVERE: java.lang.ClassCastException: com.company.MyCustomTokenizerFactory cannot be cast to org.apache.solr.analysis.TokenizerFactory
>     at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:966)
>     at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:148)
>     at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:986)
>     at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:60)
>     at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:453)
>     at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:433)
>     at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
>     at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:490)
>     at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:123)
>     at org.apache.solr.core.CoreContainer.create(CoreContainer.java:481)
>     at org.apache.solr.core.CoreContainer.load(CoreContainer.java:335)
>     at org.apache.solr.core.CoreContainer.load(CoreContainer.java:219)
>     at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:161)
>     at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:96)
>     at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:102)
>     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>     at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:748)
>     at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:249)
>     at
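On the question of why duplicated but identical classes fail: in the JVM a class is identified by its fully qualified name together with the classloader that defined it, so two byte-identical copies defined by different loaders are different types, and a cast between them fails. The sketch below is only a Python analogy of that behaviour, not JVM code; it defines the same class source twice in separate namespaces and shows that an instance built on one copy is not an instance of the other.

```python
# Identical source, evaluated twice in separate namespaces. This mimics one
# copy of TokenizerFactory in the Solr war and one bundled in the plugin jar.
SOURCE = "class TokenizerFactory:\n    pass\n"


def load_copy():
    """Define the class from SOURCE in a fresh namespace and return it."""
    namespace = {}
    exec(SOURCE, namespace)
    return namespace["TokenizerFactory"]


ServerFactory = load_copy()  # stands in for the copy the server checks against
PluginFactory = load_copy()  # stands in for the copy the plugin was built on


class MyCustomTokenizerFactory(PluginFactory):
    pass


instance = MyCustomTokenizerFactory()
print(ServerFactory.__name__ == PluginFactory.__name__)  # same name
print(isinstance(instance, PluginFactory))               # passes
print(isinstance(instance, ServerFactory))               # fails, like the cast
```

Removing the bundled copy, as the shading fix did, leaves a single definition for both sides to resolve against.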
Re: Correct way to deal with source data that may include a multivalued field that needs to be used for sorting?
Hoss,

The new FieldValueSubsetUpdateProcessorFactory classes look phenomenal. I haven't looked yet, but what are the chances these will be back-ported to 3.6 (or how hard would it be to backport them)? I'll have to check out the source in more detail.

If stuck on 3.6, what would be the best way to deal with this situation? It's currently looking like it will have to be a custom update handler, but I'd hate to have to go down this route if there are more future-proof options.

Thanks again,
Aaron

On Tue, Jun 5, 2012 at 6:53 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
> : The real issue here is that the docs are created externally, and the
> : producer won't (yet) guarantee that fields that should appear once will
> : actually appear once. Because of this, I don't want to declare the field as
> : multiValued=false as I don't want to cause indexing errors. It would be
> : great for me (and apparently many others after searching) if there were an
> : option as simple as forceSingleValued=true - where some deterministic
> : behavior such as use first field encountered, ignore all others, would
> : occur.
>
> This will be trivial in Solr 4.0, using one of the new FieldValueSubsetUpdateProcessorFactory classes that are now available -- just pick your rule...
>
> https://builds.apache.org/view/G-L/view/Lucene/job/Solr-trunk/javadoc/org/apache/solr/update/processor/FieldValueSubsetUpdateProcessorFactory.html
>
> Direct Known Subclasses: FirstFieldValueUpdateProcessorFactory, LastFieldValueUpdateProcessorFactory, MaxFieldValueUpdateProcessorFactory, MinFieldValueUpdateProcessorFactory
>
> -Hoss
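The "use first field encountered, ignore all others" rule from the quoted message is simple enough to apply before documents reach Solr, if there is an indexing step you control. The sketch below is a hypothetical client-side helper in Python, not a backport of the Solr 4.0 classes and not a Solr API; documents are modelled as plain dicts, and the field names are made up.

```python
def keep_first_value(doc: dict, single_valued_fields: set) -> dict:
    """Return a copy of doc where the listed fields keep only their first value.

    Fields not listed are left untouched; a listed field with an empty list
    is dropped rather than sent as an empty value.
    """
    cleaned = {}
    for name, value in doc.items():
        if name in single_valued_fields and isinstance(value, (list, tuple)):
            if value:
                cleaned[name] = value[0]
        else:
            cleaned[name] = value
    return cleaned


raw = {"id": "1", "title": ["First title", "Stray duplicate"], "tags": ["a", "b"]}
print(keep_first_value(raw, {"title"}))
```

Because the rule is deterministic, the sort field can then be declared single-valued without risking indexing errors from the external producer.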