Re: Adding new field after data is already indexed
Check Solr: Add new fields with Default Value for Existing Documents
http://lifelongprogrammer.blogspot.com/2013/06/solr-use-doctransformer-to-change.html

If we only need to search and display the new field, we can do the following steps:

1. Add the new field definition in schema.xml:
   <field name="newField" type="tint" indexed="true" stored="true" default="-1"/>
2. Update the search query: when searching for the default value of this new field, also match documents that have no value at all:
   -(-newField:defaultValue AND newField:[* TO *])
3. Use a DocTransformer to supply the default value when there is no value in that field for old documents.

Some functions may still not work, such as sort and stats.

--
View this message in context: http://lucene.472066.n3.nabble.com/Adding-new-field-after-data-is-already-indexed-tp1862575p4103440.html
Sent from the Solr - User mailing list archive at Nabble.com.
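The pure-negative rewrite in step 2 can be sketched as a small query-building helper (the function name is hypothetical; the default value matches the schema definition above):

```python
def default_or_missing(field, default):
    """Build a Solr clause matching docs whose `field` equals `default`
    OR that have no value at all: NOT(NOT default AND field-has-any-value)."""
    return "-(-{f}:{d} AND {f}:[* TO *])".format(f=field, d=default)

print(default_or_missing("newField", -1))
# -(-newField:-1 AND newField:[* TO *])
```

The double negation is what makes old documents (which have no value in the new field) match alongside documents explicitly indexed with the default.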
The way edismax parses colon seems weird
In our application, users may search for an error code like 12:34. We define the default search fields:

   <str name="qf">title^10 body_stored^8 content^5</str>

So when a user searches 12:34, we want to search for the error code in those fields. If we search q=12:34 directly, this finds nothing. That is expected, since it is parsed as searching for 34 in a field named 12. Then we tried escaping the colon, searching 12\:34; the parsed query is +12\:34, and it still can't find the expected page:

   <str name="parsedquery">(+12\:34)/no_coord</str>
   <str name="parsedquery_toString">+12\:34</str>
   <str name="QParser">ExtendedDismaxQParser</str>

If I type two backslashes, q=12\\:34, it can find the error page:

   <str name="parsedquery">(+DisjunctionMaxQuery((content:"12 34"^0.5 | body_stored:"(12\:34 12) 34"^0.8 | title:"12 34"^1.1)))/no_coord</str>
   <str name="parsedquery_toString">+(content:"12 34"^0.5 | body_stored:"(12\:34 12) 34"^0.8 | title:"12 34"^1.1)</str>
   <str name="QParser">ExtendedDismaxQParser</str>

Is this a bug in Solr edismax or not?

--
View this message in context: http://lucene.472066.n3.nabble.com/The-way-edismax-parses-colon-seems-weird-tp4079226.html
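A common client-side workaround is to backslash-escape the query parser's special characters before sending the user's term. A minimal sketch (the exact character set and the edismax parsing behavior depend on the Solr version, so treat this as an assumption to verify against your deployment):

```python
# Characters treated specially by the Lucene/Solr query parsers.
SPECIAL = set('+-&|!(){}[]^"~*?:\\/')

def escape_solr(term):
    """Backslash-escape query-parser special characters in a user term."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in term)

print(escape_solr("12:34"))   # 12\:34
```

Escaping in the client keeps the raw user input intact for display while the query string sent to Solr is safe for the parser.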
Re: The way edismax parses colon seems weird
Thanks very much for the reply. We are querying Solr directly from the browser:

http://localhost:8080/solr/select?q=12\:34&defType=edismax&debug=query&qf=content

   <str name="rawquerystring">12\:34</str>
   <str name="querystring">12\:34</str>
   <str name="parsedquery">(+12\:34)/no_coord</str>
   <str name="parsedquery_toString">+12\:34</str>
   <str name="QParser">ExtendedDismaxQParser</str>

And it seems this is not related to which (default) field I use to query.

--
View this message in context: http://lucene.472066.n3.nabble.com/The-way-edismax-parses-colon-seems-weird-tp4079226p4079234.html
Re: Difference between IntField and TrieIntField in Lucene 4.0
Thanks very much, Yonik. I should have read the Javadoc of Solr's IntField and TrieIntField. In the Javadoc, IntField is marked as a legacy field type:

   "A legacy numeric field type that encodes Integer values as simple Strings. This class should not be used except by people with existing indexes that contain numeric values indexed as Strings. New schemas should use TrieIntField. Field values will sort numerically, but Range Queries (and other features that rely on numeric ranges) will not work as expected: values will be evaluated in unicode String order, not numeric order."

I remember reading this a few weeks ago; today, discussing with a coworker, we looked at the Javadoc again. It is not what I expected. Thanks again for your prompt reply :)

--
View this message in context: http://lucene.472066.n3.nabble.com/Difference-between-IntField-and-TrieIntField-in-Lucene-4-0-tp4032938p4032953.html
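The string-order pitfall the Javadoc warns about is easy to demonstrate outside Solr: plain lexicographic sorting of stringified integers does not match numeric order, which is exactly why range queries over a string-encoded IntField misbehave:

```python
values = [2, 10, 100, 30]

# Numeric order, what Trie fields give you for range queries:
print(sorted(values))                      # [2, 10, 30, 100]

# Unicode string order, what a legacy string-encoded IntField uses:
print(sorted(str(v) for v in values))      # ['10', '100', '2', '30']
```

A range query like [2 TO 30] evaluated in string order would miss 10 and include nothing between "2" and "30" the way numeric intuition expects.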
Solr stats.facet on TrieField doesn't work
This seems a known issue: http://wiki.apache.org/solr/StatsComponent

   "TrieFields has to use a precisionStep of -1 to avoid using UnInvertedField.java. Consider using one field for doing stats, and one for doing range facetting on."

To fix this problem and support facet search on this field, I have to create another field with precisionStep=2147483647 (Integer.MAX_VALUE). This is not good: it takes more disk space, and it's hard to explain to customers why we need this field.

This problem is already reported and tracked in https://issues.apache.org/jira/browse/SOLR-2976, but there has been no update since 03/Jan/12. Does the Solr team have any plan to fix it?

The following is my test result. I have two fields:
- effectiveSize_tl: type TrieLongField, precisionStep=8, default setting
- ctime_tdt: type TrieDateField, precisionStep=6, default setting

I also created two more fields:
- effectiveSize_tlMinus: same as effectiveSize_tl, except precisionStep=2147483647
- ctime_tdtMinus: same as ctime_tdt, except precisionStep=2147483647

This throws an exception:

http://localhost:5678/solr/select?q=*:*&rows=0&stats=true&stats.field=effectiveSize_tl&stats.facet=ctime_tdt
   <str name="msg">Invalid Date String:'#8;'</str>

These work:

http://localhost:5678/solr/select?q=*:*&rows=0&stats=true&stats.field=effectiveSize_tl
http://localhost:5678/solr/select?q=*:*&rows=0&stats=true&stats.field=ctime_tdt

This works correctly, using both precisionStep=2147483647 fields:

http://localhost:5678/solr/select?q=*:*&rows=0&stats=true&stats.field=effectiveSize_tlMinus&stats.facet=ctime_tdtMinus

This doesn't throw an error, but the result is totally incorrect:

http://localhost:5678/solr/select?q=*:*&rows=0&stats=true&stats.field=effectiveSize_tlMinus&stats.facet=ctime_tdt

This still throws an exception:

http://localhost:5678/solr/select?q=*:*&rows=0&stats=true&stats.field=effectiveSize_tl&stats.facet=ctime_tdtMinus
   <str name="msg">Invalid Date String:' #1;#0;#0;#0; #8;t#1;#20;#0;'</str>

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-stats-facet-on-TrieField-doesn-t-work-tp4028175.html
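The working request above, using the dedicated precisionStep=2147483647 copies, can be sketched as a small URL builder; the field names are the ones from the test, and the helper itself is hypothetical:

```python
from urllib.parse import urlencode

def stats_url(base, stats_field, facet_field):
    """Build a StatsComponent request that facets stats over facet_field.
    Both fields are assumed to be precisionStep=2147483647 copies, which
    sidesteps the UnInvertedField problem with default Trie precision steps."""
    params = [("q", "*:*"), ("rows", 0), ("stats", "true"),
              ("stats.field", stats_field), ("stats.facet", facet_field)]
    return base + "/select?" + urlencode(params)

print(stats_url("http://localhost:5678/solr",
                "effectiveSize_tlMinus", "ctime_tdtMinus"))
```

Note that urlencode percent-escapes the `*:*` query; Solr accepts either form.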
RE: Is there a way to round data when index, but still able to return original content?
Sorry to ask a question again, but I want to round TrieDateField and TrieLongField values; they don't seem to support configuring an analyzer (charFilter, tokenizer, or filter). What should I do? Now I am thinking of writing a custom date or long field type. Is there any other way? :) Thanks :)

--
View this message in context: http://lucene.472066.n3.nabble.com/Is-there-a-way-to-round-data-when-index-but-still-able-to-return-original-content-tp4025405p4025793.html
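One alternative to a custom field type is rounding on the client before indexing: keep the original value in one stored field and put the rounded value in a second field used for faceting or stats. A sketch under that assumption (the field names and helpers here are hypothetical, not Solr API):

```python
from datetime import datetime, timezone

def round_to_day(dt):
    """Truncate a datetime to midnight, similar in spirit to Solr's NOW/DAY."""
    return dt.replace(hour=0, minute=0, second=0, microsecond=0)

def round_long(value, step):
    """Round a long down to the nearest multiple of `step`."""
    return (value // step) * step

doc = {
    "id": "doc-1",
    # Original value, stored and returned to the user unchanged:
    "ctime": datetime(2012, 11, 6, 13, 45, tzinfo=timezone.utc),
    "size": 123456,
}
# Rounded copies, indexed for faceting/stats only:
doc["ctime_rounded"] = round_to_day(doc["ctime"])
doc["size_rounded"] = round_long(doc["size"], 1000)
```

This trades a little index size for avoiding a custom FieldType, and the stored original still comes back untouched in search results.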
Monitor Deleted Event
When some docs are deleted from the Solr server, I want to execute some code, for example add a record such as {contentid, deletedat} to another Solr server or a database. How can I do this through Solr or Lucene? Thanks for any reply and help :)

--
View this message in context: http://lucene.472066.n3.nabble.com/Monitor-Deleted-Event-tp4015624.html
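Server-side, this kind of hook is typically a custom UpdateRequestProcessor plugged into the update chain. If all deletes go through your own client code, a simpler client-side alternative is to wrap the delete call and record the event before forwarding it. A sketch with hypothetical names (not a real Solr client API):

```python
from datetime import datetime, timezone

class DeleteAuditor:
    """Wraps a Solr client's delete: records {contentid, deletedat}
    to an audit sink, then forwards the delete to the real client."""

    def __init__(self, solr_client, audit_sink):
        self.solr = solr_client
        self.audit = audit_sink  # e.g. another Solr core or a database table

    def delete(self, contentid):
        self.audit.append({
            "contentid": contentid,
            "deletedat": datetime.now(timezone.utc).isoformat(),
        })
        self.solr.delete(contentid)
```

The obvious limitation is that deletes issued outside this wrapper (or delete-by-query) are not captured, which is why the UpdateRequestProcessor route is the robust one.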
Re: How to import a part of index from main Solr server(based on a query) to another Solr server and then do incremental import at intervals later(the updated index)?
Hi all, sorry for the late response :) Thanks for your reply.

I think Solr replication may not help in my case, as the central server stores all docs of all users (1000+), and on each client I only want to copy the index of his/her docs created or changed in the last 2 weeks (for example); after the first import, a delta-import each day picks up the changed or deleted index from the remote central server.

In my current implementation, I use DataImportHandler and SolrEntityProcessor. In short, I wrote a new request handler, ImportLocalCacheHandler, at url /importcache.

For the first import, I call /importcache?command=full-import&from=jeffery&first_index_time={first_index_time}. In ImportLocalCacheHandler, I build a query such as from:jeffery AND last_modified:[{first_index_time} TO NOW], and then call /dataimport?command=full-import&query={previous_query}. After it succeeds, I save last_index_time to a property file.

For delta-import, I call /importcache?command=delta-import. In ImportLocalCacheHandler, I build a query like from:jeffery AND last_modified:[{last_index_time} TO NOW] and call /dataimport?command=full-import&clean=false&query={previous_query}. This imports the index of docs created or changed between last_index_time and NOW.

But now I am trying to figure out how to remove, from the local cache server, the index entries that are already deleted on the remote server but still exist in the local cache.

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-import-a-part-of-index-from-main-Solr-server-based-on-a-query-to-another-Solr-server-and-then-tp4013479p4015633.html
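The delta step described above boils down to building a range query from the saved timestamp. A minimal sketch (the helper is hypothetical; the field names match the post):

```python
def build_delta_query(user, last_index_time):
    """Range query for docs owned by `user` modified since the last import."""
    return "+from:{u} +last_modified:[{t} TO NOW]".format(
        u=user, t=last_index_time)

q = build_delta_query("jeffery", "2012-10-01T00:00:00Z")
print(q)   # +from:jeffery +last_modified:[2012-10-01T00:00:00Z TO NOW]
```

Passing this query to /dataimport with clean=false adds and overwrites changed docs without wiping the local core, which is what makes the repeated full-import behave like an incremental one.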
How to import a part of index from main Solr server(based on a query) to another Solr server and then do incremental import at intervals later(the updated index)?
I have a main Solr server (solr1) which stores indexes of all docs, and I want to implement the following:

1. First make a full import of my docs updated/created recently (last 1 or 2 weeks) from solr1.
2. Make delta imports at intervals to copy the changes of my docs from solr1 to solr2 (docs may be deleted, updated, or created during this period), like the function SqlEntityProcessor supports when importing data from a DB into Solr.

http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor

SolrEntityProcessor can do a full-import from one Solr server to another based on a query (using the query parameter in the config file), but it seems it can't do a delta import later: there are no deltaImportQuery and deltaQuery configurations, which SqlEntityProcessor supports.

I have a field last_modified which records the timestamp when a doc is created or updated. Task 1 can be easily implemented:

   <entity name="sep" processor="SolrEntityProcessor"
           query="+from:jeffery +last_modified:[${dataimporter.request.start_time} TO NOW]"
           url="http://mainsolr:8080/solr/"/>

But how can I implement incremental import with SolrEntityProcessor? It seems SolrEntityProcessor doesn't support command=delta-import. Thanks for any reply and help :)

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-import-a-part-of-index-from-main-Solr-server-based-on-a-query-to-another-Solr-server-and-then-tp4013479.html
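Since SolrEntityProcessor has no deltaQuery, one workaround is to drive the "delta" from the caller: persist the last successful import time and substitute it into a full-import request with clean=false. A sketch under those assumptions (file name, user name, and base URL are all hypothetical):

```python
import os
from urllib.parse import urlencode

STATE_FILE = "last_index_time.properties"

def read_last_index_time(default="2000-01-01T00:00:00Z"):
    """Load the timestamp of the last successful import, if any."""
    if not os.path.exists(STATE_FILE):
        return default
    with open(STATE_FILE) as f:
        return f.read().strip()

def write_last_index_time(ts):
    """Persist the timestamp after a successful import."""
    with open(STATE_FILE, "w") as f:
        f.write(ts)

def delta_import_url(base):
    """full-import with clean=false behaves as an incremental import:
    matching docs are re-imported, existing local docs are kept."""
    q = "+from:jeffery +last_modified:[%s TO NOW]" % read_last_index_time()
    params = urlencode([("command", "full-import"),
                        ("clean", "false"),
                        ("query", q)])
    return base + "/dataimport?" + params
```

After each successful run, write the new timestamp; deletions on the remote server still need separate reconciliation, as the follow-up post in this thread notes.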