Fuzzy search in solr

2013-05-24 Thread Sagar Chaturvedi
Hi, how do I perform a fuzzy search in Solr? Which request handler is used for fuzzy search by default? Regards, Sagar

Why would one not use RemoveDuplicatesTokenFilterFactory?

2013-05-24 Thread Dotan Cohen
I am looking through the schema of a Solr installation that I inherited last year. The original dev, who is unavailable for comment, has two types of text fields: one with RemoveDuplicatesTokenFilterFactory and one without. These fields are intended for full-text search. Why would someone _not_

Re: Scheduling DataImports

2013-05-24 Thread smanad
Thanks for the reply. Regarding the second question, that's actually what I am looking for. My use case is that my DIH runs for 2 HTTP data sources, api1 and api2, which return different TTLs. I was thinking of saving this in a file, something like: url:api1, timestamp:100, expires: 60 url:api2,

zk disconnects and failure to retry?

2013-05-24 Thread Daniel Collins
Had a scenario on a dev system here that has me confused. We have a simple Solr cloud (dev) system running 4.3, 4 shards, running on 2 machines (2 instances per machine), 2 ZKs (external) and no replicas (or 1 replica depending on your definition, we only have 1 instance of each shard!) Yes, we

AW: Core admin action CREATE fails for existing core

2013-05-24 Thread André Widhani
Added the issue. https://issues.apache.org/jira/browse/SOLR-4857 Core admin action RELOAD lacks requests parameters to point core to another config or schema file or data dir From: André Widhani [andre.widh...@digicol.de] Sent: Thursday, 23 May

Re: Question about Coordination factor

2013-05-24 Thread Upayavira
Have you tried 4.3 yourself? I'm sure it wouldn't be hard to do a simple comparison on that feature. Upayavira On Fri, May 24, 2013, at 03:00 AM, Kazuaki Hiraga wrote: Thank you for your comment. Due to historical reasons, our organization uses a trunk version of Solr-4.0, which is a bit old

Re: Distributed query: strange behavior.

2013-05-24 Thread Luis Cappa Banda
Uhm... that sounds reasonable. My data model may allow duplicate keys, but it's quite difficult. My key is a hash formed from a URL during a crawling process, and it's possible to re-crawl an existing URL. I think that I need to find a new way to compose a unique key to avoid this kind of bad

Nested Facets and distributed shard system.

2013-05-24 Thread ramrrajesh
We are facing an issue with a nested facets use case. The data is indexed across multiple shards and is searched by several Tomcat instances. Use case: - User wants to navigate the results category by category. - E.g.: Country United status20/UnitedStates State

Re: Russian stopwords

2013-05-24 Thread igiguere
Just so everyone knows: it turns out my stopwords.txt was OK after all. It works correctly on Linux (Ubuntu) and, strangely, on a colleague's Windows 7. My computer is also Windows 7. The only difference between the two Windows machines is the language of the interface (French for mine, English

Re: Why would one not use RemoveDuplicatesTokenFilterFactory?

2013-05-24 Thread Jack Krupansky
The primary purpose of this filter is in conjunction with the KeywordRepeatFilterFactory and a stemmer, to remove the tokens that did not produce a stem from the original token, so the keyword duplicate is no longer needed. The goal is to index both the stemmed and unstemmed terms at the same
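
A toy sketch of the pipeline described in this message may help. It is not Lucene code: the toy_stem function (it only strips a trailing "s") and the (position, term) tuples are assumptions made up for illustration; only the filter names come from the message.

```python
def toy_stem(term):
    """Hypothetical stand-in for a real stemmer: strip one trailing 's'."""
    return term[:-1] if term.endswith("s") and len(term) > 1 else term


def keyword_repeat(tokens):
    """Emit every token twice at the same position, like KeywordRepeatFilterFactory:
    one copy flagged as a keyword (protected from stemming), one stemmable copy."""
    stream = []
    for position, term in enumerate(tokens):
        stream.append((position, term, True))   # keyword copy, left unstemmed
        stream.append((position, term, False))  # copy the stemmer may rewrite
    return stream


def stem(stream):
    """Stem only the copies that are not flagged as keywords."""
    return [(pos, term if is_keyword else toy_stem(term))
            for pos, term, is_keyword in stream]


def remove_duplicates(stream):
    """Drop a token when the same term was already seen at the same position."""
    seen = set()
    result = []
    for pos, term in stream:
        if (pos, term) not in seen:
            seen.add((pos, term))
            result.append((pos, term))
    return result


def analyze(tokens):
    return remove_duplicates(stem(keyword_repeat(tokens)))


print(analyze(["cats", "dog"]))
```

Here "cats" yields both "cats" and "cat" at position 0, while "dog" is unchanged by the toy stemmer, so its second copy is an exact duplicate and is removed.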

Re: Distributed query: strange behavior.

2013-05-24 Thread Valery Giner
Shawn, How is it possible for more than one document with the same unique key to appear in the index, even in different shards? Isn't it a bug by definition? What am I missing here? Thanks, Val On 05/23/2013 09:55 AM, Shawn Heisey wrote: On 5/23/2013 1:51 AM, Luis Cappa Banda wrote: I've

Re: Fuzzy search in solr

2013-05-24 Thread Jack Krupansky
Fuzzy search is a syntax applied to a term, not a handler. For example: alpha~1 will match terms that have an edit distance of 0 or 1 from alpha. All of the search handlers support fuzzy search. Some query parsers, such as dismax, do not, but the standard Solr query parser and edismax query
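
To make "edit distance of 0 or 1" concrete, here is a minimal sketch using the classic Levenshtein distance (insert, delete, substitute). It only illustrates the concept: the function names and the sample term list are made up, and Lucene's own implementation differs (as far as I know it builds an automaton and, by default, also counts a swap of two adjacent characters as a single edit).

```python
def edit_distance(a, b):
    """Classic Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def fuzzy_matches(query_term, max_edits, indexed_terms):
    """Return the indexed terms within max_edits of query_term, like term~max_edits."""
    return [t for t in indexed_terms if edit_distance(query_term, t) <= max_edits]


# Hypothetical index terms, for illustration only.
terms = ["alpha", "alpa", "alphas", "beta", "alpine"]
print(fuzzy_matches("alpha", 1, terms))
```

With this plain Levenshtein variant, a transposed term such as "alpah" is two edits away from "alpha", so it would not match at ~1 here.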

Re: Restaurant availability from database

2013-05-24 Thread Alexandre Rafalovitch
On Thu, May 23, 2013 at 6:47 PM, Amit Nithian anith...@gmail.com wrote: Hossman did a presentation on something similar to this using spatial data at a Solr meetup some months ago. http://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/ This presentation rocks (I like

Re: Distributed query: strange behavior.

2013-05-24 Thread Shalin Shekhar Mangar
The uniqueKey is enforced within the same shard/index only. On Fri, May 24, 2013 at 6:39 PM, Valery Giner valgi...@research.att.com wrote: Shawn, How is it possible for more than one document with the same unique key to appear in the index, even in different shards? Isn't it a bug by
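
The point about per-shard enforcement can be sketched with a toy model. Nothing here is Solr code: ToyShard, route_by_id, and the sample documents are hypothetical, and the CRC-based routing is just one simple way to send a given id to the same shard every time.

```python
import zlib


class ToyShard:
    """Hypothetical shard: a dict keyed by the unique key, so uniqueness is local."""

    def __init__(self):
        self.docs = {}

    def add(self, doc):
        # Adding a document whose id already exists in THIS shard replaces it.
        self.docs[doc["id"]] = doc


def route_by_id(doc_id, num_shards):
    """Deterministic routing: the same id always maps to the same shard index."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def count_copies(shards, doc_id):
    """Count how many shards hold a document with this id."""
    return sum(1 for shard in shards if doc_id in shard.docs)


shards = [ToyShard(), ToyShard()]
first_crawl = {"id": "url-hash-1", "title": "first crawl"}
re_crawl = {"id": "url-hash-1", "title": "re-crawl"}

# Indexing straight to arbitrary shards: each shard accepts the id independently.
shards[0].add(first_crawl)
shards[1].add(re_crawl)
print(count_copies(shards, "url-hash-1"))  # two copies, one per shard

# Routing by id instead keeps every version of the document on one shard.
routed = [ToyShard(), ToyShard()]
for doc in (first_crawl, re_crawl):
    routed[route_by_id(doc["id"], len(routed))].add(doc)
print(count_copies(routed, "url-hash-1"))  # one copy, the re-crawl replaced the first
```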

Re: Upgrading from SOLR 3.5 to 4.2.1 Results.

2013-05-24 Thread rishi.easwa...@aol.com
Here you go. No. of CPUs per node - 16 on most hosts. No. of disks per host - 1 dedicated disk per Solr instance, as we are pretty write heavy. VMs per host - on most hosts, 6 instances running at 4GB RAM each. GC params -

Keeping a rolling window of indexes around solr

2013-05-24 Thread Saikat Kanjilal
Hello Solr community folks, I am doing some investigative work on how to roll and manage indexes inside our Solr configuration. To date I've come up with an architecture that separates a set of masters that are focused on writes and get replicated periodically and a set of slave shards

error while indexing huge filesystem with data import handler and FileListEntityProcessor

2013-05-24 Thread jerome . dupont
Hello, We are trying to use the data import handler, in particular on a collection which contains many files (one XML file per document). Our configuration works for a small number of files, but the data import fails with an OutOfMemoryError when we run it on 10M files (in several directories...). This is

Re: Keeping a rolling window of indexes around solr

2013-05-24 Thread Shawn Heisey
On 5/24/2013 8:25 AM, Saikat Kanjilal wrote: Anyways would love to hear thoughts and usecases that are similar from the community. Your use-case sounds a lot like what loggly was doing back in 2010. http://loggly.com/videos/lucene-revolution-2010/

Re: Keeping a rolling window of indexes around solr

2013-05-24 Thread Shawn Heisey
On 5/24/2013 8:56 AM, Shawn Heisey wrote: On 5/24/2013 8:25 AM, Saikat Kanjilal wrote: Anyways would love to hear thoughts and usecases that are similar from the community. Your use-case sounds a lot like what loggly was doing back in 2010.

multivalue location_rpt field not indexing with JSON format

2013-05-24 Thread blmak
Hi, I am trying to index multivalued lat/long values in the location_rpt field from a JSON file, and I get the following error when I attempt to index the file: {"responseHeader":{"status":400,"QTime":5},"error":{"msg":"ERROR: [doc=054ac6377d6ca4ad387f73b063000910] Error adding field

RE: Keeping a rolling window of indexes around solr

2013-05-24 Thread Saikat Kanjilal
I would like to see something similar to this existing in the solr world or I could gladly help create this: https://github.com/karussell/elasticsearch-rollindex We are evaluating both elasticsearch and our current solr architecture and need to manage write heavy use-cases within a rolling

Re: multivalue location_rpt field not indexing with JSON format

2013-05-24 Thread David Smiley (@MITRE.org)
Hi Barbra, Solr needs to see a String for each point value, not a 2-element array. Your doc should look like: [{"id":"054ac6377d6ca4ad387f73b063000910","keywords":["time", "trouble", "exactly"],"description":"a anno is an anno is an anno",
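
A minimal sketch of that conversion, turning 2-element arrays into one string per point before posting the JSON. The coordinates are made up, the helper name is hypothetical, and the "lat,lon" string form is an assumption based on my understanding of what Solr 4 spatial fields accept; check it against your schema.

```python
import json


def points_to_strings(points):
    """Convert [[lat, lon], ...] into ["lat,lon", ...], one string per point."""
    return [f"{lat},{lon}" for lat, lon in points]


# Made-up coordinates; the id is the one quoted in the thread.
raw_points = [[38.9, -77.0], [40.7, -74.0]]

doc = {
    "id": "054ac6377d6ca4ad387f73b063000910",
    "location_rpt": points_to_strings(raw_points),  # multivalued field: list of strings
}

# Solr's JSON update format takes a list of documents.
print(json.dumps([doc]))
```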

Re: Restaurant availability from database

2013-05-24 Thread David Smiley (@MITRE.org)
Use this reference: http://wiki.apache.org/solr/SpatialForTimeDurations Alexandre Rafalovitch wrote: On Thu, May 23, 2013 at 6:47 PM, Amit Nithian <anithian@> wrote: Hossman did a presentation on something similar to this using spatial data at a Solr meetup some months ago.

Re: Warning: no uniqueKey specified in schema.

2013-05-24 Thread O. Olson
Thank you Shawn for clearing this up. I was only using the “db” core, and forgot that this example had a few other cores which have their own schema.xml. I commented out this core in the solr.xml and now get no warnings :-). O. O.

Re: Can anyone explain this Solr query behavior?

2013-05-24 Thread Shankar Sundararaju
Hi Upayavira, Thank you for your analysis. I thought 'AND' groupings are supported as per documentation: http://docs.lucidworks.com/display/solr/The+Extended+DisMax+Query+Parser http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/queryparsersyntax.html#Grouping But yes,

Solr java.io.FileNotFoundException

2013-05-24 Thread atuldj.jadhav
Hi Team, I need your help with a critical issue I am facing. I end up losing my segment. Frequently I get the FileNotFoundException below: ../data/index/segments_c (No such file or directory) at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1103) Segment name keeps

Re: Russian stopwords

2013-05-24 Thread Alexandre Rafalovitch
Sounds like it may be a UTF-specific issue when you are _reading it in_. See if you can change the default locale before starting the Java process (I think it is an environment variable) and check whether that makes a difference. If you have a very easy test case, I would be happy to check it on Mac and Windows.

Re: Can anyone explain this Solr query behavior?

2013-05-24 Thread Shankar Sundararaju
Hi Jack Krupansky, Thank you for your reply. I would like to know how you got the error logged. Is there any special flag I have to turn on? I ask because I don't see it in my solr.log even after switching the log level to DEBUG. <str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:*

HTTP Status 503 - Server is shutting down

2013-05-24 Thread srinalluri
Hi, I am unable to set up Solr 4. I am getting this error: HTTP Status 503 - Server is shutting down. I don't see anything in the Tomcat logs. conf/Catalina/localhost\solr4new.xml: <?xml version="1.0" encoding="utf-8"?> <Context docBase="/apps/solr1/solr4new/solr.war" debug="0" crossContext="true"> <Environment

Re: HTTP Status 503 - Server is shutting down

2013-05-24 Thread Shawn Heisey
On 5/24/2013 11:14 AM, srinalluri wrote: I am unable to set up Solr 4. I am getting this error: HTTP Status 503 - Server is shutting down. I don't see anything in the Tomcat logs. conf/Catalina/localhost\solr4new.xml: <?xml version="1.0" encoding="utf-8"?> <Context

RE: Problem with document routing with Solr 4.2.1

2013-05-24 Thread Jean-Sebastien Vachon
Hi All, Evan Sayer from LucidWorks found the problem in our schema, so this problem is not related at all to SolrCloud itself (well, it is, but at least it is not a bug). I don't know why :( but at some point we changed the type of the id field from 'string' to 'text'. Since we are doing custom

Re: HTTP Status 503 - Server is shutting down

2013-05-24 Thread srinalluri
I am using Solr 4.3.0. I have already copied the slf4j*.jar files and the log4j jar file to the Tomcat lib directory, and restarted Tomcat. Before copying these jar files, I got a 404 error. Now I am getting this 503 error for host:18080/solr4new/ I don't have ZooKeeper. Is ZooKeeper required in order to work on solr

Re: Russian stopwords

2013-05-24 Thread igiguere
A colleague stumbled upon this: http://stackoverflow.com/questions/361975/setting-the-default-java-character-encoding The second answer, the environment variable JAVA_TOOL_OPTIONS, did the job. JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8 Happy stop-wording!
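
The symptom is consistent with a UTF-8 file being read with a single-byte platform default encoding. The sketch below reproduces that effect in Python rather than Java. The stopword list, the helper name, and the choice of cp1252 as the "wrong" default are all assumptions; the thread does not say which code page the French Windows machine used.

```python
def decode_stopwords(raw_bytes, encoding):
    """Decode a stopword file's bytes with the given encoding, one word per line."""
    return raw_bytes.decode(encoding).split("\n")


# A few Russian stopwords, stored on disk as UTF-8 (illustrative sample).
stopwords = ["и", "в", "на", "не"]
file_bytes = "\n".join(stopwords).encode("utf-8")

read_correctly = decode_stopwords(file_bytes, "utf-8")
read_with_wrong_default = decode_stopwords(file_bytes, "cp1252")

print(read_correctly)           # the original words
print(read_with_wrong_default)  # mojibake that will never match an indexed token
```

Forcing file.encoding to UTF-8 through JAVA_TOOL_OPTIONS, as in the message above, makes the JVM take the first path regardless of the operating system's default.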

Re: Multi dimensional spatial search

2013-05-24 Thread Kiran J
Thank you for the excellent explanation, David. My use case is in the signal processing area. I have a wave in the time domain that is converted to the frequency domain on 8 different bands (FFT), i.e., an 8D point. The question for me is: if I have a set of waves (8D points) in the database and I have

Re: HTTP Status 503 - Server is shutting down

2013-05-24 Thread Shawn Heisey
On 5/24/2013 11:55 AM, srinalluri wrote: I am using Solr 4.3.0. I have already copied the slf4j*.jar files and the log4j jar file to the Tomcat lib directory, and restarted Tomcat. Before copying these jar files, I got a 404 error. Now I am getting this 503 error for host:18080/solr4new/ I don't have

Re: Keeping a rolling window of indexes around solr

2013-05-24 Thread Alexandre Rafalovitch
Would collection aliasing help here? From Solr 4.2 release notes: Collection Aliasing. Got time based data? Want to re-index in a temporary collection and then swap it into production? Done. Stay tuned for Shard Aliasing. Regards, Alex.
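
As a rough sketch of what aliasing looks like in practice, the snippet below builds a Collections API CREATEALIAS request URL. The host, alias name, and collection names are hypothetical, and the Collections API is a SolrCloud feature, so this assumes a SolrCloud deployment.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collections):
    """Build a CREATEALIAS request URL that points an alias at one or more collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical rolling window: the alias covers the two most recent monthly collections.
url = create_alias_url(
    "http://localhost:8983/solr",
    "logs_current",
    ["logs_2013_05", "logs_2013_04"],
)
print(url)
```

Issuing the same request again with a different collection list repoints the alias, which is what lets a rolling window advance without clients changing the name they query.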

load balancing internal Solr on Azure

2013-05-24 Thread Kevin Osborn
We are looking to install SolrCloud on Azure. We want it to be an internal service. For some applications that use SolrJ, we can use ZooKeeper. But for other applications that don't talk to Azure, we will need to go through a load balancer to distribute traffic among the Solr instances (VMs, IaaS).

Re: HTTP Status 503 - Server is shutting down

2013-05-24 Thread srinalluri
The following files are at /apps/tomcat/solr4new/lib: ./annotations-api.jar ./tomcat-jdbc.jar ./slf4j-log4j12-1.6.6.jar ./tomcat-dbcp.jar ./catalina-tribes.jar ./tomcat-i18n-fr.jar ./catalina.jar ./jasper.jar ./el-api.jar ./tomcat-api.jar ./jsp-api.jar ./slf4j-api-1.6.6.jar ./tomcat-i18n-es.jar

Re: Can anyone explain this Solr query behavior?

2013-05-24 Thread Jack Krupansky
Oh, I simply changed the query parser type to lucene, with defType=lucene and then I see essentially the same error that edismax does when it internally tries to parse the query. But, it might be nice if DEBUG level logging for edismax did display the error as well and then told you what

Re: Keeping a rolling window of indexes around solr

2013-05-24 Thread Saikat Kanjilal
This is kind of the approach used by Elasticsearch. If I'm not using SolrCloud, will I be able to use shard aliasing? Also, with this approach, how would replication work, and is it even needed? On May 24, 2013, at 12:00 PM, Alexandre Rafalovitch arafa...@gmail.com wrote:

Re: HTTP Status 503 - Server is shutting down

2013-05-24 Thread Shawn Heisey
On 5/24/2013 1:32 PM, srinalluri wrote: logging.properties is at conf folder, not in lib. The following path I added in that file. But still solr.2013-05-24.log is blank. 5localhost.org.apache.juli.FileHandler.directory = /apps/solr1/solr4new/logs The logging.properties file will not be used.