Re: Possible regression for Solr 4.6.0 - commitWithin does not work with replicas

2014-01-29 Thread Elodie Sannier
I have this configuration on test servers only, not in production (it results from the order in which the instances were started). Elodie On 01/23/2014 04:35 PM, Shawn Heisey wrote: On 12/11/2013 2:41 AM, Elodie Sannier wrote: collection fr_blue: - shard1 - server-01 (replica1), server-01 (replica2) -

ExtractingUpdateProcessor release date

2014-01-29 Thread neerajp
Hi All, In my application, I have a requirement that can be solved by ExtractingUpdateProcessor. May I know when it will be released (JIRA ticket: SOLR-1763)? Could you please tell me where I can get the ExtractingUpdateProcessor code base?

Integrating Oauth2 with Solr MailEntityProcessor

2014-01-29 Thread Dileepa Jayakody
Hi All, I'm doing a research project on email reputation analysis, and for this project I'm planning to use the Apache Solr, Tika and Mahout projects to analyse, store and query the reputation of emails and correspondents. For indexing emails in Solr I'm going to use the MailEntityProcessor [1]. But I

TikaEntityProcessor + multivalue field as url source

2014-01-29 Thread Bustaa
Hello Solr Users, I'm trying to get Tika's BinFileDataSource to take the filenames from a multivalue field (array) but I'm getting the following exception: Debug output from running dataimport (shortened): query, LONG SQL-QUERY , time-taken, 0:0:0.11,

Re: TikaEntityProcessor + multivalue field as url source

2014-01-29 Thread Ahmet Arslan
Hi Bustaa, Can you paste your data-config.xml?  Also, did you consider using ManifoldCF [1] to crawl/index your CMS? What CMS are you using? [1] http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html#repositoryconnectiontypes On Wednesday, January 29, 2014 1:03 PM,

LUCENE-5388 AbstractMethodError

2014-01-29 Thread Markus Jelsma
Hi, We have a development environment running trunk but have custom analyzers and token filters built on 4.6.1. Now the constructors have changed somewhat and stuff breaks. Here's a consumer trying to get a TokenStream from an Analyzer object doing TokenStream stream =

Re: TikaEntityProcessor + multivalue field as url source

2014-01-29 Thread Bustaa
Thanks for your suggestions, Ahmet. We are using the Typo3 CMS (with custom extensions / db-schemas). We are using Solarium to connect to the Solr instance. The schema is pretty simple: <script><![CDATA[ function PrependPath(row){ var files =

Re: Where is a canonical SolrJ example(s)?

2014-01-29 Thread Michael Sokolov
If I were starting from a Solr distribution (i.e. not Maven), I would extract everything from the solr.war in WEB-INF/lib and WEB-INF/classes, and put that on my classpath. -Mike On 01/28/2014 06:29 PM, Alexandre Rafalovitch wrote: Thanks Mike, Sounds like Maven approach worked, I haven't
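
The extraction step described above can be sketched in Python. This is only an illustration under assumptions: the helper name `extract_war_classpath` and the destination directory are hypothetical, and the sketch covers just the two WAR directories named in the message, nothing else a SolrJ client might need.

```python
import zipfile
from pathlib import Path


def extract_war_classpath(war_path, dest_dir):
    """Extract WEB-INF/lib and WEB-INF/classes from a WAR file.

    Returns a list of classpath entries: the classes directory (if any)
    followed by every extracted jar, sorted by name.
    """
    dest = Path(dest_dir)
    wanted = ("WEB-INF/lib/", "WEB-INF/classes/")
    with zipfile.ZipFile(war_path) as war:
        # Keep only the archive members under the two wanted directories.
        members = [name for name in war.namelist() if name.startswith(wanted)]
        war.extractall(dest, members=members)

    entries = []
    classes_dir = dest / "WEB-INF" / "classes"
    if classes_dir.is_dir():
        entries.append(str(classes_dir))
    lib_dir = dest / "WEB-INF" / "lib"
    if lib_dir.is_dir():
        entries.extend(str(jar) for jar in sorted(lib_dir.glob("*.jar")))
    return entries
```

Joining the returned list with `os.pathsep` gives a string that could be passed to `java -cp`.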

KeywordTokenizerFactory with whitespace

2014-01-29 Thread Aleksander Akerø
Hi, According to the Solr documentation the solr.KeywordTokenizerFactory should not do any tokenizing at all, but to me it seems to be splitting on whitespace, e.g. space. For example, I have the value FE 009 stored in the index in the field number, and what I search for is the exact same string FE 009

high memory usage with small data set

2014-01-29 Thread Johannes Siegert
Hi, we are using Apache Solr Cloud within a production environment. When the maximum heap space is reached, Solr access times slow down for a short time while the garbage collector is working. We use the following configuration: - Apache Tomcat as webserver to run the Solr web

Re: KeywordTokenizerFactory with whitespace

2014-01-29 Thread Aleksander Akerø
Sorry guys, please ignore this. It was not ready to be sent but got sent mistakenly. Will send a proper one later on. Aleksander Akerø

Re: Where is a canonical SolrJ example(s)?

2014-01-29 Thread Alexandre Rafalovitch
And then go back and also add servlets jar from examples/lib. And maybe everything from dist/solrj-lib just because they are there, even if they are duplicates of some (all?) of the above-mentioned libraries. That's my point. A newbie would not know this. If there is a script that has all that on

Re: KeywordTokenizerFactory with whitespace

2014-01-29 Thread Alexandre Rafalovitch
Well, while you are preparing that one, is there any reason you have two analyzers and both are 'index' type? One would probably be query type, no? Regards, Alex.

KeywordTokenizerFactory - trouble with exact matches

2014-01-29 Thread Aleksander Akerø
Hi, I'll try properly this time. According to the Solr documentation the solr.KeywordTokenizerFactory should not do any tokenizing at all. Thus, if I understand this correctly, it should only return exact matches given that this is the only analyzer defined in the field type. Such as the following

Re: KeywordTokenizerFactory - trouble with exact matches

2014-01-29 Thread Aruna Kumar Pamulapati
Hi, I think your misunderstanding is about the lowercase factory (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LowerCaseTokenizerFactory). You are correct about KeywordTokenizerFactory, but the lowercase factory: Creates tokens by lowercasing all letters and dropping

Re: KeywordTokenizerFactory - trouble with exact matches

2014-01-29 Thread Aleksander Akerø
Thanks for the quick answer, but it doesn't help if I remove the lowercase analyzer like so: <fieldType name="keyword" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.KeywordTokenizerFactory"/> </analyzer>

Re: KeywordTokenizerFactory - trouble with exact matches

2014-01-29 Thread Aleksander Akerø
Update: I'm guessing that this has nothing to do with the tokenizer. I tried the string field type as well, but still get the same results. So this must have to do with some other Solr config. What confuses me is that when I search for 1005, which is another valid value to search for, it works perfectly,

Re: KeywordTokenizerFactory - trouble with exact matches

2014-01-29 Thread Jack Krupansky
If you change the analyzer for a Solr field, such as adding, removing, or changing attributes of token filters, you must/should reindex all data (add it to the index again to re-analyze it.) In your case, the data was indexed as lower case, so after your changes a query with upper case would
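
The effect described above can be illustrated with a toy model. This is not Solr code: the two `analyze_*` functions are hypothetical stand-ins for "keyword tokenizer plus lowercase filter" and "keyword tokenizer only", and the index is just a Python set of terms.

```python
def analyze_old(value):
    # Stand-in for the old analyzer: whole value as one token, lowercased.
    return [value.lower()]


def analyze_new(value):
    # Stand-in for the new analyzer: whole value as one token, unchanged.
    return [value]


def matches(index_terms, query_terms):
    # A query matches only if every query term is present in the index.
    return all(term in index_terms for term in query_terms)


# Data indexed BEFORE the analyzer change keeps its old, lowercased terms.
stale_index = set(analyze_old("FE 009"))

# Queries are analyzed with the NEW analyzer, so the terms no longer line up.
stale_hit = matches(stale_index, analyze_new("FE 009"))

# After reindexing, index-time and query-time analysis agree again.
fresh_index = set(analyze_new("FE 009"))
fresh_hit = matches(fresh_index, analyze_new("FE 009"))
```

Here `stale_hit` is `False` and `fresh_hit` is `True`, which is the mismatch that reindexing removes.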

Re: KeywordTokenizerFactory - trouble with exact matches

2014-01-29 Thread Alexandre Rafalovitch
I think the whitespace might also be the issue. The query gets parsed by standard component that splits it on space before passing individual components into the field searches. Try enabling autoGeneratePhraseQueries on the field (or field type) and reindexing. See if that makes a difference.
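
Separately from the autoGeneratePhraseQueries suggestion, a common way to keep a value containing a space together is to send it as a quoted phrase. The sketch below only builds the query string; the helper name `phrase_query` is hypothetical, and the field name `number` and value `FE 009` are taken from the original question in this thread.

```python
from urllib.parse import urlencode


def phrase_query(field, value):
    """Build a field query whose value is wrapped in double quotes."""
    # Escape backslashes first, then double quotes, so the phrase stays intact.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# Unquoted, the parser sees two whitespace-separated pieces.
unquoted_pieces = "number:FE 009".split()

# Quoted, the value travels as a single phrase.
q = phrase_query("number", "FE 009")
encoded = urlencode({"q": q})
```

The `encoded` string can be appended to a search request; which handler path it is sent to depends on the installation.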

Solr spatial clustering GPS

2014-01-29 Thread Guido Medina
Hi, Is there a way to cluster groups of nearby GPS locations in a Solr query (GPS inside a quadrant)? We have everything in place (location_rpt with Spatial4J and JTS) and we have been using Solr spatial for a while with GPS and Polygons in the map, just wondering if there is a way to return a list of

Not finding part of fulltext field when word ends in dot

2014-01-29 Thread Thomas Michael Engelke
Hello everybody, we have a legacy solr installation in version 3.6.0.1. One of the indices defines a field named content as a fulltext field where a product description will reside. One of the records indexed contains the following data (excerpt): z. B. in der Serie 26KA. I had the problem that

Re: Not finding part of fulltext field when word ends in dot

2014-01-29 Thread Jack Krupansky
What field type and analyzer/tokenizer are you using? -- Jack Krupansky

Re: Not finding part of fulltext field when word ends in dot

2014-01-29 Thread Thomas Michael Engelke
The fieldType definition is a tad on the longer side: <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/>

Re: Dead node, but clusterstate.json says active, won't sync on restart

2014-01-29 Thread Greg Preston
If you removed the tlog and index and restart it should resync, or something is really crazy. It doesn't, or at least if it tries, it's somehow failing. I'd be ok with the sync failing for some reason if the node wasn't also serving queries. -Greg On Tue, Jan 28, 2014 at 11:10 AM, Mark

Re: Not finding part of fulltext field when word ends in dot

2014-01-29 Thread Jack Krupansky
You might want to add autoGeneratePhraseQueries=true to your field type, but I don't think that would cause a break when going from 3.6 to 4.x. The default for that attribute changed in Solr 3.5. What release was your data indexed using? There may have been some subtle word delimiter filter

Re: Dead node, but clusterstate.json says active, won't sync on restart

2014-01-29 Thread Mark Miller
What's in the logs of the node that won't recover on restart after clearing the index and tlog? - Mark On Jan 29, 2014, at 11:41 AM, Greg Preston gpres...@marinsoftware.com wrote: If you removed the tlog and index and restart it should resync, or something is really crazy. It doesn't, or

Is there a way to ignore elevate.xml through SOLR parameters?

2014-01-29 Thread Developer
I am currently using elevate.xml to elevate a few documents based on some search keywords. We have a requirement to ignore the elevate.xml and boost the documents based on original scoring factors when a user is an authenticated user. Is it something that can be done by using Solr parameters?

Re: Is there a way to ignore elevate.xml through SOLR parameters?

2014-01-29 Thread Chris Hostetter
: search keywords. We have a requirement to ignore the elevate.xml and boost : the documents based on original scoring factors when a user is an : authenticated user. : : Is it something that can be done by using SOLR parameters?

Re: KeywordTokenizerFactory - trouble with exact matches

2014-01-29 Thread Aleksander Akerø
Thanks a lot, I'll try the autoGeneratePhraseQueries property and see how that works. Regarding the reindexing tip, it's a good tip, but due to my current on-the-fly setup on the servers at work I basically have to build a project with Maven and deploy to Tomcat, wherein the index lies, and I

Re: Dead node, but clusterstate.json says active, won't sync on restart

2014-01-29 Thread Greg Preston
I've attached the log of the downed node (truffle-solr-4). This is the relevant log entry from the node it should replicate from (truffle-solr-5): [29 Jan 2014 19:31:29] [qtp1614415528-74] ERROR (org.apache.solr.common.SolrException) - org.apache.solr.common.SolrException: I was asked to wait on

Re: Required local configuration with ZK solr.xml?

2014-01-29 Thread Jeff Wartes
...the difference between that example and what you are doing here is that in that example, because both nodes already had collection1 instance dirs, they expected to be part of collection1 when they joined the cluster. And that, I think, is my misunderstanding. I had assumed that the link

Regarding Solr Faceting on the query response.

2014-01-29 Thread Kuchekar
Hi All, Is there a way to facet only on the result set returned, rather than on all the indexed docs? A Solr query with faceting switched on takes a long time to respond. I see that it tends to get the distinct values in the facet field and then give the

4.6 Core Discovery coreRootDirectory not working

2014-01-29 Thread Sam Batschelet
Hello, this is my first post to your group. I am in the process of setting up a development environment using Solr. We will require multiple cores managed by multiple users in the following layout. I am running a fairly vanilla version of 4.6 solrHome /home/camp/example/solr/solr.xml cores

Re: 4.6 Core Discovery coreRootDirectory not working

2014-01-29 Thread Sam Batschelet
On Jan 29, 2014, at 4:31 PM, Sam Batschelet wrote: Hello, this is my first post to your group. I am in the process of setting up a development environment using Solr. We will require multiple cores managed by multiple users in the following layout. I am running a fairly vanilla version of 4.6

Re: Use a field without predefining it in the schema

2014-01-29 Thread Hakim Benoudjit
Thanks Steve for the link. It seems very easy to create `new fields` in the `schema` using the `POST request`. But does it mean that I don't have to restart the `solr app`? If so, is this feature available in the latest Solr version (`v4.6`)? 2014-01-29 Alexandre Rafalovitch arafa...@gmail.com There

Re: Use a field without predefining it in the schema

2014-01-29 Thread Hakim Benoudjit
I have found this link https://cwiki.apache.org/confluence/display/solr/Managed+Schema+Definition+in+SolrConfig . I don't know if it's required to modify the schema (see the link) to make it editable by the REST API. I hope that it doesn't clear all the fields that I have added manually to the

Re: ExtractingUpdateProcessor release date

2014-01-29 Thread Chris Hostetter
: In my application, I have the requirement which can be solved by : ExtractingUpdateProcessor. May I know by when it would be released (jIRA : ticket: SOLR-1763) ? : : Can you pls. tell me where I can get ExtractingUpdateProcessor code base ExtractingUpdateProcessor does not currently exist

Re: Use a field without predefining it in the schema

2014-01-29 Thread Steve Rowe
Hi Hakim, You don’t have to restart the Solr app to make use of fields added through the Schema REST API. This feature was first available in Solr v4.4 and remains available in Solr v4.6. Steve On Jan 29, 2014, at 7:11 PM, Hakim Benoudjit h.benoud...@gmail.com wrote: Thanks Steve for the
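
As a rough sketch of what adding a field through the Schema REST API can look like in the 4.x line, the snippet below builds (but does not send) a POST request carrying a JSON list of field definitions. The base URL, the collection name `collection1`, the field name `new_field` and the type `text_general` are all assumptions for illustration, and the helper `build_add_field_request` is hypothetical; the schema must be managed and mutable for such a request to be accepted.

```python
import json
import urllib.request

# Assumed location of a local Solr collection; adjust to the real setup.
SOLR_COLLECTION_URL = "http://localhost:8983/solr/collection1"


def build_add_field_request(base_url, fields):
    """Build a POST request for <base_url>/schema/fields with a JSON body."""
    body = json.dumps(fields).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/schema/fields",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical field definition used only as an example payload.
request = build_add_field_request(
    SOLR_COLLECTION_URL,
    [{"name": "new_field", "type": "text_general", "stored": True}],
)
```

Passing `request` to `urllib.request.urlopen` would send it; that call is left out so the sketch runs without a Solr server.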

Re: Use a field without predefining it in the schema

2014-01-29 Thread Steve Rowe
Hakim, All the fields you have added manually to the schema will be kept when you switch to using managed schema. From the managed schema page on the Solr Reference Guide you linked to (describing what happens after you add <schemaFactory class="ManagedIndexSchemaFactory">…</schemaFactory> to your

Re: Regarding Solr Faceting on the query response.

2014-01-29 Thread Alexandre Rafalovitch
On Thu, Jan 30, 2014 at 3:43 AM, Kuchekar kuchekar.nil...@gmail.com wrote: company=Apple Did you mean company:Apple? Otherwise, that could be the issue. Regards, Alex.

Re: Regarding Solr Faceting on the query response.

2014-01-29 Thread Nilesh Kuchekar
Yeah it's a typo... I meant company:Apple Thanks Nilesh On Jan 29, 2014, at 8:59 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: On Thu, Jan 30, 2014 at 3:43 AM, Kuchekar kuchekar.nil...@gmail.com wrote: company=Apple Did you mean company:Apple ? Otherwise, that could be the issue.

Re: Integrating Oauth2 with Solr MailEntityProcessor

2014-01-29 Thread Dileepa Jayakody
Hi All, I think Oauth2 integration is a valid use case for Solr when it comes to importing data from user accounts like email, social networks, enterprise stores etc. Do you think Oauth2 integration in Solr will be a useful feature? If so I would like to start working on this. I feel this could

Re: Solr spatial clustering GPS

2014-01-29 Thread Smiley, David W.
Hi Guido, Check this out: http://wiki.apache.org/solr/SpatialClustering It captures some information on the subject. What I really want to do is built-in heatmap-faceting but I have no time right now. ~ David On 1/29/14, 10:38 AM, Guido Medina guido.med...@temetra.com wrote: Hi, Is there a

Re: Regarding Solr Faceting on the query response.

2014-01-29 Thread Mikhail Khludnev
Hello, Do you mean setting http://wiki.apache.org/solr/SimpleFacetParameters#facet.mincount to 1, or do you want to facet only the returned page (rows) instead of the full result set (numFound)? On Thu, Jan 30, 2014 at 6:24 AM, Nilesh Kuchekar kuchekar.nil...@gmail.com wrote: Yeah it's a typo... I meant
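
The first option mentioned above can be sketched as a set of request parameters. The query `company:Apple` comes from this thread; the facet field name `company` and the `rows` value are assumptions for illustration, and the helper `build_facet_params` is hypothetical. With `facet.mincount=1`, values that have no matching document in the result set are left out of the facet list.

```python
from urllib.parse import urlencode, parse_qs


def build_facet_params(query, facet_field, mincount=1, rows=10):
    """Return an encoded query string for a faceted search request."""
    params = [
        ("q", query),
        ("rows", str(rows)),
        ("facet", "true"),
        ("facet.field", facet_field),
        # Drop facet values whose count is below this threshold.
        ("facet.mincount", str(mincount)),
    ]
    return urlencode(params)


query_string = build_facet_params("company:Apple", "company")
decoded = parse_qs(query_string)
```

Note that this counts facet values over the full set of documents matching the query, not only over the returned page of rows.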

Re: Not finding part of fulltext field when word ends in dot

2014-01-29 Thread Thomas Michael Engelke
I'm not sure I got my problem across. If I understand the snippet of documentation right, autoGeneratePhraseQueries only affects queries that result in multiple tokens, which mine does not. The version also is 3.6.0.1, and we're not planning on upgrading to any 4.x version. 2014-01-29 Jack