Index mysql database using data import handler in solr

2013-07-11 Thread archit2112
I want to index a MySQL database in Solr using the Data Import Handler. I have made two tables. The first table holds the metadata of a file. create table filemetadata ( id varchar(20) primary key, filename varchar(50), path varchar(200), size varchar(10), author

Re: Indexing database in Solr using Data Import Handler

2013-07-11 Thread Gora Mohanty
On 11 July 2013 11:13, archit2112 archit2...@gmail.com wrote: I'm trying to index a MySQL database using the Data Import Handler in Solr. [...] Everything is working, but the favouritedby1 field is not getting indexed, i.e., that field does not exist when I run the *:* query. Can you please help me

Performance of cross join vs block join

2013-07-11 Thread mihaela olteanu
Hello, Does anyone know of any performance measurements for cross joins compared to joins inside a single index? Is the join faster inside a single index that stores all documents of various types (from parent table or from children tables) with a discriminator field

Re: Performance of cross join vs block join

2013-07-11 Thread Mikhail Khludnev
Mihaela, For me it's reasonable that a single-core join takes the same time as a cross-core one. I just can't see what gain can be obtained in the former case. I'm hardly able to comment on the join code; I looked into it, and it's not trivial, at least. With block join it doesn't need to obtain parentId term

request to be added as a wiki contributor

2013-07-11 Thread Andrew MacKinlay
Hi, My wiki username is AndyMacKinlay. Can I please be added to the ContributorsGroup? Thanks, Andy

Term component regex to remove stopwords

2013-07-11 Thread shruti suri
Hi, Can the TermsComponent parameter terms.regex be used to ignore stop words? Regards, Shruti

Problem using Term Component in solr

2013-07-11 Thread Parul Gupta(Knimbus)
Hi All, I am using the *Term component* in Solr for searching titles by short form, using the wildcard characters (.*) and [a-z0-9]*. I am using the *Term Component* specifically because wildcard characters are not working in *select?q=* query search. Examples of some *titles* are: 1)Medicine, Health Care

Re: How to make 'fq' optional?

2013-07-11 Thread Mikhail Khludnev
https://lucene.apache.org/solr/4_2_0/solr-core/org/apache/solr/search/SwitchQParserPlugin.html Hoss cares about you! On Wed, Jul 10, 2013 at 10:40 PM, Learner bbar...@gmail.com wrote: I am trying to make a variable in fq optional, Ex: /select?first_name=peter&fq=$first_name&q=*:* I don't
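A minimal sketch of how the switch query parser linked above might be used to make the filter optional, written in Python with only the standard library. The field `first_name` comes from the question; the helper name `build_select_params` and the auxiliary parameter `fq_first_name` are assumptions made for illustration. The idea is that a blank or missing `first_name` matches the bare `case` and falls back to `*:*`, while any other value falls through to `default`.

```python
from urllib.parse import urlencode


def build_select_params(first_name=None):
    """Build /select parameters where the first_name filter is optional.

    Hypothetical helper: fq_first_name is an illustrative parameter name,
    not something defined by Solr itself.
    """
    params = [
        ("q", "*:*"),
        # Bare `case` matches a blank or missing value; `default` handles the rest.
        ("fq", "{!switch case='*:*' default=$fq_first_name v=$first_name}"),
        # Only consulted when first_name is non-blank.
        ("fq_first_name", "{!field f=first_name v=$first_name}"),
    ]
    if first_name:
        params.append(("first_name", first_name))
    return params


if __name__ == "__main__":
    print("/select?" + urlencode(build_select_params("peter")))
    print("/select?" + urlencode(build_select_params()))
```

The same request template then works whether or not the caller supplies `first_name`, which seems to be what the original poster was after.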

Re: Performance of cross join vs block join

2013-07-11 Thread mihaela olteanu
In my current use case I have 4 tables with a one-to-many relationship between them (one is the parent and the rest are the children) and I have created a separate Solr core for each table. Now I have a request to return all those parents that match certain criteria or one of its children

Re: amount of values in a multi value field - is denormalization always the best option?

2013-07-11 Thread Flavio Pompermaier
I also have a similar scenario, where fundamentally I have to retrieve all urls where a userid has been found. So, in my schema, I designed the url as (string) key and a (possibly huge) list of attributes automatically mapped to strings. For example: Url1 (key): - language: en - content:userid1

Re: request to be added as a wiki contributor

2013-07-11 Thread Erick Erickson
Done. On Wed, Jul 10, 2013 at 10:25 PM, Andrew MacKinlay admac...@gmail.com wrote: Hi, My wiki username is AndyMacKinlay. Can I please be added to the ContributorsGroup? Thanks, Andy

Applying Sum on Field

2013-07-11 Thread Jamshaid Ashraf
Hi, I'm a new Solr user. I wanted to know whether there is any way to apply a sum on a field in the result documents of a group query. Following is the query and its result set; I want to apply a sum on the 'price' field, grouping on type: *Sample input:* <doc> <str name="id">3</str> <str name="type">Caffe</str> <str

Re: Commit different database rows to solr with same id value?

2013-07-11 Thread Erick Erickson
Just use the address in the url. You don't have to use the core name if the defaults are set, which is usually collection1. So it's something like http://host:port/solr/core2/update? blah blah blah Erick On Wed, Jul 10, 2013 at 4:17 PM, Jason Huang jason.hu...@icare.com wrote: Thanks David.
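As a rough illustration of the addressing scheme described in this reply, here is a small Python sketch that builds the update endpoint with or without a core name. The function name `update_url` and the host and port values are assumptions; `core2` is the core named in the message.

```python
def update_url(host, port, core=None):
    """Return the update endpoint, naming the core only when one is given."""
    base = f"http://{host}:{port}/solr"
    if core:
        base += f"/{core}"
    return base + "/update"


if __name__ == "__main__":
    # Default core (no core name in the path).
    print(update_url("localhost", 8983))
    # Explicit core, as in the reply above.
    print(update_url("localhost", 8983, "core2"))
```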

Re: Applying Sum on Field

2013-07-11 Thread Peter Sturge
Hi, If you mean adding up numeric values stored in fields - no, Solr doesn't do this by default. We had a similar requirement for this, and created a custom SearchComponent to handle sum, average, stats etc. There are a number of things you need to bear in mind, such as: * Handling errors when

Re: Commit different database rows to solr with same id value?

2013-07-11 Thread Jason Huang
cool. so far I've been using the default collection 1 only. thanks, Jason On Thu, Jul 11, 2013 at 7:57 AM, Erick Erickson erickerick...@gmail.com wrote: Just use the address in the url. You don't have to use the core name if the defaults are set, which is usually collection1. So it's

Solr caching clarifications

2013-07-11 Thread Manuel Le Normand
Hello, As a result of frequent Java OOM exceptions, I am trying to investigate the Solr JVM heap usage more closely. Please correct me if I am mistaken; this is my understanding of heap usage (per replica on a Solr instance): 1. Buffers for indexing - bounded by ramBufferSize 2. Solr

solr 4.3.0 cloud in Tomcat, link many collections to Zookeeper

2013-07-11 Thread Zhang, Lisheng
Hi, We are testing solr 4.3.0 in Tomcat (considering upgrading solr 3.6.1 to 4.3.0), in WIKI page for solrCloud in Tomcat: http://wiki.apache.org/solr/SolrCloudTomcat we need to link each collection explicitly: /// 8) Link uploaded config with target collection java -classpath

What happens in indexing request in solr cloud if Zookeepers are all dead?

2013-07-11 Thread Zhang, Lisheng
Hi, The latest SolrCloud doc mentions that if all Zookeepers are dead, distributed queries still work because Solr remembers the cluster state. How about indexing request handling if all Zookeepers are dead: does Solr need Zookeeper to know which box is master and which is slave for

Solr 4.3.0 memory usage is higher than solr 3.6.1?

2013-07-11 Thread Zhang, Lisheng
Hi, We are testing solr 4.3.0 in Tomcat (considering upgrading solr 3.6.1 to 4.3.0), we have many cores (a few thousands). We have noticed solr 4.3.0 memory usage is much higher than solr 3.6.1 (without using solr cloud yet). With 2K cores, solr 3.6.1 is using 1.5G, but solr 4.3.0 is using

nested queries + joins performance

2013-07-11 Thread Marcelo Elias Del Valle
Hello, Continuing to have fun with joins, I finally figured out a way to make my joins work. Suppose I have inserted data as below, using SolrJ. If I want to select a parent (room) that has both: - a keyboard and a mouse - a monitor and a tablet In my data, below, only room2

Re: amount of values in a multi value field - is denormalization always the best option?

2013-07-11 Thread Marcelo Elias Del Valle
Hello Flavio, Out of curiosity, are you already using this in prod? Would you share your results / benchmarks with us? (not sure if you have some). I wonder how it is performing for you. I was thinking of using a very similar schema, compared to yours. The thing is: each option has

edismax behaviour with japanese

2013-07-11 Thread Shalom Ben-Zvi Kazaz
Hello, I have text and text_ja fields, where text uses English and text_ja uses Japanese analyzers; I index both with copyField from other fields. I'm trying to search both fields using edismax and the qf parameter, but I see strange behaviour from edismax. I wonder if someone can give me a hint as to what's

Re: amount of values in a multi value field - is denormalization always the best option?

2013-07-11 Thread Flavio Pompermaier
Yeah, you're probably right. I have to test different configurations! That is why I'd like to know the available solutions in advance. Fortunately I'm still developing, so I'm still in a position to investigate the solution. Obviously I'll do some benchmarking on it, but I should know the

How to boost relevance based on distance and age..

2013-07-11 Thread Vineel
Here is the structure of the Solr document: <doc> <str name="latlong">52.401790,4.936660</str> <date name="dateOfBirth">1993-12-09T00:00:00Z</date> </doc> I would like to search for documents based on the following weighted criteria: - distance 0-10miles weight 40 - distance 10miles

Too many documents, composite IndexReaders cannot exceed 2147483647

2013-07-11 Thread Manuel Ignacio Lopez
Hello everybody, somehow we managed to overload our Solr server 4.2.0 with too many documents (many of which are already deleted, but the index is not optimized). Now Solr cannot be started anymore, see full stack trace below. Caused by: java.lang.IllegalArgumentException: Too many documents,

SolrJ and initializing logger in solr 4.3?

2013-07-11 Thread Jonathan Rochkind
I am using SolrJ in a Java (actually jruby) project, with Solr 4.3. When I instantiate an HttpSolrServer, I get the dreaded: log4j:WARN No appenders could be found for logger (org.apache.solr.client.solrj.impl.HttpClientUtil). log4j:WARN Please initialize the log4j system properly. log4j:WARN

Re: SolrJ and initializing logger in solr 4.3?

2013-07-11 Thread Michael Della Bitta
Hi Jonathan, I think you just need some config on the classpath: http://logging.apache.org/log4j/1.2/manual.html#defaultInit Michael Della Bitta

Re: amount of values in a multi value field - is denormalization always the best option?

2013-07-11 Thread Jack Krupansky
Again, generally, if the number of values is relatively modest and you don't need to discriminate (tell which one matches on a search) and you don't edit the list, a multivalued field makes perfect sense, but if any of those requirements is not true, then you need to represent the items as

Re: What happens in indexing request in solr cloud if Zookeepers are all dead?

2013-07-11 Thread Jack Krupansky
There are no masters or slaves in SolrCloud - it is fully distributed and master-free. Leaders are temporary and can vary over time. The basic idea for quorum is to prevent split brain - two (or more) distinct sets of nodes (zookeeper nodes, that is) each thinking they constitute the
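To make the quorum idea concrete, here is a tiny Python sketch of a strict-majority check. It is an illustration of the general principle only, not ZooKeeper's actual implementation; the function name `has_quorum` is an assumption.

```python
def has_quorum(reachable, ensemble_size):
    """True when a strict majority of the ensemble is reachable."""
    return reachable > ensemble_size // 2


if __name__ == "__main__":
    # In a 3-node ensemble split 2/1, only the larger side keeps quorum.
    print(has_quorum(2, 3), has_quorum(1, 3))
    # In a 4-node ensemble split 2/2, neither side has quorum.
    print(has_quorum(2, 4))
```

Because two disjoint groups can never both hold a strict majority, at most one side of a partition can keep making decisions, which is how a majority rule avoids split brain.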

Re: Applying Sum on Field

2013-07-11 Thread Jack Krupansky
Take a look at the stats component that calculates aggregate values. It has a facet parameter that may or may not give you something similar to what you want. Or, just form a query that matches the results of the group, and then get the stats. See: http://wiki.apache.org/solr/StatsComponent
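A minimal Python sketch of the request parameters this reply seems to point at, assuming the `price` and `type` fields from the original "Applying Sum on Field" question. The helper name `stats_params` is hypothetical; `stats`, `stats.field`, and `stats.facet` are the StatsComponent parameters described on the wiki page linked above.

```python
from urllib.parse import urlencode


def stats_params(field, facet_field, query="*:*"):
    """Parameters asking the stats component for aggregates of `field`,
    broken down by the values of `facet_field`."""
    return [
        ("q", query),
        ("rows", "0"),  # only the aggregates are wanted, not the documents
        ("stats", "true"),
        ("stats.field", field),
        ("stats.facet", facet_field),
    ]


if __name__ == "__main__":
    print("/select?" + urlencode(stats_params("price", "type")))
```

The per-value sums would then be read from the stats section of the response rather than from the grouped documents.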

Thousands of cluster state change events per second from zookeeper

2013-07-11 Thread Sundararaju, Shankar
Hi, We have 3 search client nodes connected to a 12x2 Solr 4.2.1 cluster through CloudSolrServer. We are noticing thousands of such events being logged every second on these client nodes and filling up the logs quickly. Is there any known bug in Zookeeper or the SolrJ client that can cause this?

RE: What happens in indexing request in solr cloud if Zookeepers are all dead?

2013-07-11 Thread Zhang, Lisheng
Yes, I should not have used the words master/slave for SolrCloud! So if all Zookeepers are dead, could indexing requests be handled properly (could Solr remember the settings for indexing)? Thanks very much for the help, Lisheng -Original Message- From: Jack Krupansky

Re: Moving replica from node to node?

2013-07-11 Thread Mark Miller
Yeah, though CREATE and UNLOAD end up being kind of funny descriptors. You'd think LOAD and UNLOAD or CREATE and DELETE or something... On Wed, Jul 10, 2013 at 11:35 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Thanks Mark. I assume you are referring to using the Core Admin API -

What does too many merges...stalling in indexwriter log mean?

2013-07-11 Thread Tom Burton-West
Hello, We are seeing the message too many merges...stalling in our indexwriter log. Is this something to be concerned about? Does it mean we need to tune something in our indexing configuration? Tom

Leader Election, when?

2013-07-11 Thread aabreur
I have a working Zookeeper ensemble running with 3 instances and also a solrcloud cluster with some solr instances. I've created a collection with settings to 2 shards. Then i: create 1 core on instance1 create 1 core on instance2 create 1 core on instance1 create 1 core on instance2 Just to

Re: Moving replica from node to node?

2013-07-11 Thread Alan Woodward
And CREATE and UNLOAD are almost exactly the wrong descriptors, because CREATE loads up a core that's already there, and UNLOAD can in fact delete it from the filesystem… Alan Woodward www.flax.co.uk On 11 Jul 2013, at 20:15, Mark Miller wrote: Yeah, though CREATE and UNLOAD end up being

SolrJ 4.3 to Solr 1.4

2013-07-11 Thread Jonathan Rochkind
So, trying to use a SolrJ 4.3 to talk to an old Solr 1.4. Specifically to add documents. The wiki at http://wiki.apache.org/solr/Solrj suggests, I think, that this should work, so long as you: server.setParser(new XMLResponseParser()); However, when I do this, I still get a

Re: What happens in indexing request in solr cloud if Zookeepers are all dead?

2013-07-11 Thread Jack Krupansky
Sorry, no updates if no Zookeepers. There would be no way to assure that any node knows the proper configuration. Queries are a little safer using most recent configuration without zookeeper, but update consistency requires accurate configuration information. -- Jack Krupansky -Original

Re: SolrJ 4.3 to Solr 1.4

2013-07-11 Thread Chris Hostetter
: However, when I do this, I still get a org.apache.solr.common.SolrException: : parsing error from : org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:143) it's impossible to guess what the underlying problem might be unless you can provide us the full

Re: Partial Matching in both query and field

2013-07-11 Thread James Bathgate
Jack, This still isn't working. I just upgraded to 3.6.2 to verify that wasn't the issue. Here's query information: <lst name="params"> <str name="debugQuery">on</str> <str name="indent">on</str> <str name="start">0</str> <str name="q">0_extrafield1_n:20454</str> <str name="rows">10</str> <str name="version">2.2</str> </lst> </lst>

Re: SolrJ 4.3 to Solr 1.4

2013-07-11 Thread Jonathan Rochkind
Huh, that might have been a false problem of some kind. At the moment, it looks like I _do_ have my SolrJ 4.3 successfully talking to a Solr 1.4, so long as I setParser(new XMLResponseParser()). Not sure what I changed or what wasn't working before, but great! So never mind. Although if anyone

Re: Partial Matching in both query and field

2013-07-11 Thread James Bathgate
I just noticed I pasted the wrong fieldType with the extra tokenizer not commented out. <fieldType name="ngram" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="false"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter

Re: Partial Matching in both query and field

2013-07-11 Thread Jack Krupansky
A couple of possibilities: 1. Make sure to reload the core. 2. Check that the Solr schema version is new enough to recognize autoGeneratePhraseQueries. 3. What query parser are you using? -- Jack Krupansky -Original Message- From: James Bathgate Sent: Thursday, July 11, 2013 5:26

Re: What does too many merges...stalling in indexwriter log mean?

2013-07-11 Thread Shawn Heisey
On 7/11/2013 1:47 PM, Tom Burton-West wrote: We are seeing the message too many merges...stalling in our indexwriter log. Is this something to be concerned about? Does it mean we need to tune something in our indexing configuration? It sounds like you've run into the maximum number of

Re: SolrJ 4.3 to Solr 1.4

2013-07-11 Thread Shawn Heisey
On 7/11/2013 2:24 PM, Jonathan Rochkind wrote: (If I _don't_ setParser to XML, and use the binary parser... I get a fully expected error about binary format corruption -- that part is expected and I understand it, that's why you have to use the XMLResponseParser instead). Am I not doing enough

Re: Partial Matching in both query and field

2013-07-11 Thread James Bathgate
1. My general process for a schema change (I know it's overkill) is to delete the data directory, reload, index data, reload again. 2. I'm using schema version 1.5 on Solr 3.6.2. <schema name="SearchSpringDefault" version="1.5"> 3. LuceneQParser, but I've also tried dismax and edismax. Here's my

How to set a condition over stats result

2013-07-11 Thread Matt Lieber
Hello, I am trying to see how I can test the sum of values of an attribute across docs. I.e. Whether sum(myfieldvalue)100 . I know I can use the stats module which compiles the sum of my attributes on a certain facet , but how can I perform a test this result (i.e. Is sum100) within my stats

POST question

2013-07-11 Thread John Randall
I want to use a browser and use HTTP POST to add a single document (not a file)  to Solr. I don't want to use cURL. I've made several attempts, such as the following:   http://localhost:8080/solr/update?commit=truestream.type=text/xml;adddocfield   name=id61234567/fieldfield name=titleWAR OF THE

RE: POST question

2013-07-11 Thread Roland Villemoes
Hi John, You can't make a browser do an HTTP POST by entering a URL in the address bar; that performs an HTTP GET. So use curl, or write a small application that does the HTTP POST. Or even better: use a browser plugin. Several of these exist. Example: DEV HTTP CLIENT extension for Chrome.
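A sketch of the "small application" option in Python, using only the standard library. It builds an HTTP POST carrying a single `<add><doc>` XML document to the update URL from the question. The helper name `build_add_request` and the title value are assumptions for illustration (the original title is truncated in the archive); the id value is taken from the question.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_request(update_url, fields):
    """Build (but do not send) a POST request that adds one document."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    body = ET.tostring(add, encoding="utf-8")
    return urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_add_request(
        "http://localhost:8080/solr/update?commit=true",
        {"id": "61234567", "title": "example title"},  # title is a placeholder
    )
    print(req.get_method(), req.full_url)
    # Sending requires a running Solr instance:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```

Keeping request construction separate from sending makes it easy to inspect the body before anything reaches the server.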

Re: POST question

2013-07-11 Thread Shawn Heisey
On 7/11/2013 4:12 PM, John Randall wrote: I want to use a browser and use HTTP POST to add a single document (not a file) to Solr. I don't want to use cURL. I've made several attempts, such as the following: http://localhost:8080/solr/update?commit=truestream.type=text/xml;adddocfield

Re: POST question

2013-07-11 Thread John Randall
I'll try the plugin. Thanks. From: Roland Villemoes r...@alpha-solutions.dk To: solr-user@lucene.apache.org solr-user@lucene.apache.org; John Randall jmr...@yahoo.com Sent: Thursday, July 11, 2013 6:21 PM Subject: RE: POST question Hi John, You can't make a

Re: POST question

2013-07-11 Thread John Randall
I'll probably move to Solr 4.x, so I'm going to try a plugin instead. Thanks for your insights. From: Shawn Heisey s...@elyograg.org To: solr-user@lucene.apache.org Sent: Thursday, July 11, 2013 6:28 PM Subject: Re: POST question On 7/11/2013 4:12 PM, John

preferred container for running SolrCloud

2013-07-11 Thread Ali, Saqib
1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?

Re: preferred container for running SolrCloud

2013-07-11 Thread Saikat Kanjilal
We're running under jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib docbook@gmail.com wrote: 1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?

Re: preferred container for running SolrCloud

2013-07-11 Thread Ali, Saqib
With the embedded Zookeeper or separate Zookeeper? Also, have you run into any issues with running SolrCloud on jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat Kanjilal sxk1...@hotmail.com wrote: We're running under jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib

Re: preferred container for running SolrCloud

2013-07-11 Thread Anshum Gupta
On production, I'd highly recommend you to run Zk separately as that'd give you, among other things, the liberty of shutting down a SolrCloud instance. I haven't heard or seen any SolrCloud issues while running it on jetty. On Fri, Jul 12, 2013 at 7:57 AM, Ali, Saqib docbook@gmail.com wrote:

Re: preferred container for running SolrCloud

2013-07-11 Thread Walter Underwood
Embedded Zookeeper is only for dev. Production needs to run a ZK cluster. --wunder On Jul 11, 2013, at 7:27 PM, Ali, Saqib wrote: With the embedded Zookeeper or separate Zookeeper? Also, have you run into any issues with running SolrCloud on jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat

Re: preferred container for running SolrCloud

2013-07-11 Thread Ali, Saqib
Thanks Walter. And the container.. On Thu, Jul 11, 2013 at 7:55 PM, Walter Underwood wun...@wunderwood.org wrote: Embedded Zookeeper is only for dev. Production needs to run a ZK cluster. --wunder On Jul 11, 2013, at 7:27 PM, Ali, Saqib wrote: With the embedded Zookeeper or

Re: preferred container for running SolrCloud

2013-07-11 Thread Walter Underwood
We use Tomcat for everything. It might not be the best, but it is what our Ops group is used to. wunder On Jul 11, 2013, at 7:58 PM, Ali, Saqib wrote: Thanks Walter. And the container.. On Thu, Jul 11, 2013 at 7:55 PM, Walter Underwood wun...@wunderwood.org wrote: Embedded

RE: preferred container for running SolrCloud

2013-07-11 Thread Saikat Kanjilal
Separate Zookeeper. Date: Thu, 11 Jul 2013 19:27:18 -0700 Subject: Re: preferred container for running SolrCloud From: docbook@gmail.com To: solr-user@lucene.apache.org With the embedded Zookeeper or separate Zookeeper? Also, have you run into any issues with running SolrCloud on jetty?

RE: preferred container for running SolrCloud

2013-07-11 Thread Saikat Kanjilal
One last thing, no issues with jetty. The issues we did have were actually with running separate zookeeper clusters. From: sxk1...@hotmail.com To: solr-user@lucene.apache.org Subject: RE: preferred container for running SolrCloud Date: Thu, 11 Jul 2013 20:13:27 -0700 Separate Zookeeper.

Re: How to set a condition over stats result

2013-07-11 Thread Jack Krupansky
None that I know of, short of writing a custom search component. Seriously, you could hack up a copy of the stats component with your own logic. Actually... this may be a case for the new, proposed Script Request Handler, which would let you execute a query and then you could do any custom

Re: How to set a condition over stats result

2013-07-11 Thread mihaela olteanu
What if you perform sub(sum(myfieldvalue),100) 0 using frange? From: Jack Krupansky j...@basetechnology.com To: solr-user@lucene.apache.org Sent: Friday, July 12, 2013 7:44 AM Subject: Re: How to set a condition over stats result None that I know of, short

How to set a condition on the number of docs found

2013-07-11 Thread Matt Lieber
Hello there, I would like to be able to know whether I got over a certain threshold of doc results, i.e. test (Result.numFound > 10) -> true. Is there a way to do this? I can't seem to find how (other than having to do this test in the client app, which is not great). Thanks, Matt

Re: Usage of CloudSolrServer?

2013-07-11 Thread sathish_ix
Hi, I am using CloudSolrServer to connect to SolrCloud; I'm indexing the documents through the SolrJ API using a CloudSolrServer object. Indexing is triggered on the master node of a collection, whereas if I need to find the status of the loading, it returns the message from a replica where the status is null. How to