Facets for fields in subdocuments with block join, is it possible?

2014-02-11 Thread Henning Ivan Solberg
Hello, I'm testing block join in Solr 4.6.1 and wondering: is it possible to get facets for fields in subdocuments, with the number of hits based on ROOT documents? See example below: <doc> <documentPart>ROOT</documentPart> <text>testing 123</text> <title>title</title> <group>GRP</group>

Re: Group.Facet issue in Sharded Solr Setup

2014-02-11 Thread rks_lucene
Quick follow-up on my question below: is anyone using Group.facets in a sharded Solr setup? Based on further testing, the group.facets counts don't seem reliable at all for less popular items in the facet list. -- View this message in context:

Re: Facets for fields in subdocuments with block join, is it possible?

2014-02-11 Thread Mikhail Khludnev
Hello Henning, There is no open source facet component for the child level of block-join. There is not even an open JIRA issue for this. Don't think it helps. On 11.02.2014 12:22, Henning Ivan Solberg h...@lovdata.no wrote: Hello, I'm testing block join in solr 4.6.1 and wondering, is it

Set up embedded Solr container and cores programmatically to read their configs from the classpath

2014-02-11 Thread Robert Krüger
Hi, I have an application with an embedded Solr instance (and I want to keep it embedded) and so far I have been setting up my Solr installation programmatically using folder paths to specify where the specific container or core configs are. I have used the CoreContainer methods createAndLoad

How to Learn Linked Configuration for SolrCloud at Zookeeper

2014-02-11 Thread Furkan KAMACI
Hi; I've written code that can upload a file to ZooKeeper for SolrCloud. Currently I have many configurations in ZooKeeper for SolrCloud. I want to update the synonyms.txt file, so I need to know the currently linked configuration (I will update the synonyms.txt file under appropriate configuration

Re: How to Learn Linked Configuration for SolrCloud at Zookeeper

2014-02-11 Thread Alan Woodward
For a particular collection or core? There should be a collection.configName property specified for the core or collection which tells you which ZK config directory is being used. Alan Woodward www.flax.co.uk On 11 Feb 2014, at 11:49, Furkan KAMACI wrote: Hi; I've written a code that I

Re: How to Learn Linked Configuration for SolrCloud at Zookeeper

2014-02-11 Thread Furkan KAMACI
I am looking for it for a particular collection. 2014-02-11 13:55 GMT+02:00 Alan Woodward a...@flax.co.uk: For a particular collection or core? There should be a collection.configName property specified for the core or collection which tells you which ZK config directory is being used. Alan

Re: How to Learn Linked Configuration for SolrCloud at Zookeeper

2014-02-11 Thread Furkan KAMACI
Hi; OK, I've checked the source code and implemented that: public String readConfigName(SolrZkClient zkClient, String collection) throws KeeperException, InterruptedException { String configName = null; String path = ZkStateReader.COLLECTIONS_ZKNODE + "/" + collection;

Re: Lowering query time

2014-02-11 Thread Joel Cohen
I'd like to thank you for lending a hand on my query time problem with SolrCloud. By switching to a single shard with replicas setup, I've reduced my query time to 18 msec. My full ingestion of 300k+ documents went down from 2 hours 50 minutes to 1 hour 40 minutes. There are some code changes that

Urgent Help. Best Way to have multiple OR Conditions for same field in SOLR

2014-02-11 Thread rajeev.nadgauda
Hi, I am new to Solr. We have CRM data for contacts and companies, numbering in the millions, and we have switched to Solr for fast search results. PROBLEM: We have large inclusion and exclusion lists with names of companies or contacts. Ex: Include or Exclude: Company A, Company B, Company C

solr-query with NOT and OR operator

2014-02-11 Thread Johannes Siegert
Hi, my Solr request contains the following filter query: fq=((-(field1:value1)))+OR+(field2:value2). I expect Solr to deliver documents matching ((-(field1:value1))) and documents matching (field2:value2). But Solr delivers only the documents that are the result of (field2:value2). I

Re: solr-query with NOT and OR operator

2014-02-11 Thread Mikhail Khludnev
http://wiki.apache.org/solr/CommonQueryParameters#debugQuery and http://wiki.apache.org/solr/CommonQueryParameters#explainOther usually help so much On Tue, Feb 11, 2014 at 7:57 PM, Johannes Siegert johannes.sieg...@marktjagd.de wrote: Hi, my solr-request contains the following

Re: Tf-Idf for a specific query

2014-02-11 Thread David Miller
Hi Erick, Slower queries for getting facets can be tolerated, as long as they don't affect those without facets. The requirement is for a separate query which can get me both term vector and facet counts. One issue I am facing is that, for a search query I only want the term vectors and facet

Re: solr-query with NOT and OR operator

2014-02-11 Thread Jack Krupansky
With so many parentheses in there, I wonder what you are really trying to do. Try expressing your query in simple English first so that we can understand your goal. But generally, a purely negative nested query must have a *:* term to apply the exclusion against: fq=((*:*

Re: Lowering query time

2014-02-11 Thread Erick Erickson
Hmmm, I'm still a little puzzled BTW. 300K documents, unless they're huge, shouldn't be taking 100 minutes. I can index 11M documents on my laptop (Wikipedia dump) in 45 minutes, for instance. Of course that's a single core, not cloud and not replicas... So possibly it's on the data acquisition

Re: Urgent Help. Best Way to have multiple OR Conditions for same field in SOLR

2014-02-11 Thread Erick Erickson
right, 10K Boolean clauses are not very efficient. You actually can up the limit here, but still... Consider a post filter, here's a place to start: http://lucene.apache.org/solr/4_3_1/solr-core/org/apache/solr/search/PostFilter.html Best, Erick On Tue, Feb 11, 2014 at 6:47 AM, rajeev.nadgauda

Re: solr-query with NOT and OR operator

2014-02-11 Thread Erick Erickson
Solr/Lucene is not strictly Boolean logic; this trips up a lot of people. Excellent blog on the subject here: http://searchhub.org/dev/2011/12/28/why-not-and-or-and-not/ Best, Erick On Tue, Feb 11, 2014 at 8:22 AM, Jack Krupansky j...@basetechnology.com wrote: With so many parentheses in

Re: Is 'optimize' necessary for a 45-segment Solr 4.6 index?

2014-02-11 Thread Shawn Heisey
On 2/11/2014 3:27 AM, Jäkel, Guido wrote: Dear Shawn, On 2/9/2014 11:41 PM, Arun Rangarajan wrote: I have a 28 GB Solr 4.6 index with 45 segments. Optimize failed with an 'out of memory' error. Is optimize really necessary, since I read that lucene is able to handle multiple segments well

Re: solr-query with NOT and OR operator

2014-02-11 Thread Johannes Siegert
Hi Jack, thanks! fq=((*:* -(field1:value1)))+OR+(field2:value2). This is the solution. Johannes On 11.02.2014 17:22, Jack Krupansky wrote: With so many parentheses in there, I wonder what you are really trying to do. Try expressing your query in simple English first so that we can
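
The fix confirmed in this thread can be sketched in Java as a small string helper. This is only an illustration: the class and method names (NegativeFilterQuery, excludeFromAll, or) are hypothetical and are not part of Solr or SolrJ. Only the *:* match-all prefix and the field/value pairs come from the thread.

```java
// Hypothetical helper, not a Solr/SolrJ API: builds a filter query in which a
// purely negative clause is anchored to the match-all query *:*.
final class NegativeFilterQuery {

    // Wraps a clause so that it reads "all documents except those matching clause".
    static String excludeFromAll(String clause) {
        return "(*:* -(" + clause + "))";
    }

    // Joins two sub-queries with OR.
    static String or(String left, String right) {
        return left + " OR " + right;
    }

    public static void main(String[] args) {
        // Field names and values are the placeholders used in the thread.
        String fq = or(excludeFromAll("field1:value1"), "(field2:value2)");
        System.out.println(fq); // (*:* -(field1:value1)) OR (field2:value2)
    }
}
```

Without the *:* prefix the nested clause has no document set to subtract from, which is consistent with the behaviour reported in the original question.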

Re: Is 'optimize' necessary for a 45-segment Solr 4.6 index?

2014-02-11 Thread Arun Rangarajan
Dear Shawn, Thanks for your reply. For now, I did merges in steps with the maxSegments param (using HOST:PORT/CORE/update?optimize=true&maxSegments=10). First I merged the 45 segments to 10, and then from 10 to 5. (Merging from 5 to 2 again caused an out-of-memory exception.) Now I have a 5-segment index
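
The stepwise merge described above could be scripted roughly as follows. This is a sketch under assumptions: the class StepwiseOptimize, its methods, and the base URL http://localhost:8983/solr/core1 are hypothetical; only the optimize=true and maxSegments parameters and the 10-then-5 sequence come from the message. The code only builds the URLs and does not send any request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the update URLs for merging an index down in steps.
final class StepwiseOptimize {

    // One optimize request that merges down to at most maxSegments segments.
    static String optimizeUrl(String coreUrl, int maxSegments) {
        return coreUrl + "/update?optimize=true&maxSegments=" + maxSegments;
    }

    // One URL per target, in the order the merges should be issued.
    static List<String> plan(String coreUrl, int... targets) {
        List<String> urls = new ArrayList<>();
        for (int target : targets) {
            urls.add(optimizeUrl(coreUrl, target));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Assumed host, port and core name; replace with real values.
        for (String url : plan("http://localhost:8983/solr/core1", 10, 5)) {
            System.out.println(url);
        }
    }
}
```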

handleSelect=true with SolrCloud

2014-02-11 Thread Jeff Wartes
I’m working on a port of a Solr service to SolrCloud. (Targeting v4.6.0 at present.) The old query style relied on using /solr/select?qt=foo to select the proper requestHandler. I know handleSelect=true is deprecated now, but it’d be pretty handy for testing to be able to be backwards

boost group doclist members

2014-02-11 Thread David Santamauro
Without falling into the x/y problem area, I'll explain what I want to do: I would like to group my result set by a field, f1 and within each group, I'd like to boost the score of the most appropriate member of the group so it appears first in the doc list. The most appropriate member is

Re: handleSelect=true with SolrCloud

2014-02-11 Thread Shawn Heisey
On 2/11/2014 10:21 AM, Jeff Wartes wrote: I’m working on a port of a Solr service to SolrCloud. (Targeting v4.6.0 at present.) The old query style relied on using /solr/select?qt=foo to select the proper requestHandler. I know handleSelect=true is deprecated now, but it’d be pretty handy for

Re: handleSelect=true with SolrCloud

2014-02-11 Thread Jeff Wartes
Got it in one. Thanks! On 2/11/14, 9:50 AM, Shawn Heisey s...@elyograg.org wrote: On 2/11/2014 10:21 AM, Jeff Wartes wrote: I'm working on a port of a Solr service to SolrCloud. (Targeting v4.6.0 at present.) The old query style relied on using /solr/select?qt=foo to select the proper

Re: USER NAME Baruch Labunski

2014-02-11 Thread Baruch
Hello Wiki admin,  I would like to some value links. Can you please add me, my user name is Baruch Labunski Thank You, Baruch! On Thursday, January 16, 2014 2:12:32 PM, Baruch bar...@rogers.com wrote: Hello Wiki admin,  I would like to some value links. Can you please add me, my user

Re: Lowering query time

2014-02-11 Thread Joel Cohen
It's a custom ingestion process. It does a big DB query and then inserts stuff in batches. The batch size is tuneable. On Tue, Feb 11, 2014 at 11:23 AM, Erick Erickson erickerick...@gmail.com wrote: Hmmm, I'm still a little puzzled BTW. 300K documents, unless they're huge, shouldn't be taking

Re: handleSelect=true with SolrCloud

2014-02-11 Thread Joel Bernstein
Jeff, I believe the shards.qt parameter is what you're looking for. For example, when using the /elevate handler with SolrCloud I use the following URL to tell Solr to use the /elevate handler on the shards: http://localhost:8983/solr/collection1/elevate?q=ipod&wt=json&indent=true&shards.qt=/elevate

RE: handleSelect=true with SolrCloud

2014-02-11 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi Jeff, it is not with elevated, I am talking in the link of Relevancy / Boost/ Score. Select productid from products where SKU = 101 Select Productid from products where ManufactureSKU = 101 Select Productid from product where SKU Like 101% Select Productid from Product where ManufactureSKU

Solr Autosuggest - Strange issue with leading numbers in query

2014-02-11 Thread Developer
I have a strange issue with Autosuggest. Whenever I query for a keyword with leading numbers, it returns the suggestion corresponding to the letters (ignoring the numbers). I was under the assumption that it would return an empty result. I am not sure what I am doing wrong. Can someone

Re: Indexing question on individual field update

2014-02-11 Thread shamik
Eric, Thanks for your reply. I should have given a better context. I'm currently running an incremental crawl daily on this particular source and indexing the documents. Incremental crawl looks for any change since last crawl date based on the document publish date. But, there's no way for me

RE: Solr server requirements for 100+ million documents

2014-02-11 Thread Susheel Kumar
Hi Otis, Just to confirm, the 3 servers you mean here are 2 for shards/nodes and 1 for Zookeeper. Is that correct? Thanks, Susheel -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Friday, January 24, 2014 5:21 PM To: solr-user@lucene.apache.org

Re: Solr server requirements for 100+ million documents

2014-02-11 Thread Otis Gospodnetic
Hi Susheel, No, we wouldn't want to go with just 1 ZK. :) Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Tue, Feb 11, 2014 at 5:18 PM, Susheel Kumar susheel.ku...@thedigitalgroup.net wrote: Hi Otis, Just to confirm,

RE: Solr server requirements for 100+ million documents

2014-02-11 Thread Susheel Kumar
Thanks, Otis for quick reply. So for ZK do you recommend separate servers and if so how many for initial Solr cloud cluster setup. -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Tuesday, February 11, 2014 4:21 PM To: solr-user@lucene.apache.org

Re: Indexing question on individual field update

2014-02-11 Thread shamik
Ok, I was wrong here. I can always set the indextimestamp field with current time (NOW) for every atomic update. On a similar note, is there any performance constraint with updates compared to add ? -- View this message in context:

Re: Solr server requirements for 100+ million documents

2014-02-11 Thread svante karlsson
ZK needs a quorum to stay functional, so 3 servers can handle one failure and 5 can handle 2 node failures. If you run Solr with 1 replica per shard, then stick to 3 ZK. If you use 2 replicas, use 5 ZK
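
The quorum arithmetic behind this advice can be checked with a few lines of Java. This is a minimal sketch: the class and method names are hypothetical, and the only assumption is the usual majority rule, under which an ensemble of n servers needs floor(n/2) + 1 of them alive.

```java
// Hypothetical helper illustrating majority-quorum arithmetic for a ZooKeeper ensemble.
final class ZkQuorum {

    // Smallest number of servers that forms a majority of the ensemble.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // Number of servers that can fail while a majority remains.
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " servers: quorum " + quorumSize(n)
                    + ", tolerates " + toleratedFailures(n) + " failure(s)");
        }
    }
}
```

Under this rule 3 servers tolerate 1 failure and 5 tolerate 2, matching the figures above, while 4 servers tolerate no more failures than 3 do, which is why odd ensemble sizes are the usual choice.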

Replica node down but zookeeper clusterstate not updated

2014-02-11 Thread Gopal Patwa
Solr = 4.6.1 (SolrCloud admin console view attached), ZooKeeper 3.4.5 = 3-node ensemble. In my test setup, I have a 3-node SolrCloud setup with 2 shards. Today we had a power failure and all nodes went down. I started the 3-node ZooKeeper ensemble first, followed by the 3-node SolrCloud, and one of

Re: Indexing question on individual field update

2014-02-11 Thread Shawn Heisey
On 2/11/2014 2:37 PM, shamik wrote: Eric, Thanks for your reply. I should have given a better context. I'm currently running an incremental crawl daily on this particular source and indexing the documents. Incremental crawl looks for any change since last crawl date based on the document

Re: Solr server requirements for 100+ million documents

2014-02-11 Thread Jason Hellman
Whether you use the same machines as Solr or separate machines is a matter suited to taste. If you are the CTO, then you should make this decision. If not, inform management that risk conditions are greater when you share function and control on a single piece of hardware. A single failure

Re: Solr server requirements for 100+ million documents

2014-02-11 Thread Shawn Heisey
On 2/11/2014 3:28 PM, Susheel Kumar wrote: Thanks, Otis for quick reply. So for ZK do you recommend separate servers and if so how many for initial Solr cloud cluster setup. In a minimal 3-server setup, all servers would run zookeeper and two of them would also run Solr. With this setup, you

Re: FuzzyLookupFactory with exactMatchFirst not giving the exact match.

2014-02-11 Thread Hamish Campbell
I've tried the new SuggestComponent, however it doesn't work quite as expected. It returns the full field value rather than a list of corrections for the specific term. I can see how SuggestComponent would be excellent for phrase suggestions and document lookups, but it doesn't seem to be suitable

Re: FuzzyLookupFactory with exactMatchFirst not giving the exact match.

2014-02-11 Thread Hamish Campbell
Ah, I think the term frequency is only available for the Spellcheckers rather than the Suggesters - so I tried a DirectSolrSpellChecker. This gave me good spelling suggestions for misspelt terms, but if the term is spelled correctly I, again, get no term information and correctlySpelled is false.

Solr performance on a very huge data set

2014-02-11 Thread neerajp
Hello, I have 1000 GB of data that I want to index. Assume I have enough space for storing the indexes on a single machine. I would like to get an idea of Solr performance when searching for an item in a huge data set. Do I need to use shards for improving the Solr search efficiency or it

Re: USER NAME Baruch Labunski

2014-02-11 Thread Erick Erickson
Baruch: Is that your Wiki ID? We need that. But sure, we'll be happy to add you to the list... On Tue, Feb 11, 2014 at 11:03 AM, Baruch bar...@rogers.com wrote: Hello Wiki admin, I would like to some value links. Can you please add me, my user name is Baruch Labunski Thank You,

Re: Lowering query time

2014-02-11 Thread Erick Erickson
So my guess is you're spending by far the largest portion of your time doing the DB query(ies), which makes sense On Tue, Feb 11, 2014 at 11:50 AM, Joel Cohen joel.co...@bluefly.com wrote: It's a custom ingestion process. It does a big DB query and then inserts stuff in batches. The batch

Re: Solr Autosuggest - Strange issue with leading numbers in query

2014-02-11 Thread Erick Erickson
Hmmm, the example you post seems correct to me; the returned suggestion is really close to the term. What are you expecting here? The example is inconsistent with "it returns the suggestion corresponding to the alphabets (ignoring the numbers)". It looks like it's considering the numbers just fine,

Re: Indexing question on individual field update

2014-02-11 Thread Erick Erickson
Update and add are basically the same thing if there's an existing document. There will be some performance consequence since you're getting the stored fields on the server as opposed to getting the full input from the external source and handing it to Solr. However, I know of at least one

Re: Solr performance on a very huge data set

2014-02-11 Thread Erick Erickson
Can't answer that, there are just too many variables. Here's a helpful resource: http://searchhub.org/dev/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/ Best, Erick On Tue, Feb 11, 2014 at 5:23 PM, neerajp neeraj_star2...@yahoo.com wrote: Hello Dear, I have

Re: Need feedback: Browsing and searching solr-user list emails

2014-02-11 Thread Alexandre Rafalovitch
Hi Durgam, You are asking a hard question. Yes, the idea looks interesting as an experiment. Possibly even useful in some ways. And I love the fact that you are eating your own dogfood (running Solr). And the interface looks nice (I guess this is your hosted Nimeyo offering underneath). Yet, I

Re: Join Scoring

2014-02-11 Thread David Smiley (@MITRE.org)
Hi Anand. Solr's JOIN query, {!join}, constant-scores. It's simpler, faster and more memory efficient (particularly the worst-case memory use) to implement the JOIN query without scoring, so that's why. Of course, you might want it to score and pay whatever penalty is involved. For that

Re: Join Scoring

2014-02-11 Thread anand chandak
Thanks David, really helpful response. You mentioned that if we have to add scoring support in solr then a possible approach would be to add a custom QueryParser, which might be taking Lucene's JOIN module. Curious, if it is possible instead to enhance existing solr's JoinQParserPlugin

Re: Spatial Score by overlap area

2014-02-11 Thread Smiley, David W.
Hi, BBoxStrategy is still only in "trunk" (not the 4x branch). And furthermore, the Solr portion, a FieldType, is over in Spatial-Solr-Sandbox: https://github.com/ryantxu/spatial-solr-sandbox/blob/master/LSE/src/main/java/org/apache/solr/spatial/pending/BBoxFieldType.java It should be quite

Unable to index mysql table

2014-02-11 Thread Tarun Sharma
Hi, I downloaded Solr and, without any changes to the directory structure, followed the Solr wiki and tried to import a MySQL table, but was unable to do so. I'm using the directory as-is in the example folder, but copied the contrib jar files and added lib tags where required. Please help in

Re: Unable to index mysql table

2014-02-11 Thread Alexandre Rafalovitch
What does "unable to do" actually translate to? Are you having trouble writing a particular config file? Are you getting an error message? Are you getting only some of the data in? Tell us exactly where you are stuck. Better, google first for exactly what you are stuck with, maybe it's already been

Re: Indexing question on individual field update

2014-02-11 Thread shamik
Thanks Eric and Shawn, appreciate your help. -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-question-on-individual-field-update-tp4116605p4116831.html Sent from the Solr - User mailing list archive at Nabble.com.