RE: Need help on Joining and sorting syntax and limitations between multiple documents in solr-4.4.0

2013-11-22 Thread Sukanta Dey
Hi Team, I am attaching all the required files we are using to get the VJOIN functionality, along with the actual requirement statement. I hope this helps you better understand the requirement for the VJOIN functionality. Thanks, Sukanta From: Sukanta Dey Sent: Wednesday, September 04, 2013

Few Clarification on Apache Solr front

2013-11-22 Thread topgun
We are planning to migrate a website from its proprietary CMS to Drupal. As they have been using a third-party enterprise search service (Endeca), we have proposed Apache Solr as a replacement. We are in the process of a proof of concept with Apache Solr. We would like to understand certain

Solrcloud: external fields and frequent commits

2013-11-22 Thread Flavio Pompermaier
Hi all, we're migrating from Solr 3.x to Solr 4.x to use SolrCloud and I have two big doubts: 1) External fields. When I compute such a file, do I have to copy it into the data directory of the shards? The external fields boost the results of the query to a specific collection, for me it doesn't

useColdSearcher in SolrCloud config

2013-11-22 Thread ade-b
Hi, The definition of the useColdSearcher config element in solrconfig.xml is: "If a search request comes in and there is no current registered searcher, then immediately register the still warming searcher and use it. If false then all requests will block until the first searcher is done warming." By

NullPointerException

2013-11-22 Thread Adrien RUFFIE
Hello all, I have performed a full indexing run with Solr, but when I try to perform an incremental indexing run I get the following exception (cf. attachment). Does anyone have an idea what the problem is? Many thanks. 23 oct. 2013 08:34:40 org.apache.solr.handler.dataimport.DataImporter doDeltaImport

Possible parent/child query bug

2013-11-22 Thread Neil Ireson
Not sure if this is a bug but, for me, it was unexpected behaviour. http://localhost:8090/solr/select?q={!child+of=doc_type:parent}*:* returns all the child docs, as expected, however http://localhost:8090/solr/select?q={!child+of=doc_type:parent} returns all the parent docs. This seems

Re: Split shard and stream sub-shards to remote nodes?

2013-11-22 Thread Shalin Shekhar Mangar
The splitting process is nothing but the creation of a bitset with which a LiveDocsReader is created. These readers are then added to a new index via the IW.addIndexes(IndexReader[] readers) method. All this is performed below the IR/IW API and no documents are actually ever read or written
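
To make the bitset idea above concrete, here is a minimal Python sketch. It is only an illustration of the concept, not Solr or Lucene code: the function names are hypothetical, and zlib.crc32 is a stand-in for whatever hash Solr actually uses to route documents. The point it tries to show is that each sub-shard is described by a bitset over the existing documents, and a filtered view is built from that bitset without copying the documents first.

```python
import zlib


def split_by_hash(doc_ids, num_parts=2):
    """Return one bitset (list of bools) per sub-shard.

    Hypothetical helper: crc32 stands in for the real routing hash.
    """
    span = 2 ** 32 // num_parts
    bitsets = [[False] * len(doc_ids) for _ in range(num_parts)]
    for pos, doc_id in enumerate(doc_ids):
        h = zlib.crc32(doc_id.encode("utf-8"))  # unsigned 32-bit value
        part = min(h // span, num_parts - 1)    # clamp the top of the range
        bitsets[part][pos] = True
    return bitsets


def live_view(doc_ids, bitset):
    """Return only the documents whose bit is set (a filtered 'reader')."""
    return [d for d, live in zip(doc_ids, bitset) if live]


docs = ["doc-1", "doc-2", "doc-3", "doc-4"]
for part, bits in enumerate(split_by_hash(docs)):
    print(part, live_view(docs, bits))
```

Every document lands in exactly one bitset, so the views together cover the original index with no overlap.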

Re: a function query of time, frequency and score.

2013-11-22 Thread Erick Erickson
Not quite sure what you're asking. The field() function query brings the value of a field into the score, something like: http://localhost:8983/solr/select?wt=json&fl=id%20score&q={!boost%20b=field(popularity)}ipod Best, Erick On Thu, Nov 21, 2013 at 10:43 PM, sling sling...@gmail.com wrote:
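
The request in this reply carries three parameters (wt, fl and q), and the q value contains braces and spaces that need URL encoding. A small Python sketch of building such a URL with the standard library follows; the helper name boost_query_url is hypothetical, and the host, field and query term are simply the ones from the message.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def boost_query_url(base, user_query, boost_field):
    """Build a select URL that multiplies the score by a field's value."""
    params = {
        "wt": "json",
        "fl": "id score",
        # {!boost b=field(...)} wraps the user query in a boost query parser
        "q": "{!boost b=field(%s)}%s" % (boost_field, user_query),
    }
    return base + "?" + urlencode(params)


url = boost_query_url("http://localhost:8983/solr/select", "ipod", "popularity")
print(url)
# Decoding the query string gives back the unencoded parameters.
print(parse_qs(urlsplit(url).query)["q"][0])
```

Letting urlencode do the escaping avoids hand-written %20 sequences and keeps the separators between parameters intact.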

Re: Few Clarification on Apache Solr front

2013-11-22 Thread Erick Erickson
1> indexing a few contents from a node. Well, you build the ingestion pipeline so it's up to the code you build. 2> It's all about analysis. When you build your schema, you determine how you need to treat your data and how you're searching on it and build the analysis chain for each field

Re: Solrcloud: external fields and frequent commits

2013-11-22 Thread Erick Erickson
1> I'm not quite sure I understand. External File Fields are keyed by the unique id of the doc. So every shard _must_ have the EFF available for at least the documents in that shard. At first glance this doesn't look simple. Perhaps a bit more explanation of what you're using EFF for? 2> Let's be
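
Since each shard needs the external file entries for at least its own documents, one option is to split a single computed file into per-shard files before copying them out. The Python sketch below assumes the usual one-entry-per-line "id=value" layout of an external file, and it takes the id-to-shard mapping as a caller-supplied function (shard_of is hypothetical; how ids map to shards depends on your routing). It is an illustration, not a tested deployment script.

```python
def split_eff_lines(lines, shard_of):
    """Group 'id=value' lines by the shard that owns each id.

    shard_of is a hypothetical callable: doc id -> shard name.
    """
    per_shard = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blanks and malformed lines
        # Split on the last '=' so ids containing '=' stay intact.
        doc_id, _, value = line.rpartition("=")
        per_shard.setdefault(shard_of(doc_id), []).append(
            "%s=%s" % (doc_id, value)
        )
    return per_shard


lines = ["doc1=1.5", "doc2=0.3", "", "doc3=2.0"]
owner = lambda doc_id: "shard1" if doc_id in ("doc1", "doc3") else "shard2"
for shard, entries in sorted(split_eff_lines(lines, owner).items()):
    print(shard, entries)
```

Copying the whole file to every shard would also satisfy the requirement, at the cost of each shard loading entries for documents it does not hold.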

Re: SolrCloud unstable

2013-11-22 Thread Martin de Vries
We did some more monitoring and have some new information: Before the issue happens the garbage collector's collection count increases a lot. The increase seems to start about an hour before the real problem occurs: http://www.analyticsforapplications.com/GC.png [1] We tried both the g1

Re: useColdSearcher in SolrCloud config

2013-11-22 Thread Erick Erickson
bq: By the term 'block', I assume SOLR returns a non 200 Pretty sure not. The query just waits around in the queue in the server until the searcher is done warming, then the search is executed and results returned. bq: If a new SOLR server No. Apart from any ugly details about caches and

Solr logs encoding to UTF8

2013-11-22 Thread Ing. Jorge Luis Betancourt Gonzalez
Hi everybody: Is there any way of forcing an UTF-8 conversion on the queries that are logged into the log? I've deployed solr in tomcat7. The file appears to be an UTF-8 file but I'm seeing this in the logs: INFO: [] webapp=/solr path=/select

Re: Solr logs encoding to UTF8

2013-11-22 Thread Erick Erickson
what are you using to view the file? Looks like whatever it is isn't configured to do the proper thing with UTF-8. Best, Erick On Fri, Nov 22, 2013 at 8:28 AM, Ing. Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu wrote: Hi everybody: Is there any way of forcing an UTF-8 conversion on

Re: Possible parent/child query bug

2013-11-22 Thread Neil Ireson
Some further odd behaviour. For my index http://localhost:8090/solr/select?q={!child+of=doc_type:parent}*:* Returns a numFound=“22984”, when there are only 2910 documents in the index (748 parents, 2162 children). On 22 Nov 2013, at 12:28, Neil Ireson n.ire...@sheffield.ac.uk wrote:

Re: Leading and trailing wildcard with phrase query and positional ordering

2013-11-22 Thread Dmitry Kan
Hi Ankur, For the leading wildcard you may want to try the ReversedWildcardFilterFactory: https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-ReversedWildcardFilter in the code of CPQ there is a loop over filters of your text field and a specific check:

Re: Possible parent/child query bug

2013-11-22 Thread Mikhail Khludnev
Neil, quick hint. Can't you run Solr (jetty) with -ea? My feeling is that the nested query (for which you put *:*) should be orthogonal to children; that's confirmed by an assert. That's true for {!parent} at least. On Fri, Nov 22,

Re: Possible parent/child query bug

2013-11-22 Thread Neil Ireson
Hi Mikhail, You are right. If the “child of” query matches both parent and child docs it returns the child documents but a spurious numFound. For the “parent which” query, if it matches both parent and child docs it returns a handy error message “child query must only match non-parent docs...

How to work with remote solr savely?

2013-11-22 Thread Stavros Delisavas
Hello Solr-Friends, I have a question about working with solr which is installed on a remote server. I have a php-project with a very big mysql-database of about 10gb and I am also using solr for about 10,000,000 entries indexed for fast search and access of the mysql-data. I have a local copy

Re: Possible parent/child query bug

2013-11-22 Thread Mikhail Khludnev
On Fri, Nov 22, 2013 at 4:28 PM, Neil Ireson n.ire...@sheffield.ac.uk wrote: returns all the child docs, as expected, however http://localhost:8090/solr/select?q={!child+of=doc_type:parent} returns all the parent docs. Aha. I remember it. I implemented this special case for reusing

Re: How to work with remote solr savely?

2013-11-22 Thread michael.boom
Use HTTP basic authentication, set up in your servlet container (jetty/tomcat). That should work fine if you are *not* using SolrCloud. - Thanks, Michael
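
On the client side, HTTP basic authentication amounts to sending an Authorization header with the credentials base64-encoded. The Python sketch below builds such a request for an update handler without sending it; the function name, URL, user name and password are all placeholders. Note that base64 is an encoding, not encryption, so the credentials are readable by anyone who can see the traffic unless the connection is protected by other means (TLS, or a restricted network).

```python
import base64
from urllib.request import Request


def authed_update_request(update_url, user, password, body):
    """Build (but do not send) a POST to an update handler with basic auth."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    req = Request(update_url, data=body, method="POST")
    req.add_header("Authorization", "Basic " + token)
    req.add_header("Content-Type", "application/json")
    return req


# Placeholder URL and credentials, for illustration only.
req = authed_update_request(
    "http://localhost:8983/solr/core/update", "dev", "secret", b"[]"
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the returned object to urllib.request.urlopen would perform the request against a server that is actually configured for basic auth.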

RE: How to work with remote solr savely?

2013-11-22 Thread Hoggarth, Gil
We solved this issue outside of Solr. As you've done, restrict access to Solr to localhost on the server, add firewall rules to allow your developers on port 80, and use ProxyPass to forward the allowed port 80 traffic to Solr. Remember to include ProxyPassReverse too. (This runs on Linux and Apache httpd, btw.)

Re: How to work with remote solr savely?

2013-11-22 Thread Stavros Delisavas
Thanks for your fast reply. First of all, HTTP basic authentication unfortunately is not secure. Also, this would give every developer full admin privileges. Anyway, can you tell me where I can do those configurations? Are there any alternative or more secure ways to restrict Solr access? In

Re: How to work with remote solr savely?

2013-11-22 Thread michael.boom
http://wiki.apache.org/solr/SolrSecurity#Path_Based_Authentication Maybe you could achieve write/read access limitation by setting path based authentication: The update handler /solr/core/update should be protected by authentication, with credentials only known to you. But then of course, your

RE: How to work with remote solr savely?

2013-11-22 Thread Hoggarth, Gil
You could also use one of the proxy scripts, such as http://code.google.com/p/solr-php-client/, which is coincidentally linked (eventually) from Michael's suggested SolrSecurity URL. -Original Message- From: michael.boom [mailto:my_sky...@yahoo.com] Sent: 22 November 2013 14:53 To:

Re: Periodic Slowness on Solr Cloud

2013-11-22 Thread Dave Seltzer
Hi Shawn, Wow! Thank you for your considered reply! I'm going to dig into these issues, but I have a few questions: Regarding memory: Including duplicate data in shard replicas the entire index is 350GB. Each server hosts a total of 44GB of data. Each server has 28GB of memory. I haven't been

Solr XSLT Problems

2013-11-22 Thread Furkan KAMACI
I use Solr 4.5.1 and run the XSLT examples on it. *Whenever* I make a request to q=*:*&wt=xslt&tr=example.xsl, sometimes it tells me there are 0 records, sometimes 30 (actually there are 30). I tried some other XSL files and it is the same. It does not work every time. Has anybody had the same issue?

Re: Solr XSLT Problems

2013-11-22 Thread Furkan KAMACI
Ok, I investigated the reason. There was instability in some folders of my test system. 2013/11/22 Furkan KAMACI furkankam...@gmail.com I use Solr 4.5.1 and run the XSLT examples on it. *Whenever* I make a request to q=*:*&wt=xslt&tr=example.xsl, sometimes it tells me there are 0 records

Re: Periodic Slowness on Solr Cloud

2013-11-22 Thread Shawn Heisey
On 11/22/2013 8:13 AM, Dave Seltzer wrote: Regarding memory: Including duplicate data in shard replicas the entire index is 350GB. Each server hosts a total of 44GB of data. Each server has 28GB of memory. I haven't been setting -Xmx or -Xms, in the hopes that Java would take the memory it needs

Re: Periodic Slowness on Solr Cloud

2013-11-22 Thread Shawn Heisey
On 11/22/2013 10:01 AM, Shawn Heisey wrote: You can see how much the max heap is in the Solr admin UI dashboard - it'll be the right-most number on the JVM-Memory graph. On my 64-bit linux development machine with 16GB of RAM, it looks like Java defaults to a 4GB max heap. I have the heap

Re: Periodic Slowness on Solr Cloud

2013-11-22 Thread Dave Seltzer
Thanks so much Shawn, I think you (and others) are completely right about this being heap and GC related. I just did a test while not indexing data and the same periodic slowness was observable. On to GC/Memory Tuning! Many Thanks! -Dave On Fri, Nov 22, 2013 at 12:09 PM, Shawn Heisey

Re: Periodic Slowness on Solr Cloud

2013-11-22 Thread Raymond Wiker
You mentioned earlier that you are not setting -Xms/-Xmx; the values actually in use would then depend on the Java version, whether you're running 32- or 64-bit Java, whether Java thinks your machines are servers, and whether you have specified the -server flag – and possibly a few other

Re: Periodic Slowness on Solr Cloud

2013-11-22 Thread Dave Seltzer
Wow. That is one noisy command! Full output is below. The grepped output looks like: [solr@searchtest07 ~]$ java -XX:+PrintFlagsFinal -version | grep -i -E 'heapsize|permsize|version' uintx AdaptivePermSizeWeight= 20 {product} uintx ErgoHeapSizeLimit
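
For anyone who would rather not grep that output by hand, here is a small Python sketch that pulls the heap- and perm-size flags out of text captured from java -XX:+PrintFlagsFinal. The line layout it expects (type, name, "=" or ":=", value, then a "{...}" category) is inferred from the grepped line quoted above, so treat the regular expression as an assumption. The sample text is made up for illustration and is not output from any machine in this thread.

```python
import re

# type, flag name, '=' or ':=', value, then the '{product}'-style category
FLAG_RE = re.compile(r"^\s*\S+\s+(\w+)\s*:?=\s*(\S+)\s+\{")


def heap_flags(output):
    """Return {flag name: int value} for flags mentioning heap or perm size."""
    flags = {}
    for line in output.splitlines():
        m = FLAG_RE.match(line)
        if m and re.search(r"heapsize|permsize", m.group(1), re.IGNORECASE):
            flags[m.group(1)] = int(m.group(2))
    return flags


# Illustrative sample, not real output from the thread.
sample = """\
uintx AdaptivePermSizeWeight= 20 {product}
    uintx MaxHeapSize := 4294967296 {product}
     bool UseG1GC = false {product}
"""
for name, value in sorted(heap_flags(sample).items()):
    print(name, value)
```

The values are reported in bytes for the size flags, so dividing MaxHeapSize by 1024**3 gives the default maximum heap in GB.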

can't overwrite and can't delete by id

2013-11-22 Thread Mingfeng Yang
Recently, I found out that I can't delete doc by id or overwrite a doc from/in my SOLR index which is based on SOLR 4.4.0 version. Say, I have a doc http://pastebin.com/GqPP4Uw4 (to make it easier to view, I use pastebin here). And I tried to add a dynamic field rank_ti to it, want to make

Re: Split shard and stream sub-shards to remote nodes?

2013-11-22 Thread Otis Gospodnetic
Ouch :( I guess it's as efficient as it can be but too bad, because writing to a remote node sounds awesomely cool to me at least. :) Thanks for explaining the key bits, Shalin. Otis

Re: csv does not return custom fields (distance)

2013-11-22 Thread GaneshSe
Any help on this is greatly appreciated.

Re: How to work with remote solr savely?

2013-11-22 Thread Bill Bell
Do you have a sample jetty XML to setup basic auth for updates in Solr? Sent from my iPad On Nov 22, 2013, at 7:34 AM, michael.boom my_sky...@yahoo.com wrote: Use HTTP basic authentication, setup in your servlet container (jetty/tomcat). That should work fine if you are *not* using

Re: can't overwrite and can't delete by id

2013-11-22 Thread Mingfeng Yang
BTW: it's a 4-shard SolrCloud cluster using ZooKeeper 3.3.5 On Fri, Nov 22, 2013 at 11:07 AM, Mingfeng Yang mfy...@wisewindow.com wrote: Recently, I found out that I can't delete doc by id or overwrite a doc from/in my SOLR index which is based on SOLR 4.4.0 version. Say, I have a doc

removing dead replicas in solrcloud 4.4

2013-11-22 Thread Eric Parish
My 4.4 SolrCloud cluster has several down replicas that need to be removed. I am looking for a solution to clean them up, like the deletereplica API available in 4.6. Will manually removing the replicas from the clusterstate.json file in ZooKeeper accomplish this? Thanks, Eric

Re: useColdSearcher in SolrCloud config

2013-11-22 Thread Bill Bell
Wouldn't that be true means use cold searcher? It seems backwards to me... Sent from my iPad On Nov 22, 2013, at 2:44 AM, ade-b adrian.bro...@gmail.com wrote: Hi The definition of useColdSearcher config element in solrconfig.xml is If a search request comes in and there is no current

Re: Boosting documents by categorical preferences

2013-11-22 Thread Chris Hostetter
: I thought about that but my concern/question was how. If I used the pow : function then I'm still boosting the bad categories by a small : amount..alternatively I could multiply by a negative number but does that : work as expected? I'm not sure i understand your concern: negative powers would

Re: NullPointerException

2013-11-22 Thread Bill Bell
It seems to be a modified row and referenced in EvaluatorBag. I am not familiar with either. Sent from my iPad On Nov 22, 2013, at 3:05 AM, Adrien RUFFIE a.ruf...@e-deal.com wrote: Hello all, I have perform a full indexation with solr, but when I try to perform an incrementation

Re: Document Security Model Question

2013-11-22 Thread kchellappa
Thanks Rajinimaski for the response. Agree that if the changes are frequent, then first option wouldn't work efficiently. Also the other challenge is that in our case for each resource, it is easy/efficient to get a list of changes since last checkpoint (because of our model of deployment of

Re: removing dead replicas in solrcloud 4.4

2013-11-22 Thread Timothy Potter
Yes, I've done this ... but I had to build my own utility to update clusterstate.json (for reasons I can't recall now). So make your changes to clusterstate.json manually and then do something like the following with SolrJ: public static void updateClusterstateJsonInZk(CloudSolrServer
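
The manual editing step (as opposed to the upload step, which the SolrJ utility above handles) can be sketched in Python. This assumes clusterstate.json is laid out as collection -> "shards" -> shard -> "replicas" -> replica, with a "state" entry on each replica; that layout is an assumption here and should be checked against your own file first. The function name is hypothetical, and the sketch only transforms a parsed dictionary; it does not talk to ZooKeeper.

```python
import copy


def drop_down_replicas(clusterstate):
    """Return (pruned copy, list of removed replicas) for state == 'down'.

    Assumed layout: {collection: {"shards": {shard: {"replicas": {...}}}}}
    """
    pruned = copy.deepcopy(clusterstate)  # never modify the caller's data
    removed = []
    for coll_name, coll in pruned.items():
        for shard_name, shard in coll.get("shards", {}).items():
            replicas = shard.get("replicas", {})
            for replica_name in list(replicas):
                if replicas[replica_name].get("state") == "down":
                    removed.append((coll_name, shard_name, replica_name))
                    del replicas[replica_name]
    return pruned, removed


state = {
    "collection1": {
        "shards": {
            "shard1": {
                "replicas": {
                    "core_node1": {"state": "active"},
                    "core_node2": {"state": "down"},
                }
            }
        }
    }
}
pruned, removed = drop_down_replicas(state)
print(removed)
```

Reviewing the removed list before writing anything back is worthwhile, since a replica that is only temporarily down would be dropped as well.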

Reverse mm(min-should-match)

2013-11-22 Thread Doug Turnbull
Instead of specifying a percentage or number of query terms that must match tokens in a field, I'd like to do the opposite -- specify how much of a field must match a query. The problem I'm trying to solve is to boost document titles that closely match the query string. If a title looks something like
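
The measure being asked for can be written down outside of Solr as the fraction of a field's tokens that also occur in the query. The Python sketch below uses plain lowercase whitespace tokenization, which is an assumption made for illustration (a real field would go through its analysis chain), and the function name field_coverage is hypothetical. It shows the quantity itself, not an existing Solr feature.

```python
def field_coverage(title, query):
    """Fraction of title tokens that also appear in the query (0.0 to 1.0)."""
    title_tokens = title.lower().split()
    query_tokens = set(query.lower().split())
    if not title_tokens:
        return 0.0
    matched = sum(1 for tok in title_tokens if tok in query_tokens)
    return matched / len(title_tokens)


# A short title fully covered by the query scores higher than a long one.
print(field_coverage("Solr in Action", "solr in action book"))
print(field_coverage("Solr in Action second edition", "solr action"))
```

Regular mm normalizes by the number of query terms; this normalizes by the number of field tokens, which is why a title that closely matches the query ends up with a value near 1.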

Re: csv does not return custom fields (distance)

2013-11-22 Thread Gopal Patwa
if you are using Solr 4.0 there was some issue related to field alias which was fixed in Solr 4.3 https://issues.apache.org/jira/browse/SOLR-4671 you should try to reproduce this issue using latest Solr version 4.5.1 On Fri, Nov 22, 2013 at 11:28 AM, GaneshSe ganeshmail...@gmail.com wrote:

Re: Reverse mm(min-should-match)

2013-11-22 Thread Bill Bell
This is an awesome idea! Sent from my iPad On Nov 22, 2013, at 12:54 PM, Doug Turnbull dturnb...@opensourceconnections.com wrote: Instead of specifying a percentage or number of query terms must match tokens in a field, I'd like to do the opposite -- specify how much of a field must

Re: Reverse mm(min-should-match)

2013-11-22 Thread Erik Hatcher
Does order matter? By exact you mean the same tokens in the same positions? Erik On Nov 22, 2013, at 2:54 PM, Doug Turnbull dturnb...@opensourceconnections.com wrote: Instead of specifying a percentage or number of query terms must match tokens in a field, I'd like to do the

RE: Reverse mm(min-should-match)

2013-11-22 Thread Doug Turnbull
Hmm... Not necessarily. I'd be happy with any ordering for now. Though some notion of order and slop would be nice in the future. Sent from my Windows Phone From: Erik Hatcher Sent: 11/22/2013 3:32 PM To: solr-user@lucene.apache.org Subject: Re: Reverse mm(min-should-match) Does order matter?

RE: Reverse mm(min-should-match)

2013-11-22 Thread Doug Turnbull
If I could get at the number of tokens in a query or query norms I might be able to use that in conjunction with field norms to measure how close the query is to the field in terms of number of tokens. Then regular mm could do the trick. Sent from my Windows Phone From: Doug Turnbull Sent:

Re: Periodic Slowness on Solr Cloud

2013-11-22 Thread Dave Seltzer
So I made a few changes, but I still seem to be dealing with this pesky periodic slowness. Changes: 1) I'm now only forcing commits every 5 minutes. This was done by specifying commitWithin=30 when doing document adds. 2) I'm specifying an -Xmx12g to force the java heap to take more memory 3)
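
A quick back-of-the-envelope check using figures quoted in this thread (28GB of RAM and 44GB of index data per server, and the -Xmx12g setting): whatever the Java heap takes is no longer available to the operating system for caching index files. The Python sketch below treats all non-heap memory as available for disk cache, which is a simplifying assumption (the OS and other processes need some too), so the result is an upper bound rather than a measurement.

```python
def page_cache_share(ram_gb, heap_gb, index_gb):
    """Return (GB left outside the heap, fraction of the index it could hold)."""
    left_for_cache = max(ram_gb - heap_gb, 0)
    share = min(left_for_cache / index_gb, 1.0)
    return left_for_cache, share


left, share = page_cache_share(ram_gb=28, heap_gb=12, index_gb=44)
print("GB outside the heap:", left)
print("share of index that could be cached: %.0f%%" % (share * 100))
```

With these numbers, at most 16GB remains outside the heap, which is a bit over a third of the 44GB of index data on each server.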

Re: Periodic Slowness on Solr Cloud

2013-11-22 Thread Shawn Heisey
On 11/22/2013 2:17 PM, Dave Seltzer wrote: So I made a few changes, but I still seem to be dealing with this pesky periodic slowness. Changes: 1) I'm now only forcing commits every 5 minutes. This was done by specifying commitWithin=30 when doing document adds. 2) I'm specifying an -Xmx12g

Re: Solrcloud: external fields and frequent commits

2013-11-22 Thread Flavio Pompermaier
On Fri, Nov 22, 2013 at 2:21 PM, Erick Erickson erickerick...@gmail.com wrote: 1 I'm not quite sure I understand. External File Fields are keyed by the unique id of the doc. So every shard _must_ have the eff available for at least the documents in that shard. At first glance this doesn't look

Re: Solrcloud: external fields and frequent commits

2013-11-22 Thread Erick Erickson
about 1. Well, at a high level you're right, of course. Having the EFF stuff in a single place seems more elegant. But then ugly details crop up. I.e. one place implies that you'd have to fetch them over the network, potentially a very expensive operation every time there was a commit. Is this

building custom cache - using lucene docids

2013-11-22 Thread Roman Chyla
Hi, docids are 'ephemeral', but i'd still like to build a search cache with them (they allow for the fastest joins). i'm seeing docids keep changing with updates (especially, in the last index segment) - as per https://issues.apache.org/jira/browse/LUCENE-2897 That would be fine, because i could