Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-04 Thread Shawn Heisey
On 11/4/2015 1:17 PM, Yonik Seeley wrote: > On Wed, Nov 4, 2015 at 3:06 PM, Shawn Heisey wrote: >> I had understood that since 4.0, Solr (Lucene) can continue to update an >> index even while that index is optimizing. > Yes, that should be the case. > >> I have discovered in

Re: SolrClient: reuse the client or just call close()?

2015-11-04 Thread Shawn Heisey
On 11/4/2015 12:52 PM, tedsolr wrote: > I'm wondering what the best practice is for the implementations of > SolrClient: CloudSolrClient & HttpSolrClient. I am caching my clients per > collection (core) and reusing them right now. Initially this was prompted by > the old solr wiki page

Re: highlighting on child document

2015-11-04 Thread Mikhail Khludnev
Hello, Highlighter for block join hasn't been implemented. So far, you can call the highlighter with the children query, also passing fq={!child ..}parent-id:. On Wed, Nov 4, 2015 at 7:57 PM, Yangrui Guo wrote: > Hi > > I want to highlight matched terms on child documents because

To update or change your SolrCloud configuration files

2015-11-04 Thread Aswath Srinivasan (TMS)
Hi fellow SOLR developers, https://cwiki.apache.org/confluence/display/solr/Using+ZooKeeper+to+Manage+Configuration+Files This link says the below: To update or change your SolrCloud configuration files: 1. Download the latest configuration files from ZooKeeper, using the source control

RE: tikaparser docx file fails with exception

2015-11-04 Thread Aswath Srinivasan (TMS)
Trying to index a document. A docx file. Ending up with the below exception. Not sure why it is erroring out. When I opened the docx I was able to see lots of binary data like embedded pictures etc., Is there a possible solution to this or is it a bug? Only one such file fails. Rest of the

Re: tikaparser docx file fails with exception

2015-11-04 Thread Erick Erickson
Possibly a corrupt file? Tika does its best, but bad data is...bad data. You can experiment a bit with using Tika in Java, that might give you a better idea of what's really going on, here's a SolrJ example: https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/ Best, Erick On Wed, Nov 4,

copyField

2015-11-04 Thread Steven White
Hi, I have 100's of fields to search against based on some pre-defined static rules. So fields A, B, C to be searched as group-X, fields A, B, D, E, F as group-Y, fields B, E, F, G as group-Z. Each group is made up of 100's of fields (at least 500). I can use copyField variations to copy into
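A minimal sketch of the client-side alternative to copyField that this thread discusses, assuming group definitions like the ones described above. The target field names (`group_X`, `group_Y`, `group_Z`) and the helper `add_group_fields` are hypothetical, and real groups would list hundreds of source fields. It is written in Python for brevity; the same idea carries over to Java.

```python
# Hypothetical group definitions: target field -> source fields copied into it.
GROUPS = {
    "group_X": ["A", "B", "C"],
    "group_Y": ["A", "B", "D", "E", "F"],
    "group_Z": ["B", "E", "F", "G"],
}


def add_group_fields(doc, groups=GROUPS):
    """Return a copy of doc with one multi-valued field per group.

    Each source value is kept as a separate list entry rather than being
    concatenated into one string, so values from different fields stay apart.
    """
    out = dict(doc)
    for group, sources in groups.items():
        values = [doc[name] for name in sources if name in doc]
        if values:
            out[group] = values
    return out


doc = {"A": "Fable of the Throbbing", "B": "Genius of a Tank Town"}
enriched = add_group_fields(doc)
print(enriched["group_X"])
```

Sending each value as its own entry of a multi-valued field, instead of joining the strings, is one way to keep text from field A and field B from running together; whether phrases can still match across entries depends on the field type's configuration in the schema.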

Re: copyField

2015-11-04 Thread Chris Hostetter
: 1) Give my need, am I losing anything by writing my own copy-field in my : Java code vs. using Solr's copyField in the schema? nope. : 2) How do I prevent a case where when I copy data from field A and B where : A has "Fable of the Throbbing" and B has "Genius of a Tank Town" which get :

Re: To update or change your SolrCloud configuration files

2015-11-04 Thread Erick Erickson
Right, the comments about source control are a bit misleading, that's just where _I'd_ keep my configs ;). You can pull the configs from Zookeeper as well with the "downconfig", see: https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities As to why the file doesn't seem to change,

Re: FieldCache?

2015-11-04 Thread Chris Hostetter
: What is the implication of this? Should we move all facets to DocValues : when we have high cardinality (lots of values) ? Are we adding it back? 1) Using DocValues is almost certainly a good idea moving forward for situations where the FieldCache was used in the past. : FieldCache is gone

Re: [CONF] Apache Solr Reference Guide > Result Grouping

2015-11-04 Thread Zheng Lin Edwin Yeo
For me, I'm using the signature field grouping method, as shown from this website: https://cwiki.apache.org/confluence/display/solr/De-Duplication You can set the signatureField to be "title", then during the query, instead of using group=true&group.field=title, you can use group=true&group.field=signature Regards, Edwin On 4

Solr Features

2015-11-04 Thread Salman Ansari
Hi, I am in the process of looking for a comprehensive list of Solr features in order to assess how much have we implemented, what are some features that we were unaware of that we can utilize etc. I have looked at the following link for Solr features http://lucene.apache.org/solr/features.html

SolrClient: reuse the client or just call close()?

2015-11-04 Thread tedsolr
I'm wondering what the best practice is for the implementations of SolrClient: CloudSolrClient & HttpSolrClient. I am caching my clients per collection (core) and reusing them right now. Initially this was prompted by the old solr wiki page and SOLR-861. Is

OpenNLP plugin or similar NER software for Solr ??? !!!

2015-11-04 Thread liviuchristian
Hi everyone, I need to install a plugin to extract Location (Country/State/City) from free text documents - any professional advice?!? Does OpenNLP really do the job? Is it English only? US only? Or does it cover worldwide place names? Could someone help me with this job - installation,

Is it impossible to update an index that is undergoing an optimize?

2015-11-04 Thread Shawn Heisey
I had understood that since 4.0, Solr (Lucene) can continue to update an index even while that index is optimizing. I have discovered in the logs of my SolrJ index maintenance program that this does not appear to actually be true. My dev index is running Solr 5.2.1 and my production indexes are

Re: OpenNLP plugin or similar NER software for Solr ??? !!!

2015-11-04 Thread Doug Turnbull
David Smiley had a place name and general tagging engine that for the life of me I can't find. It didn't do NER for you (I'm not sure you want to do this in the search engine) but it helps you tag entities in a search engine based on a predefined list. At least that's what I remember. On Wed,

Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-04 Thread Yonik Seeley
On Wed, Nov 4, 2015 at 3:06 PM, Shawn Heisey wrote: > I had understood that since 4.0, Solr (Lucene) can continue to update an > index even while that index is optimizing. Yes, that should be the case. > I have discovered in the logs of my SolrJ index maintenance program

Re: API collection, Migrate deleting all data

2015-11-04 Thread Alessandro Benedetti
Hi Philippa, can you show us an example of document ? In particular i would like to see the ID you are using. I would expect a compositeId in the form: shardkey!id have you verified first of all that the compositeId routing and shardKey is currently working ? This is the first step, as I think

Re: API collection, Migrate deleting all data

2015-11-04 Thread philippa griggs
Hello Alessandro, An example of a document: { "SolrShard": "05/10/2015!02bebd2e-f12b-4787-bef2-57b0c5ed42ee", "ContentVersion": 11, "Session_Browser": "IE 11", "Session_UserAgent": [ "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
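A small sketch of the `shardkey!id` form mentioned in this thread, applied to the `SolrShard` value from the example document. The helper `split_composite_id` is hypothetical and only splits the string; it does not reproduce Solr's hashing. As far as I know, the compositeId router also gives special meaning to a `/` inside the prefix (the `shard_key/num!id` form), which this sketch does not model and which may be worth checking for date-style keys such as `05/10/2015`.

```python
def split_composite_id(doc_id):
    """Split 'shardkey!id' into (shard_key, local_id).

    Returns (None, doc_id) when the id contains no '!' separator.
    """
    shard_key, sep, local_id = doc_id.partition("!")
    if not sep:
        return None, doc_id
    return shard_key, local_id


example = "05/10/2015!02bebd2e-f12b-4787-bef2-57b0c5ed42ee"
shard_key, local_id = split_composite_id(example)
print("shard key:", shard_key)
print("local id:", local_id)
```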

Managing ZIP files inside ZIP files

2015-11-04 Thread Frédéric Olier
Hi, I have a ZIP (tar.gz) that contains many (> 100) other tar.gz files inside. Solr takes ages to ingest the document. I'd like to know whether other users have experience with such a configuration and what solution they found. Is there a way to tell Solr to go '1 level deep' while analysing the

Re: Invalid parsing with solr edismax operators

2015-11-04 Thread Alessandro Benedetti
Here we go : Title^200 TotalField^1 + Jack explanation and you have the parsed query explained ! Cheers On 4 November 2015 at 12:56, Mahmoud Almokadem wrote: > Thank you Alessandro for your reply. > > Here is the request handler > > > > > explicit >10 >

Re: Solr facet query(critical solr query response)

2015-11-04 Thread Alessandro Benedetti
I suggest you take a look at Yonik's blog, in particular at the new JSON Faceting approach and nested document modelling. Hope this can help : http://yonik.com/solr-nested-objects/ ( nested objects ) http://yonik.com/json-facet-api/ ( nested facets) Cheers On 4 November 2015 at 12:15,

Re: Solr facet query(critical solr query response)

2015-11-04 Thread Erik Hatcher
Shouldn't td have s2=1 also? Try this: facet.pivot=vi,vk > On Nov 4, 2015, at 03:23, Mugeesh Husain wrote: > > i am facing issue of how to get below response from solr query. > > In my table, i have a two column vi and vk which has below values: > > row1:

Re: Solr facet query(critical solr query response)

2015-11-04 Thread Mugeesh Husain
Thanks Erik >>Shouldn't td have s2=1 also? Yes, that was my mistake. I also have a search concern. I have 3 tables: Restaurant, Review, User. Restaurant-[restid, restname, location...] User-[userid, uname,] Review-[reviewID, userid, restid, comment...] I am searching based on restaurant, how could i will

Re: collection API timeout

2015-11-04 Thread Julien David
I forgot to mention that we are using Solr 4.9.0 and ZooKeeper 3.4.6 Thanks Julien On 04/11/2015 11:37, Julien DAVID - Decalog wrote: Hi all, We have a production environment composed of 6 SolrCloud servers and 3 ZooKeeper nodes. We've got around 30 collections, with 6 shards each. We recently

API collection, Migrate deleting all data

2015-11-04 Thread philippa griggs
Hello, Solr 5.2.1, Zookeeper 3.4.6 I'm trying to use the Solr Collections API to migrate documents in a test environment. I have two collections set up: HotSessions - two shards, no replicas; ColdSessions - 1 shard, no replicas. I've uploaded some sample data and am using document routing with

Re: Invalid parsing with solr edismax operators

2015-11-04 Thread Alessandro Benedetti
Hi Mahmoud, can you send us the solrconfig.xml snippet of your request handler please ? It's kinda strange you get a boost factor for the Title field and that parsing query, according to your config. Cheers On 4 November 2015 at 08:39, Mahmoud Almokadem wrote: > Hello,

Re: Invalid parsing with solr edismax operators

2015-11-04 Thread Mahmoud Almokadem
Thank you Alessandro for your reply. Here is the request handler explicit 10 TotalField AND edismax Title^200 TotalField^1 Mahmoud > On Nov 4, 2015, at 2:43 PM, Alessandro Benedetti > wrote: > > Hi

Re: Invalid parsing with solr edismax operators

2015-11-04 Thread Jack Krupansky
It is debatable whether this is a bug or just a poorly documented interaction of q.op, mm, and nested queries (within parentheses.) Personally, I'd say it is a bug. Edismax is only obeying q.op and mm for the top-level of the query - once you nest within parentheses the default operator reverts to

Re: Invalid parsing with solr edismax operators

2015-11-04 Thread Mahmoud Almokadem
I removed the q.op=“AND” and add the mm=2 when searching for (public libraries) I got 19 with "parsedquery_toString": "+(((Title:public^200.0 | TotalField:public^0.1) (Title:libraries^200.0 | TotalField:libraries^0.1))~2)", and when adding + and searching for +(public libraries) I got 1189

Re: collection API timeout

2015-11-04 Thread Erick Erickson
You may be hitting: https://issues.apache.org/jira/browse/SOLR-7459 (linked to https://issues.apache.org/jira/browse/SOLR-7049)... Best, Erick On Wed, Nov 4, 2015 at 5:00 AM, Julien David wrote: > I forgot to mention that we are using Solr 4.9.0 and zookeeper 3.4.6 > >

Re: Invalid parsing with solr edismax operators

2015-11-04 Thread Jack Krupansky
I think you should go ahead and file a Jira ticket for this as a bug since either it is an actual bug or some behavior nuance that needs to be documented better. -- Jack Krupansky On Wed, Nov 4, 2015 at 8:24 AM, Mahmoud Almokadem wrote: > I removed the q.op=“AND” and

phrase query

2015-11-04 Thread Brian Narsi
I have the following field type: I am trying to use dismax query parser (because it seems to have better phrase query support compared with standard query parser?) I have mm = 1 ps = 4 I have the following data indexed: 1) acute care pharmaceuticals 2) carefusion llc When

Re: Managing ZIP files inside ZIP files

2015-11-04 Thread Alexandre Rafalovitch
How are you ingesting them now? I'd probably use Java 8 with SolrJ and use the new Virtual File System approach to read right out of the zip and gzip. http://docs.oracle.com/javase/8/docs/api/java/nio/file/FileSystems.html#newFileSystem-java.nio.file.Path-java.lang.ClassLoader- Tar is a bit harder,

Re: how to change uniqueKey?

2015-11-04 Thread Steve Rowe
Hi Oleksandr, > On Nov 3, 2015, at 9:24 AM, Oleksandr Yermolenko wrote: > > Hello, All, > > I can't find the way to change uniqueKey in "managed-schema" environment!!! […] > 7. The first and the last question: the correct way changing uniqueKey in > schemaless environment?

Re: phrase query

2015-11-04 Thread Alessandro Benedetti
Hi Brian, this is a relevancy problem. I would suggest you to use : http://splainer.io ( it's a really nice Doug tool, related to Quepid) to get why you have one doc in front of another. Then Please attach in here the result. If you want to give us all the details please also attach the

Re: OpenNLP plugin or similar NER software for Solr

2015-11-04 Thread Alessandro Benedetti
Hi Christian. This was quite easy to have, since 2011. But you can complicate this as much as you want. Or customise it as much as you want. Take a look : https://cwiki.apache.org/confluence/display/solr/UIMA+Integration https://wiki.apache.org/solr/SolrUIMA This is a good painless starting

Spacial search returns us wrong results

2015-11-04 Thread Frederic MERCEUR
Dear All, I have a problem with the spacial search that returns us wrong results. Here is the way I declare our spacial bbox index with Solr 5.4 : id distanceUnits="kilometers" numberType="_bbox_coord"/> precisionStep="8" docValues="true" stored="false"/> [...] Here

Re: Spacial search returns us wrong results

2015-11-04 Thread fmerceur
Hi Alessandro, Thanks for your answer. Yes, I have tried removing the second query. Most of the results are correct, but some, like this one, are not. The second query is just there to point at a specific record. I will try to debug the query tomorrow. Cheers, Fred On 2015-11-04 18:25, Alessandro

Re: logical steps to configuring file-based spell-check

2015-11-04 Thread Alessandro Benedetti
Hi Mark, it should be that simple. Just download the solr you want ( 5.3.0 for example is fine). Use this as a baseline : /Users/alessandro/Downloads/solr-5.3.1/server/solr/configsets/sample_techproducts_configs/conf Please report any weird behaviour. Cheers On 1 November 2015 at 16:03, Mark

highlighting on child document

2015-11-04 Thread Yangrui Guo
Hi I want to highlight matched terms on child documents because I need to determine which field matched the search terms. However when I use block join solr returned empty highlight fields. How can I use highlight with nested document? Or is there anyway to tell which field matched the query

Re: Spacial search returns us wrong results

2015-11-04 Thread Alessandro Benedetti
Hi Frederic, Have you tried to debug the query ? Have you tried to remove the second filter query ? I assume the spatial filter query is not working properly, so only the second filter query is evaluated. Please give us some additional details and let's try to help you ! Cheers On 4 November

Re: highlighting on child document

2015-11-04 Thread Alessandro Benedetti
My colleagues will correct me if i am wrong. Solr Join is actually not the same as Relational Join. This means that you can return in the result only one layer of entities ( the parent layer or the child layer ) even if your original search was on a different layer. You can search on children and

Re: Securing field level access permission by filtering the query itself

2015-11-04 Thread Alessandro Benedetti
Of course it depends on all the query parameters you use and process in the response. The list you wrote should be ok if you use only those components. For example, if you use highlighting, it's not ok and you need to take care of the highlighted fields as well. Cheers On 30 October 2015 at

Re: [CONF] Apache Solr Reference Guide > Result Grouping

2015-11-04 Thread Jan Høydahl
I second Toke’s recommendation to ensure you have a pure string-version of your title. For pure de-duplication you could also consider the lighter-weight CollapseComponent. Instead of group=true&group.field=title, use fq={!collapse field=title_string} See
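A hedged sketch of how the `{!collapse field=title_string}` local-params expression from this message could be sent as a filter query. The `q=*:*` main query and the helper name `build_collapse_query` are assumptions for illustration; only the collapse expression and the `title_string` field come from the message. The point is that the braces, `!` and space need URL-encoding when the request is built by hand.

```python
from urllib.parse import parse_qs, urlencode


def build_collapse_query(field, q="*:*"):
    """Build a URL query string that collapses results on the given field."""
    params = {
        "q": q,                             # assumed match-all main query
        "fq": "{!collapse field=%s}" % field,  # collapse expression as a filter query
    }
    return urlencode(params)


query_string = build_collapse_query("title_string")
print(query_string)
# Decoding the string again recovers the original parameter values.
print(parse_qs(query_string)["fq"][0])
```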

Invalid parsing with solr edismax operators

2015-11-04 Thread Mahmoud Almokadem
Hello, I'm using solr 4.8.1. Using edismax as the parser we got the undesirable parsed queries and results. The following is two different cases with strange behavior: Searching with these parameters "mm":"2", "df":"TotalField", "debug":"true", "indent":"true", "fl":"Title",

Solr facet query(critical solr query response)

2015-11-04 Thread Mugeesh Husain
i am facing issue of how to get below response from solr query. In my table, i have a two column vi and vk which has below values: row1: vi:["ta"] ,vk:["s1"] row2: vi:["tb"] ,vk:["s2"] row3: vi:["tc"] ,vk:["s0"] row4: vi:["td"] ,vk:["s2"] row5: vi:["ta"] ,vk:["s1"] row5: vi:["tc"] ,vk:["s0"]
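To make the `facet.pivot=vi,vk` suggestion from the reply in this thread concrete, here is a small sketch that computes the same kind of nested counts in plain Python. It uses only the rows visible in the message above (the original may list more), takes the first value of each multi-valued field, and the helper `pivot_counts` is hypothetical; it illustrates the counting, not Solr's actual response format.

```python
from collections import Counter, defaultdict

# (vi, vk) pairs for the rows visible in the message above.
rows = [
    ("ta", "s1"),
    ("tb", "s2"),
    ("tc", "s0"),
    ("td", "s2"),
    ("ta", "s1"),
    ("tc", "s0"),
]


def pivot_counts(pairs):
    """Count vk values nested under each vi value, like a two-level pivot."""
    pivot = defaultdict(Counter)
    for vi, vk in pairs:
        pivot[vi][vk] += 1
    return {vi: dict(counts) for vi, counts in pivot.items()}


for vi, counts in sorted(pivot_counts(rows).items()):
    print(vi, counts)
```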

Re: Apache Solr SpellChecker Integration with the default select request handler

2015-11-04 Thread Rajani Maski
The attached exception seems to be stripped off. Anyways, >>I want to integrate spellcheck handler with default select handler. Please guide me how can I achieve this. If you were unable to follow the steps mentioned on reference guide[2], here is another link[1] that gives same but quick setup

Re: Securing field level access permission by filtering the query itself

2015-11-04 Thread Douglas McGilvray
Thanks Alessandro, I had overlooked the highlighting component. I will also add a reminder to exclude these fields from spellcheck fields, (or maintain different spellcheck fields for different roles). @Scott - Once I started planning my code the penny finally dropped regarding your point

Re: Solr hard commit

2015-11-04 Thread Alessandro Benedetti
Just thinking out loud, but mapping is an OS feature. For example, if a recovery is happening (and it involves fewer than 100 docs of difference), the Tlog will be read to reproduce the data. This will cause the Tlog to be accessed and ideally memory mapped. Am I correct ? Cheers On 28 October 2015

collection API timeout

2015-11-04 Thread Julien DAVID - Decalog
Hi all, We have a production environment composed of 6 SolrCloud servers and 3 ZooKeeper nodes. We've got around 30 collections, with 6 shards each. We recently moved from 3 Solr servers to 6, splitting the shards (3 to 6). As the last weeks were a low period we didn't notice any problem. But since Monday,