Re: SolrException: Error trying to proxy request for url: solr/sync-status/admin/system

2017-06-20 Thread S G
Got no response on the solr-user mailing list and so trying the dev-mailing list. Please guide me if this should not be done. But I thought that the issue looks strange enough to post it here. Thanks SG On Mon, Jun 19, 2017 at 8:13 PM, S G wrote: > Hi, > > We are

Re: How are people using the ICUTokenizer?

2017-06-20 Thread Alexandre Rafalovitch
I used it in a demo where I searched for Thai words using approximate English sound-equivalent: https://github.com/arafalov/solr-thai-test/blob/master/collection1/conf/schema.xml#L34 I thought that was pretty cool and unexpectedly powerful :-) Regards, Alex. http://www.solr-start.com/ -

Re: Problems with elevation component configuration

2017-06-20 Thread Jeffery Yuan
Created https://issues.apache.org/jira/browse/SOLR-10928 Support elevate.q in QueryElevationComponent to track this. -- View this message in context: http://lucene.472066.n3.nabble.com/Problems-with-elevation-component-configuration-tp3993204p4342075.html Sent from the Solr - User mailing list

RE: Estimating CPU

2017-06-20 Thread Lewin Joy (TMS)
Hmm. Thanks Erick and Markus. I'll check this. -Lewin -Original Message- From: Markus Jelsma [mailto:markus.jel...@openindex.io] Sent: Tuesday, June 20, 2017 1:04 PM To: solr-user@lucene.apache.org Subject: RE: Estimating CPU To add on Erick, First thing that comes to mind, you also

RE: Indexing PDF files with Solr 6.6 while allowing highlighting matched text with context

2017-06-20 Thread Allison, Timothy B.
>http - however, the big advantage of doing your indexing on different machine >is that the heavy lifting that tika does in extracting text from documents, >finding metadata etc is not happening on the server. If the indexer crashes, >it doesn’t affect Solr either. +1 for what can go wrong:

Re: DIH delta import with cache 5.3.1 issue

2017-06-20 Thread Sujay Bawaskar
Hi, Did not encounter this issue with Solr 6.x. But delta import with cache executes a nested query for every element encountered in the parent query. Since this select does not have a where clause because we are using cache, it takes a long time. So delta import with cache is very slow. My observation is

DataImportHandler - Delta Import

2017-06-20 Thread Roopesh Uniyal
Hello, I am running *Solr 3.5* and using Data Import Handler. I am using the following query - Although the FULL Import is running fine, the delta import is having trouble. Here is what I am experiencing - 1. Delta Imports are working in cumulative fashion - any increment (delta) is

Re: Facet is not working while querying with group

2017-06-20 Thread Aman Deep Singh
Again the same problem started to occur, and I haven't changed any schema. It's only happening with the numeric data types (tint, tdouble), and only in group queries. If I search with a string field type it works fine. Steps which I have followed: 1. drop the old collection 2. create the new

Re: Will Solr support google like organic search ?

2017-06-20 Thread Alexandre Rafalovitch
I think you are still several steps away from having an actual Solr question. Yes, you could use Solr to search a different data set (Ads). The devil is in the details. Where do those ads come from, what do they match (same keywords as search?), how are they ranked (for Google by Auction, I

Re: Facet is not working while querying with group

2017-06-20 Thread Shawn Heisey
On 6/20/2017 12:07 AM, Aman Deep Singh wrote: > Again the same problem started to occur and I haven't change any schema > It's only coming to the Numeric data types only (tint,tdouble) and that too > in group query only > If I search with string field type it works fine. > > Steps which i have

Automatically Restart Solr

2017-06-20 Thread rojerick luna
Hi, I'm trying to automate a Solr restart every week. I created a stop.bat and updated the start.bat which I found in an article online. Using stop.bat and start.bat is working fine. However, when I created a task in Task Scheduler (Windows Scheduler) and set up the frequency to stop and start (using the

Re: Indexing PDF files with Solr 6.6 while allowing highlighting matched text with context

2017-06-20 Thread ZiYuan
Dear Erick and Timothy, I also took a look at the Python clients (say, SolrClient and pysolr) because Python is my main programming language. I have an impression that 1. they send HTTP requests to the server according to the server APIs; 2. they are not official and thus possibly not up to date.
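As a rough illustration of point 1 above (a client simply sends HTTP requests to the server's API), here is a minimal sketch using only the Python standard library. It only builds the request URL for a highlighted search; the collection name `pdfs` and field name `content` used in the usage note are hypothetical placeholders, not names taken from this thread.

```python
from urllib.parse import urlencode


def build_highlight_query_url(base_url, collection, query, highlight_field):
    """Build a /select URL that asks Solr for highlighted snippets.

    base_url, collection and highlight_field are caller-supplied
    assumptions; nothing here contacts a server.
    """
    params = {
        "q": query,                # the search query
        "hl": "true",              # turn highlighting on
        "hl.fl": highlight_field,  # field to take snippets from
        "wt": "json",              # ask for a JSON response
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"
```

For example, `build_highlight_query_url("http://localhost:8983/solr", "pdfs", "content:solr", "content")` yields a URL that could be fetched with `urllib.request.urlopen` or any HTTP client, which is roughly what the third-party Python clients wrap.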

Could not load collection from ZK:

2017-06-20 Thread Aman Deep Singh
I'm facing an issue in Solr: sometimes ZooKeeper fails to load the Solr collection, stating org.apache.solr.common.SolrException: Could not load collection from ZK: My current setup details are: 1. 5 nodes with 4 cores and 7.6 GB RAM each, each containing a Solr node and ZooKeeper 2. No

Languages dialects

2017-06-20 Thread Moshe Recanati | KMS
Hi, We have a request to support the following dialects using Solr. Let me know if these dialects are supported or if we need to implement something in our code. 1. Chinese - Mandarin 2. French - Canadian 3. Portuguese - European 4. Spanish - European Thank you, Regards,

RE: Indexing PDF files with Solr 6.6 while allowing highlighting matched text with context

2017-06-20 Thread Allison, Timothy B.
Yeah, Chris knows a thing or two about Tika. :) -Original Message- From: ZiYuan [mailto:ziyu...@gmail.com] Sent: Tuesday, June 20, 2017 8:00 AM To: solr-user@lucene.apache.org Subject: Re: Indexing PDF files with Solr 6.6 while allowing highlighting matched text with context No

Re: Indexing PDF files with Solr 6.6 while allowing highlighting matched text with context

2017-06-20 Thread ZiYuan
No intention of spamming but I also want to mention tika-python in the toolchain. Ziyuan On Tue, Jun 20, 2017 at 2:29 PM, ZiYuan wrote: > Dear Erick and Timothy, > > I also took a look at the Python clients (say, SolrClient and

Re: Facet is not working while querying with group

2017-06-20 Thread Aman Deep Singh
Hi Shawn, If I am using docValues=false getting this exception java.lang.IllegalStateException: Type mismatch: isBlibliShipping was indexed with multiple values per document, use SORTED_SET instead at

Re: Give boost only if entire value is present in Query

2017-06-20 Thread alessandro.benedetti
Interesting. It seems almost correct to me. Have you explored the content of the field (for example, using the schema browser)? When you say "don't match", does it mean you don't get results at all, or just that the boost is not applied? I would recommend simplifying the request handler, maybe just

Re: Swapping indexes on disk

2017-06-20 Thread Shawn Heisey
On 6/14/2017 12:26 PM, Mike Lissner wrote: > We are replacing a drive mounted at /old with one mounted at /new. Our > index currently lives on /old, and our plan was to: > > 1. Create a new index on /new > 2. Reindex from our database so that the new index on /new is properly > populated. > 3.

Re: install solr service possible bug

2017-06-20 Thread Shawn Heisey
On 6/14/2017 7:47 AM, Susheel Kumar wrote: > Can anyone confirm if this "service --version" command works ? For me > to install in SUSE distribution, "service --version" commands always > fail and abort the solr installation with printing the error "Script > requires the 'service' command" To make

Tlogs not being deleted/truncated

2017-06-20 Thread Webster Homer
We have a solr cloud collection that gets a full update every morning, via cdcr replication. We see that the target tlogs do not seem to get truncated or deleted as described here https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ I checked we

Re: Could not load collection from ZK:

2017-06-20 Thread Shawn Heisey
On 6/20/2017 6:08 AM, Aman Deep Singh wrote: > I'm facing a issue in solr sometimes zookeeper failes to load the solr > collection stating > > org.apache.solr.common.SolrException: Could not load collection from ZK: This is not the full error message. It will be dozens of lines long, and may

Re: Could not load collection from ZK:

2017-06-20 Thread Aman Deep Singh
This error is coming in the application which is using SolrJ to communicate with Solr. The full stacktrace is: Request processing failed; nested exception is com.gdn.solr620.org.apache.solr.common.SolrException: Could not load collection from ZK: productCollection

Re: Tlogs not being deleted/truncated

2017-06-20 Thread Erick Erickson
Maybe irrelevant, but soft commits don't truncate transaction logs, only hard commits do (openSearcher true|false doesn't matter). Full background here: https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ Not entirely sure how that interacts with
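A minimal sketch of issuing the kind of hard commit described above, assuming a Solr update handler reachable at a base URL you supply; the collection name in the usage note is a hypothetical placeholder. The `openSearcher` flag is exposed only to show that either value still produces a hard commit.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_hard_commit_url(base_url, collection, open_searcher=False):
    """Build an update URL that requests an explicit hard commit."""
    params = {
        "commit": "true",  # hard commit
        "openSearcher": "true" if open_searcher else "false",
    }
    return f"{base_url}/{collection}/update?{urlencode(params)}"


def send_hard_commit(base_url, collection, open_searcher=False):
    """Send the commit request and return the HTTP status code.

    Requires a running Solr instance at base_url (an assumption).
    """
    with urlopen(build_hard_commit_url(base_url, collection, open_searcher)) as resp:
        return resp.status
```

Calling `send_hard_commit("http://localhost:8983/solr", "mycollection")` against a test instance and then checking the tlog directory is one way to see whether hard commits behave as expected before looking at the autoCommit settings.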

Re: Swapping indexes on disk

2017-06-20 Thread Mike Lissner
Thanks for the suggestions everybody. Some responses to Shawn's questions: > Does your solr.xml file contain core definitions, or is that information in a core.properties file in each instanceDir? We're using core.properties files. > How did you install Solr Solr is installed just by

Is it possible to support context filtering for FuzzyLookupFactory?

2017-06-20 Thread Jeffery Yuan
FuzzyLookupFactory is great as it can still find matches even if users misspell. Context filtering is also great, as we can show only suggestions based on the user's languages, doc types, etc. But it's a pity that (it seems) FuzzyLookupFactory and context filtering don't work together.

Re: How are people using the ICUTokenizer?

2017-06-20 Thread Joel Bernstein
What got me interested was that under the covers the ICUTokenizer is using http://icu-project.org/apiref/icu4j/com/ibm/icu/text/BreakIterator.html. Looks like we can get sentences and titles fairly easily and paragraphs with some extra work. Joel Bernstein http://joelsolr.blogspot.com/ On

Re: SolrJ - How to add a blocked document without child documents

2017-06-20 Thread Jeffery Yuan
Mikhail Khludnev provided the workaround in https://issues.apache.org/jira/browse/SOLR-6096: So far, the workaround is to nest an empty child w/o fields or with an id-only field. -- It works
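The workaround quoted above can be sketched as follows. The thread is about SolrJ, but this sketch uses Python and the JSON update format instead, where nested children go under the `_childDocuments_` key. The helper name, the field names, and the `_placeholder` id suffix are all hypothetical choices made for illustration.

```python
import json


def block_document(parent_fields, children=None):
    """Return a parent document that always carries at least one child.

    If no real children are given, nest a single placeholder child that
    has only an id field, as the workaround suggests.
    """
    doc = dict(parent_fields)
    kids = list(children or [])
    if not kids:
        # Placeholder child with an id-only field (hypothetical id scheme).
        kids = [{"id": f"{doc['id']}_placeholder"}]
    doc["_childDocuments_"] = kids
    return doc


def to_update_payload(docs):
    """Serialize documents as a JSON array suitable for an update request."""
    return json.dumps(docs)
```

A parent such as `block_document({"id": "p1", "type": "parent"})` then serializes with one id-only child, so it is indexed as a block even though it has no real children.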

Re: Tlogs not being deleted/truncated

2017-06-20 Thread Webster Homer
Yes, soft commits are irrelevant for this. What is relevant about soft commits is that we can search the data. We have autoCommit set to 10 minutes and never see tlogs truncated. Apparently autoCommit doesn't fire, ever. Neither in our source collection nor in our target collections. The more I

RE: Indexing PDF files with Solr 6.6 while allowing highlighting matched text with context

2017-06-20 Thread Phil Scadden
http - however, the big advantage of doing your indexing on different machine is that the heavy lifting that tika does in extracting text from documents, finding metadata etc is not happening on the server. If the indexer crashes, it doesn’t affect Solr either. -Original Message-

Re: Tlogs not being deleted/truncated

2017-06-20 Thread Erick Erickson
bq: Neither in our source collection nor in our target collections. Hmmm. You should see messages similar to the following which I just generated on Solr 6.2 (stand-alone I admit but that code should be the same): INFO - 2017-06-20 21:11:55.424; [ x:techproducts]

Re: Estimating CPU

2017-06-20 Thread Erick Erickson
In a word, "stress test". Here's the blog I wrote on topic outlining why it's hard to give a more helpful answer https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/ You might want to explore the hyper-log-log approach which provides pretty

RE: Estimating CPU

2017-06-20 Thread Markus Jelsma
To add to Erick's reply: first thing that comes to mind, you also have a huge heap. Do you really need it to be that large? If it is not absolutely necessary, reduce it. If you need it because of FieldCache, consider DocValues instead and reduce the heap again. Use tools like VisualVM to see what the CPU is

Re: Give boost only if entire value is present in Query

2017-06-20 Thread Aman Deep Singh
It was not matching the results for that particular field below is the debug data (+DisjunctionMaxQuerynameSearchNoSyn:7 nameSearchNoSyn:armour)~2)^9.0 | ((brandSearch:7 brandSearch:armour)~2) | ((nameSearch:7 nameSearch:armour)~2)^4.0 | (keywords:7 armour)^11.0 | ((descSearchNoSyn:7

How are people using the ICUTokenizer?

2017-06-20 Thread Joel Bernstein
It seems that there are some powerful capabilities in the ICUTokenizer. I was wondering how the community is making use of it. Does anyone have experience working with the ICUTokenizer that they can share? Joel Bernstein http://joelsolr.blogspot.com/

can you fix this index?

2017-06-20 Thread Zhang, Ziqi
I am running a program that crawls the web and saves data into a Solr index. For mysterious reasons, the Solr server crashed, and now I have ended up with a corrupted index that has no segment files, so I risk losing all the data I collected over 5 days. The error message reads as below when

RE: How are people using the ICUTokenizer?

2017-06-20 Thread Davis, Daniel (NIH/NLM) [C]
Joel, I think the issue is doing word-breaking according to ICU rules. So, if you are trying to make sure your index breaks words properly on eastern languages, just use ICU Tokenizer. Unless your text is already in an ICU normal form, you should always use the ICUNormalizer character

Re: Could not load collection from ZK:

2017-06-20 Thread Aman Deep Singh
Sorry Shawn, It didn't copy the entire stacktrace. I put the stacktrace at https://www.dropbox.com/s/zf8b87m24ei2ils/solr%20exception2?dl=0 Note: I have shaded the Solr library under com.gdn.solr620, so all Solr classes will appear as com.gdn.solr620.org.apache.solr.* On Tue, Jun 20, 2017 at 8:09

[ANNOUNCE] Apache Solr Reference Guide for 6.6 Released

2017-06-20 Thread Cassandra Targett
The Lucene PMC is pleased to announce the release of the Solr Reference Guide for Solr 6.6. This 966-page PDF is the definitive guide to using Apache Solr, the search server built on Apache Lucene. The Guide can be downloaded from:

Re: [ANNOUNCE] Apache Solr Reference Guide for 6.6 Released

2017-06-20 Thread Cassandra Targett
I wanted to add a follow-up to my announcement for the Solr 6.6 Reference Guide. This release of the Guide is the first with a new publication process [1] and A LOT has changed. First, we have migrated the Ref Guide completely out of Confluence (aka CWIKI) and now follow a "docs with code" model.

Re: How are people using the ICUTokenizer?

2017-06-20 Thread David Hastings
Have you successfully used the shingles with the MoreLikeThis query? Really curious whether this would return the "interesting phrases". On Tue, Jun 20, 2017 at 12:01 PM, Davis, Daniel (NIH/NLM) [C] < daniel.da...@nih.gov> wrote: > Joel, > > I think the issue is doing word-breaking according

RE: How are people using the ICUTokenizer?

2017-06-20 Thread Davis, Daniel (NIH/NLM) [C]
The GUI is not built yet, so the jury is out. I plan to include switches to do the MoreLikeThis both ways, but I think it will do a better job because this is a specific case study/example in classification in the book Taming Text by Grant Ingersoll. It is a reasonable assumption that he

RE: How are people using the ICUTokenizer?

2017-06-20 Thread Allison, Timothy B.
> So, if you are trying to make sure your index breaks words properly on > eastern languages, just use ICU Tokenizer. I defer to the expertise on this list, but last I checked ICUTokenizer uses dictionary lookup to tokenize CJK. This may work well for some tasks, but I haven't evaluated

Re: Tlogs not being deleted/truncated

2017-06-20 Thread Webster Homer
Looking at our CDCR source collection, it too doesn't look like a commit occurred, so I sent one manually. From this I believe that autoCommit isn't working in Solr 6.2. Our ETL doesn't send commits, we rely upon autoCommit, and for CDCR you have to have autoCommit. We also have autoSoftCommit

Re: Solr 6: how to get SortedSetDocValues from index by field name

2017-06-20 Thread SOLR4189
Hi, Tomas. It helped. Thank you.

Estimating CPU

2017-06-20 Thread Lewin Joy (TMS)
** PROTECTED (confidential, internal parties only) Hi, Is there any way to estimate the CPU needed to set up a Solr environment? We use pivot facets extensively. We use them in the JSON Facet API and also in native queries. For our 150 million record collection, we are seeing high CPU usage of 100% with small loads. If we have to