Increasing filterCache size and Java Heap size

2016-08-16 Thread Zheng Lin Edwin Yeo
Hi, Would like to check, do I need to increase my Java Heap size for Solr, if I plan to increase my filterCache size in solrconfig.xml? I'm using Solr 6.1.0 Regards, Edwin
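
A rough way to reason about the question above: in the worst case each filterCache entry is a bitset holding one bit per document in the index, so an upper bound on the cache's heap use is about maxDoc / 8 bytes times the cache size. The sketch below only does that arithmetic; the document count and cache size are hypothetical, and real usage is usually lower because small result sets are stored more compactly.

```python
def filter_cache_worst_case_bytes(max_doc: int, cache_size: int) -> int:
    """Upper-bound estimate of filterCache heap use.

    Assumes every cached entry is a full bitset with one bit per
    document (max_doc / 8 bytes, rounded up). This is a worst case,
    not a measurement.
    """
    bytes_per_entry = (max_doc + 7) // 8
    return bytes_per_entry * cache_size


# Hypothetical numbers: 100 million documents, filterCache size of 512.
estimate = filter_cache_worst_case_bytes(100_000_000, 512)
print(f"{estimate / 1e9:.1f} GB")
```

If the estimate is a large fraction of the current heap, raising the cache size without raising the heap is likely to cause memory pressure.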

Error During Indexing - org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF

2016-08-16 Thread Jaspal Sawhney
Hello, We are running Solr 4.6 in a master-slave configuration wherein our master is used entirely for indexing. No search traffic ever comes to the master. Of late we have started to get the early EOF error on the Solr master, which results in a Broken Pipe error on the commerce application from

Re: The most efficient way to get un-inverted view of the index?

2016-08-16 Thread Joel Bernstein
You'll want to use org.apache.lucene.index.DocValues. The DocValues api has replaced the field cache. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Aug 16, 2016 at 8:18 PM, Roman Chyla wrote: > I need to read data from the index in order to build a special

The most efficient way to get un-inverted view of the index?

2016-08-16 Thread Roman Chyla
I need to read data from the index in order to build a special cache. Previously, in SOLR4, this was accomplished with FieldCache or DocTermOrds. Now, I'm struggling to see what API to use; there are many of them: on the Lucene level: UninvertingReader.getNumericDocValues (and others)

Re: Creating a SolrJ Data Service to send JSON to Solr

2016-08-16 Thread Anshum Gupta
I would also suggest sending the JSON directly to the JSON end point, with the mapping : https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-JSONUpdateConveniencePaths On Tue, Aug 16, 2016 at 4:43 PM Alexandre Rafalovitch

Re: Creating a SolrJ Data Service to send JSON to Solr

2016-08-16 Thread Alexandre Rafalovitch
Why do you need a POJO? For Solr purposes, you could just get the field names from schema and use those to map directly from JSON to the 'addField' calls in SolrDocument. Do you need it for non-Solr purposes? Then you can search for generic Java dynamic POJO generation solution. Also, you could
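
The idea in the reply above can be sketched without any POJO: take the set of field names from the target core's schema and turn each matching JSON key into a (field, value) pair, one per 'addField' call. This is only an illustration of the mapping logic in Python; the function name and the way the schema fields are obtained are assumptions, and a real service would make the calls through SolrJ.

```python
import json


def json_to_solr_fields(raw_json: str, schema_fields: set) -> list:
    """Return (field, value) pairs for keys that exist in the schema.

    Each pair stands for one addField(name, value) call on a SolrJ
    document; list values expand to one pair per item.
    """
    record = json.loads(raw_json)
    pairs = []
    for name, value in record.items():
        if name not in schema_fields:
            continue  # skip keys the target core's schema does not define
        values = value if isinstance(value, list) else [value]
        pairs.extend((name, v) for v in values)
    return pairs


# Hypothetical input and schema field names.
print(json_to_solr_fields('{"id": "1", "tags": ["a", "b"], "junk": 3}',
                          {"id", "tags"}))
```

Because the field list comes from each core's schema at run time, the same code can serve cores with different schemas.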

Re: solr date range query

2016-08-16 Thread Alexandre Rafalovitch
Solr does support a Date Range field, though it is not super documented: https://cwiki.apache.org/confluence/display/solr/Working+with+Dates http://wiki.apache.org/solr/DateRangeField https://issues.apache.org/jira/browse/SOLR-6103 There is also an older trick of using Spatial to index date

Re: Modified stat of index

2016-08-16 Thread Alexandre Rafalovitch
I believe you can get that via Luke REST API: http://localhost:8983/solr//admin/luke Regards, Alex. Newsletter and resources for Solr beginners and intermediates: http://www.solr-start.com/ On 17 August 2016 at 07:18, Scott Derrick wrote: > I need to retrieve the
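
A minimal sketch of how a client might use that endpoint: build the Luke URL for a core and pull the timestamp out of the JSON response. The core name "mycore" is hypothetical, and the sketch assumes the response carries the value under index.lastModified; check an actual response from your Solr version before relying on that key.

```python
from urllib.parse import urlencode


def luke_url(base_url: str, core: str) -> str:
    """Build a Luke request URL that asks only for index-level info as JSON."""
    params = urlencode({"show": "index", "wt": "json"})
    return f"{base_url}/{core}/admin/luke?{params}"


def last_modified(luke_response: dict):
    """Return the lastModified value, or None if the key is absent."""
    return luke_response.get("index", {}).get("lastModified")


# "mycore" is a placeholder core name, not taken from the thread.
print(luke_url("http://localhost:8983/solr", "mycore"))
```

The parsed JSON body of the HTTP response (for example from json.load on the response stream) is what last_modified expects as input.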

Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread Alexandre Rafalovitch
What format are those documents? Solr XML? Custom JSON? Or are you sending PDF/binary documents to Solr's extract handler and asking it to do the extraction of the useful stuff? If the latter, you could take that step out of Solr with a custom client using Tika (what Solr has under the hood) and only

Re: Request to add probabilistic Query Parser Request Handler

2016-08-16 Thread Walter Underwood
In a search engine, “probabilistic” usually refers to a ranking model, as opposed to a vector space model. This name will almost certainly confuse people. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Aug 16, 2016, at 3:16 PM, Akash Mehta

Re: Request to add probabilistic Query Parser Request Handler

2016-08-16 Thread Akash Mehta
UserName is mehtakash93. On 16 August 2016 at 15:11, Akash Mehta wrote: > The main aim of this request Handler is to get the best parsing for a > given query. This basically means recognizing different phrases within the > query. We need some kind of training data to

Request to add probabilistic Query Parser Request Handler

2016-08-16 Thread Akash Mehta
The main aim of this request Handler is to get the best parsing for a given query. This basically means recognizing different phrases within the query. We need some kind of training data to generate these phrases.

Modified stat of index

2016-08-16 Thread Scott Derrick
I need to retrieve the last modified timestamp of my search index. Is there a query I can use or is it stored in a particular file? thanks, Scott -- One man's "magic" is another man's engineering. "Supernatural" is a null word.” Robert A. Heinlein

Re: What's the best practices for indexing XML Content with dynamic XML Elements (SOLR 6.1) ?

2016-08-16 Thread Stan Lee
Sorry for not being specific. I believe this SOLR plugin (LUX) may fit my scenario (query without knowing the tag in advance). http://luxdb.org/README.html On Tue, Aug 16, 2016 at 12:18 PM, Erick Erickson wrote: > You haven't really described the scenario you want > to

Re: Need to understand solr merging and commit relationship

2016-08-16 Thread kshitij tyagi
I have 2 Solr cores on a machine with the same configs. The problem is that I am getting faster indexing speed on core1 and slower on core2. Both cores have the same index size and configuration. On Tue, Aug 16, 2016 at 11:34 PM, Erick Erickson wrote: > Why? What is the problem

Re: Need to understand solr merging and commit relationship

2016-08-16 Thread Erick Erickson
Why? What is the problem you're facing that you hope understanding more about these will help? Here are two places to start: http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

Need to understand solr merging and commit relationship

2016-08-16 Thread kshitij tyagi
I need to understand clearly whether there is any relationship between Solr merging and Solr commit. If there is, what is it? Also, I need to understand how both of these affect indexing speed on the core.

Re: Multiple rollups/facets in one streaming aggregation?

2016-08-16 Thread Joel Bernstein
For the initial implementation we could skip the merge piece if that helps get things done faster. In this scenario the metrics could be gathered after some parallel operation, then there would be no need for a merge. Sample syntax: metrics(parallel(join()) Joel Bernstein

Re: Multiple rollups/facets in one streaming aggregation?

2016-08-16 Thread Joel Bernstein
The concept of a MetricStream was in the early designs but hasn't yet been implemented. Now might be a good time to work on the implementation. The MetricStream wraps a stream and gathers metrics in memory, continuing to emit the tuples from the underlying stream. This allows multiple
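
The wrapping behaviour described above can be illustrated with a generator: it passes every tuple through unchanged while accumulating metrics on the side. This is a conceptual sketch in Python, not the Solr streaming API; the function name, the metrics dictionary, and the choice of count and sum are all assumptions made for illustration.

```python
def metric_stream(tuples, field, metrics):
    """Yield each tuple unchanged while gathering count and sum in memory."""
    for t in tuples:
        metrics["count"] = metrics.get("count", 0) + 1
        metrics["sum"] = metrics.get("sum", 0) + t[field]
        yield t  # downstream consumers still see the full stream


# Hypothetical tuples; the metrics dict is filled as the stream is consumed.
metrics = {}
emitted = list(metric_stream([{"price": 1}, {"price": 2}], "price", metrics))
print(emitted, metrics)
```

Because the wrapper re-emits the underlying tuples, several such wrappers could be stacked over one stream, each gathering metrics for a different field.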

Creating a SolrJ Data Service to send JSON to Solr

2016-08-16 Thread Jennifer Coston
Hello, I am trying to write a data service using SolrJ that will allow me to accept JSON through a REST API, create a Solr document ,and write it to multiple different Solr cores (depending on the core name specified). The problem I am running into is that each core is going to have a different

Re: SolrJ for .NET / C#

2016-08-16 Thread Joe Lawson
On Tue, Aug 16, 2016 at 12:24 PM, GW wrote: > Interesting, I managed to do Solr SQL > > It is true that pretty much all operations still work by calling a collection API directly. The benefits I'm referring to are dynamic cluster state discovery, routing of requests

Re: Delete replica on down node, after start down node, the deleted replica comes back.

2016-08-16 Thread Erick Erickson
Right, when you restart the downed node, all the structure is still on disk, i.e. the index is there, the core.properties file is there etc. I'm assuming you use the collections DELETEREPLICA command. Now when Solr starts up on that node, it uses "core discovery" to find all the "core.properties"
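
To see why the replica reappears, it helps to picture core discovery as a directory walk: any directory under the Solr home that contains a core.properties file is treated as a core. The sketch below imitates that walk in Python so you can list what a restarted node would find; it is an approximation for inspection purposes, and the path you pass in is your own Solr home.

```python
import os


def discover_cores(solr_home: str) -> list:
    """Return directories under solr_home that contain core.properties."""
    found = []
    for dirpath, dirnames, filenames in os.walk(solr_home):
        if "core.properties" in filenames:
            found.append(dirpath)
            dirnames.clear()  # do not descend below a discovered core
    return sorted(found)
```

Any directory this returns for a replica you have already deleted is leftover on-disk state that the node will pick up again on startup.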

Re: SolrJ for .NET / C#

2016-08-16 Thread GW
Interesting, I managed to do Solr SQL On 16 August 2016 at 12:22, Joe Lawson wrote: > The sad part of doing plain old REST requests is you basically miss out on > all the SolrCloud features that are inherent in client call optimization > and collection

Re: SolrJ for .NET / C#

2016-08-16 Thread Joe Lawson
The sad part of doing plain old REST requests is you basically miss out on all the SolrCloud features that are inherent in client call optimization and collection discovery. It would be nice if some companies made /contrib offerings for different languages that could be better maintained. Most

Re: What's the best practices for indexing XML Content with dynamic XML Elements (SOLR 6.1) ?

2016-08-16 Thread Erick Erickson
You haven't really described the scenario you want to implement. I get that you have raw XML of an unknown structure. What do you want to _do_ with that? 1> if all you want to do is index the data (i.e. strip the tags) try HtmlStripCharFilterFactory. 2> If you want to intelligently take content

Re:

2016-08-16 Thread Erick Erickson
Please follow the unsubscribe instructions here: http://lucene.apache.org/solr/resources.html You must use the _exact_ e-mail address you first subscribed with. Let us know if that doesn't work. Best, Erick On Tue, Aug 16, 2016 at 7:41 AM, Rose, John B wrote: > unsubscribe

Multiple rollups/facets in one streaming aggregation?

2016-08-16 Thread Radu Gheorghe
Hello Solr users :) Right now it seems that if I want to rollup on two different fields with streaming expressions, I would need to do two separate requests. This is too slow for our use-case, when we need to do joins before sorting and rolling up (because we'd have to re-do the joins). Since in

Re: SolrJ for .NET / C#

2016-08-16 Thread GW
The client that comes with PHP is lame. If installed you should un-install php5-solr and install the Pecl/Pear libs which are good to the end of 5.x and 6.01. It tanks with 6.1. I defer to my own effort of changing everything to plain old REST requests. On 16 August 2016 at 10:39, GW

solr-user@lucene.apache.org

2016-08-16 Thread Rose, John B
unsubscribe

Re: SolrJ for .NET / C#

2016-08-16 Thread GW
As long as you are .NET you will be last in line. You try using the REST API. All you get with a .NET/C# lib is a wrapper for the REST API. On 16 August 2016 at 09:08, Joe Lawson wrote: > All I have seen is SolrNET, forks of SolrNET and people using

Re: SolrJ for .NET / C#

2016-08-16 Thread Shawn Heisey
On 8/16/2016 7:01 AM, Eirik Hungnes wrote: > I have been looking around for a library for .NET / C#. We are > currently using SolrNet, but that is ofc not as well equipped as > SolrJ, and have heard rumors occasionally about someone, also Lucene, > has been working on a port to other languages?

Re: Inconsistent results with solr admin ui and solrj

2016-08-16 Thread Jan Høydahl
I’m not sure of the root cause for your problem. Solr is built to stay in sync automatically, so there is no need to script anything in that regard. There may be something with your environment, network, ZooKeeper setup or similar that caused the state you were in. I would need to dig further

We are not the leader

2016-08-16 Thread Tamás Barta
Hi, We have two Solr 5.4.1 instances running in a ZK cluster. The system worked well for months but now something happened. Node1 is in "recovery" state (we didn't restart it and didn't do anything with it) and Node2 is the only active. The problem is that Node2 says that "We are not the

Re: Inconsistent results with solr admin ui and solrj

2016-08-16 Thread Pranaya Behera
Hi, I did as you said, now it is coming ok. And what are the things to look for while checking about these kind of issues, such as mismatch count, lukerequest not returning all the fields etc. The doc sync is one, how can I programmatically use the info and sync them ? Is there any method

Re: Solr 6 Configuration - java.net.SocketTimeoutException

2016-08-16 Thread slee
Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-6-Configuration-java-net-SocketTimeoutException-tp4291813p4291935.html Sent from the Solr - User mailing list archive at Nabble.com.

What's the best practices for indexing XML Content with dynamic XML Elements (SOLR 6.1) ?

2016-08-16 Thread Stan Lee
We currently have a Microsoft SQL table with a XML datatype. We use DIH to import the XML Content as is, that is not using the XPathEntityProcessor. If the elements of the XML content is known, XPathEntity make sense. Could someone kindly suggest the right way of handling such scenario, without

Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread Emir Arnautovic
That is quite a big document! You need to monitor Solr to see if you are feeding documents fast enough or if you are saturating it with a large number of large requests. Play with batch size and number of threads to find the sweet spot. Maybe try extremes first (one doc/one thread, one doc many

Re: SolrJ for .NET / C#

2016-08-16 Thread Joe Lawson
All I have seen is SolrNET, forks of SolrNET and people using RestSharp. On Tue, Aug 16, 2016 at 9:01 AM, Eirik Hungnes wrote: > Hi > > I have been looking around for a library for .NET / C#. We are currently > using SolrNet, but that is ofc not as well equipped as SolrJ,

Re: Need Help Resolving Unknown Shape Definition Error

2016-08-16 Thread Jennifer Coston
Thanks David! I have updated my fieldType to be: And the queries seem to be working now! Thanks again! Jennifer From: David Smiley To: solr-user@lucene.apache.org Date: 08/15/2016 11:48 PM Subject:Re: Need Help Resolving Unknown Shape Definition

SolrJ for .NET / C#

2016-08-16 Thread Eirik Hungnes
Hi I have been looking around for a library for .NET / C#. We are currently using SolrNet, but that is ofc not as well equipped as SolrJ, and have heard rumors occasionally about someone, also Lucene, has been working on a port to other languages? -- Best regards, Eirik

Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread kshitij tyagi
400 KB is the size of a single document and I am sending 100 documents per request. Solr heap size is 16 GB and it is running multithreaded. On Tue, Aug 16, 2016 at 5:10 PM, Emir Arnautovic < emir.arnauto...@sematext.com> wrote: > Hi, > > 400KB/doc * 100doc = 40MB. If you are running it single threaded, Solr

Re: solr date range query

2016-08-16 Thread GW
This query would indicate two multivalued fields. This query will return results if you put in a value for the field eventEnddate of 10 years ago as long as the field eventStartdate is satisfied. On 16 August 2016 at 08:16, solr2020 wrote: >

Re: solr date range query

2016-08-16 Thread solr2020
eventStartdate:[2016-08-02T00:00:00Z TO 2016-08-05T23:59:59.999Z] OR eventEnddate:[2016-08-02T00:00:00Z TO 2016-08-05T23:59:59.999Z] this is my query. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-date-range-query-tp4291918p4291922.html Sent from the Solr - User
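
The query above matches an event only when its start or its end falls inside the selected window, so an event that spans the whole window (such as event3 or event4 from the original question) is missed. If the intent is to return every event that overlaps the window, the usual interval test is "starts on or before the window end AND ends on or after the window start". The sketch below builds such a query string from the field names in this thread and checks the same logic in plain Python; whether overlap is really what the poster wants is an assumption.

```python
from datetime import date


def overlap_query(window_start: str, window_end: str) -> str:
    """Build a Solr query for events that overlap [window_start, window_end]."""
    return (f"eventStartdate:[* TO {window_end}] AND "
            f"eventEnddate:[{window_start} TO *]")


def overlaps(event_start, event_end, window_start, window_end) -> bool:
    """Same interval-overlap test, expressed on Python dates."""
    return event_start <= window_end and event_end >= window_start


print(overlap_query("2016-08-02T00:00:00Z", "2016-08-05T23:59:59.999Z"))
# event3 from the question (1 Aug to 7 Aug) against the 2 Aug to 5 Aug window:
print(overlaps(date(2016, 8, 1), date(2016, 8, 7),
               date(2016, 8, 2), date(2016, 8, 5)))
```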

Re: solr date range query

2016-08-16 Thread GW
can you send the query you are using? On 16 August 2016 at 08:03, solr2020 wrote: > yes. dates are stored as a single valued date field > > > > -- > View this message in context: http://lucene.472066.n3. > nabble.com/solr-date-range-query-tp4291918p4291920.html > Sent from

Re: solr date range query

2016-08-16 Thread solr2020
yes. dates are stored as a single valued date field -- View this message in context: http://lucene.472066.n3.nabble.com/solr-date-range-query-tp4291918p4291920.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr date range query

2016-08-16 Thread GW
Am I to assume these dates are stored in a single multivalued field? On 16 August 2016 at 07:51, solr2020 wrote: > Hi, > > We have list of events with events start date and end date.for eg: > event1 starts @ 2nd Aug 2016 ends @ 3rd Aug 2016 > event2 starts @ 4th Aug 2016

solr date range query

2016-08-16 Thread solr2020
Hi, We have a list of events with event start and end dates. For example: event1 starts @ 2nd Aug 2016 ends @ 3rd Aug 2016 event2 starts @ 4th Aug 2016 ends @ 5th Aug 2016 event3 starts @ 1st Aug 2016 ends @ 7th Aug 2016 event4 starts @ 15th July 2016 ends @ 15th Aug 2016 when user selects a date

Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread Emir Arnautovic
Hi, 400KB/doc * 100doc = 40MB. If you are running it single threaded, Solr will be idle while accepting relatively large request. Or is 400KB 100 doc bulk that you are sending? What is Solr's heap size? I would try increasing number of threads and monitor Solr's heap/CPU/IO to see where is
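
The arithmetic above, and the batch-size experiments suggested elsewhere in this thread, can be sketched as two small helpers: one computes the payload of a request from the document size and batch size, the other splits a document list into batches of a chosen size. Both names are hypothetical and the megabyte figure uses decimal units (1 MB = 1000 KB), matching the 40MB estimate in the message.

```python
def request_payload_mb(doc_size_kb: float, docs_per_request: int) -> float:
    """Approximate request size in MB (decimal units)."""
    return doc_size_kb * docs_per_request / 1000


def batches(docs: list, batch_size: int):
    """Yield consecutive slices of docs, each at most batch_size long."""
    for i in range(0, len(docs), batch_size):
        yield docs[i:i + batch_size]


print(request_payload_mb(400, 100))        # 400 KB per doc, 100 docs per request
print([len(b) for b in batches(list(range(250)), 100)])
```

Varying batch_size (and the number of sender threads) while watching Solr's heap, CPU and IO is one way to look for the point where throughput stops improving.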

Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread kshitij tyagi
Hi, we are sending about 100 documents per request for indexing. We have autoCommit set to false and commit only when 1 documents are present. Solr and the machine sending requests are in the same pool. On Tue, Aug 16, 2016 at 4:51 PM, Emir Arnautovic < emir.arnauto...@sematext.com> wrote: > Hi,

Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread Emir Arnautovic
Hi, Do you send one doc per request? How frequently do you commit? Where is Solr running? What is network connection between your machine and Solr? What are JVM settings? Is 10-30s for entire indexing or single doc? Regards, Emir On 16.08.2016 11:34, kshitij tyagi wrote: Hi alexandre, 1

Fwd: Solr - search score and tf-idf vector from individual fields

2016-08-16 Thread govind nitk
Hi Developers, This is a fundamental question which I was unable to get answered from the Solr help and other related Stack Overflow queries. I have a few hundred thousand documents

Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread kshitij tyagi
Hi alexandre, 1 document of 400kb size is taking approx 10-30 sec and this is varying. I am posting document using curl On Tue, Aug 16, 2016 at 2:11 PM, Alexandre Rafalovitch wrote: > How many records is that and what is 'slow'? Also is this standalone or > cluster setup? >

Re: Inconsistent results with solr admin ui and solrj

2016-08-16 Thread Jan Høydahl
Hi, There is clearly something wrong when your two replicas are not in sync. Could you go to the “Cloud->Tree” tab of admin UI and look in the overseer queue whether you find signs of stuck jobs or something? Btw - what warnings do you see in the logs? Anything repeatedly popping up? I would

Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread Alexandre Rafalovitch
How many records is that and what is 'slow'? Also is this standalone or cluster setup? On 16 Aug 2016 6:33 PM, "kshitij tyagi" wrote: > Hi, > > I am indexing a lot of data about 8GB, but it is taking a lot of time. I > have read about maxBufferedDocs,

Indexing (posting document) taking a lot of time

2016-08-16 Thread kshitij tyagi
Hi, I am indexing a lot of data, about 8GB, but it is taking a lot of time. I have read about maxBufferedDocs, ramBufferSizeMB, merge policy, etc. in the solrconfig file. It would be helpful if someone could help me tune the settings for faster indexing speeds. *I have read the docs but not able

Delete replica on down node, after start down node, the deleted replica comes back.

2016-08-16 Thread Jerome Yang
Hi all, I ran into a strange behavior, both on Solr 6.1 and Solr 5.3. For example, there are 4 nodes in cloud mode, one of them is stopped. Then I delete a replica on the down node. After that I start the down node. The deleted replica comes back. Is this a normal behavior? Same situation. 4