Re: Get one fragment of text of field

2016-02-17 Thread Binoy Dalal
Indexing will happen on everything. Just use the copy fields to return data. That's it. They do not need to be indexed, just stored. On Thu, 18 Feb 2016, 12:52 Anil wrote: > Thanks Binoy. > > index should happen on everything. > > But retrieval/fetch should limit the

Querying data based on field type

2016-02-17 Thread Salman Ansari
Hi, Due to some misconfiguration issues, I have a field whose values are a mix of single strings and arrays of strings. It looks like some old values got indexed as arrays of strings, while anything new is a single-valued string. I have checked the configuration and multivalued for that

Re: Get one fragment of text of field

2016-02-17 Thread Anil
Thanks Binoy. Indexing should happen on everything, but retrieval/fetch should limit the characters. Is that possible? Regards, Anil On 18 February 2016 at 12:24, Binoy Dalal wrote: > If you are not particular about what part of the field is returned you can > create copy

Re: solr-4.3.1 docValues usage

2016-02-17 Thread Neeraj Lajpal
Can someone please help me with this? I have been stuck for the past few days. > On 15-Feb-2016, at 6:39 PM, Neeraj Lajpal wrote: > > Hi, > > I recently asked this question on stackoverflow: > > I am trying to access a field in custom request handler. I am accessing it > like

Re: Do all SolrCloud nodes communicate with the database when indexing a collection?

2016-02-17 Thread Anshum Gupta
Hi Colin, As of when I last checked, DIH works with SolrCloud but has its limitations. It was designed for the non-cloud mode and is single threaded. It runs on whatever node you set it up on and that node might not host the leader for the shard a document belongs to, adding an extra hop for

Re: Get one fragment of text of field

2016-02-17 Thread Binoy Dalal
If you are not particular about what part of the field is returned you can create copy fields and set a limit on those to store only the number of characters you want. This will copy over the first 500 chars of the contents of your SRC field to your dest field. Anything beyond this will be
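The schema snippet that accompanied this reply did not survive in the archive. As a minimal sketch of what such a rule might look like, assuming a source field named SRC and a destination field named dest as mentioned in the message, the Python below renders a copyField element limited to 500 characters. The helper name copy_field_xml is hypothetical; source, dest, and maxChars are the standard copyField attributes in schema.xml.

```python
import xml.etree.ElementTree as ET


def copy_field_xml(source: str, dest: str, max_chars: int) -> str:
    """Render a schema.xml <copyField> rule that copies at most max_chars characters."""
    element = ET.Element(
        "copyField",
        {"source": source, "dest": dest, "maxChars": str(max_chars)},
    )
    # encoding="unicode" returns a str instead of bytes.
    return ET.tostring(element, encoding="unicode")


# Field names SRC and dest are taken from the message; 500 is the limit it mentions.
print(copy_field_xml("SRC", "dest", 500))
```

The destination field would typically be declared as stored but not indexed, since it is only used to return a shortened copy of the text.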

Do all SolrCloud nodes communicate with the database when indexing a collection?

2016-02-17 Thread Colin Freas
I just set up a SolrCloud instance with 2 Solr nodes & another machine running zookeeper. I’ve imported 200M records from a SQL Server database, and those records are split nicely between the 2 nodes. Everything seems ok. I did the data import via the admin ui. It took not quite 8 hours,

Get one fragment of text of field

2016-02-17 Thread Anil
Hi, we have around 30 fields in a Solr document, and we search for text in all fields (by creating a record field with copy fields). A few fields have huge text, on the order of MBs. How can I get only a fragment of those fields in a configurable way? We have to display each field's content on the UI, so its must

Re: Best practices for Solr (how to update jar files safely)

2016-02-17 Thread Shawn Heisey
On 2/17/2016 10:38 PM, Brian Wright wrote: > We have a new project to use Solr. Our Solr instance will use Jetty > rather than Tomcat. We plan to extend the Solr core system by adding > additional classes (jar files) to the > /opt/solr/server/solr-webapp/webapp/WEB-INF/lib directory to extend >

Re: Display entire string containing query string

2016-02-17 Thread Binoy Dalal
Append = On Thu, 18 Feb 2016, 11:35 Tom Running wrote: > Hello, > > I am working on a project using Solr to search data retrieved from > Nutch. > > I have successfully integrated Nutch with Solr, and Solr is able to search > Nutch's data. > > However I am having a

Display entire string containing query string

2016-02-17 Thread Tom Running
Hello, I am working on a project using Solr to search data retrieved from Nutch. I have successfully integrated Nutch with Solr, and Solr is able to search Nutch's data. However I am having a bit of a problem. If I query Solr, it will bring back the numfound and which document the query

Best practices for Solr (how to update jar files safely)

2016-02-17 Thread Brian Wright
Hello, We have a new project to use Solr. Our Solr instance will use Jetty rather than Tomcat. We plan to extend the Solr core system by adding additional classes (jar files) to the /opt/solr/server/solr-webapp/webapp/WEB-INF/lib directory to extend features. We also plan to run two

Re: Highlight brings the content from the first pages of pdf

2016-02-17 Thread Anil
Thanks Philippe. I am using hl.fl=*. When a field is available in the highlight section, is it possible to skip that field in the main response? Please clarify. Regards, Anil On 18 February 2016 at 08:42, Philippe Soares wrote: > You can put fields that you want to

Re: Highlight brings the content from the first pages of pdf

2016-02-17 Thread Philippe Soares
You can put fields that you want to retrieve without highlighting in the "fl" parameter, and the large fields in the "hl.fl" parameter. Those will go in the highlight section only. It may also be a good idea to add hl.requireFieldMatch=true. E.g.: fl=id&hl=true&hl.fl=field1,field2&hl.requireFieldMatch=true Note that you
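As a rough sketch of the request described in this reply, the Python below assembles the query string with the standard library. The field names id, field1, and field2 come from the message; the search term "solr" and the helper name highlight_params are placeholders, and hl.requireFieldMatch is the parameter spelling used in the Solr highlighting documentation.

```python
from urllib.parse import urlencode


def highlight_params(query, plain_fields, highlight_fields):
    """Build a query string that returns small fields normally and large fields only as highlights."""
    params = [
        ("q", query),
        ("fl", ",".join(plain_fields)),         # fields returned in the main response
        ("hl", "true"),                         # turn highlighting on
        ("hl.fl", ",".join(highlight_fields)),  # fields returned only as highlight snippets
        ("hl.requireFieldMatch", "true"),       # highlight a field only if the query matched it
    ]
    # Keep commas readable instead of percent-encoding them.
    return urlencode(params, safe=",")


print(highlight_params("solr", ["id"], ["field1", "field2"]))
```

The resulting string would be appended to the select handler URL of the collection being queried.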

Re: Highlight brings the content from the first pages of pdf

2016-02-17 Thread Anil
Thanks Binoy. But this may not help my use case. I am storing and indexing huge documents in Solr. When no search text matches a field's text, that field of the document should be skipped. When a match exists, it should be part of the highlight section. fl may not be the right option in my case. Any

Re: Running solr as a service vs. Running it as a process

2016-02-17 Thread Binoy Dalal
That's a bummer. Anyhow I'll give it a shot and update this thread if I get anywhere. Thanks for your help. On Thu, 18 Feb 2016, 04:30 Shawn Heisey wrote: > On 2/17/2016 11:37 AM, Binoy Dalal wrote: > > At my project, we aren't that big on directory and user set up but the

Re: Retrieving 1000 records at a time

2016-02-17 Thread Shawn Heisey
On 2/17/2016 3:49 PM, Mark Robinson wrote: > I have around 121 fields out of which 12 of them are indexed and almost all > 121 are stored. > Average size of a doc is 10KB. > > I was checking for start=0, rows=1000. > We were querying a Solr instance which was on another server and I think >

Hitting complex multilevel pivot queries in solr

2016-02-17 Thread Lewin Joy (TMS)
Hi, Is there an efficient way to hit solr for complex time consuming queries? I have a requirement where I need to pivot on 4 fields. Two fields contain facet values close to 50. And the other 2 fields have 5000 and 8000 values. Pivoting on the 4 fields would crash the server. Is there a

Re: Running solr as a service vs. Running it as a process

2016-02-17 Thread Shawn Heisey
On 2/17/2016 11:37 AM, Binoy Dalal wrote: > At my project, we aren't that big on directory and user set up but the fact > that services can be started and stopped automatically on server reboots > and ensuring single running copies of the service is of significance. > Now currently we are running

Re: Errors on master after upgrading to 4.10.3

2016-02-17 Thread Joseph Hagerty
Ahh, makes sense. I did have a feeling I was barking up the wrong tree since it's an Extraction issue, but I thought I'd throw it out there, anyway. Thanks so much for the information! On Wed, Feb 17, 2016 at 4:49 PM, Rachel Lynn Underwood < r.lynn.underw...@gmail.com> wrote: > This is an error

Re: Retrieving 1000 records at a time

2016-02-17 Thread Mark Robinson
Thanks Joel and Chris! I have around 121 fields out of which 12 of them are indexed and almost all 121 are stored. Average size of a doc is 10KB. I was checking for start=0, rows=1000. We were querying a Solr instance which was on another server and I think network lag might have come into the

Re: Retrieving 1000 records at a time

2016-02-17 Thread Chris Hostetter
: I have a requirement where I need to retrieve 1 to 15000 records at a : time from SOLR. : With 20 or 100 records everything happens in milliseconds. : When it goes to 1000, 1 it is taking more time... like even 30 seconds. so far all you've really told us about your setup is that some

Re: Errors on master after upgrading to 4.10.3

2016-02-17 Thread Rachel Lynn Underwood
This is an error being thrown by Apache PDFBox/Tika. You're seeing it now because Solr 4.x uses a different Tika version than Solr 3.x. It looks like this error is thrown when you parse a PDF with Tika, and a font in that PDF doesn't have a ToUnicode mapping.

Re: Negating multiple array fileds

2016-02-17 Thread Jack Krupansky
I actually thought seriously about whether to mention wildcard vs. range, but... it annoys me that the Lucene and query parser folks won't fix either PrefixQuery or the query parsers to do the right/optimal thing for single-asterisk query. I wrote up a Jira for it years ago, but for whatever

Re: Zookeeper upconfig files to upload big config files

2016-02-17 Thread Shawn Heisey
On 2/17/2016 1:04 PM, Aswath Srinivasan (TMS) wrote: > I’m trying to upconfig my config files and my synonyms.txt file is > about 2 MB. Whenever I try to do this, I get the following exception. > It’s either a “broken pipe” exception or the following exception. Any > advice for me to fix it? The

Re: Adding nodes

2016-02-17 Thread Jeff Wartes
SolrCloud does not come with any autoscaling functionality. If you want such a thing, you’ll need to write it yourself. https://github.com/whitepages/solrcloud_manager might be a useful head start though, particularly the “fill” and “cleancollection” commands. I don’t do *auto* scaling, but I

Zookeeper upconfig files to upload big config files

2016-02-17 Thread Aswath Srinivasan (TMS)
Hi fellow Solr developers, I'm trying to upconfig my config files and my synonyms.txt file is about 2 MB. Whenever I try to do this, I get the following exception. It's either a "broken pipe" exception or the following exception. Any advice for me to fix it? If I remove most of the synonym

Re: Retrieving 1000 records at a time

2016-02-17 Thread Joel Bernstein
Also, are you ranking documents by score? Joel Bernstein http://joelsolr.blogspot.com/ On Wed, Feb 17, 2016 at 1:59 PM, Joel Bernstein wrote: > A few questions for you: What types of fields and how many fields will you > be retrieving? What version of Solr are you using? > >

Re: Retrieving 1000 records at a time

2016-02-17 Thread Joel Bernstein
A few questions for you: What types of fields and how many fields will you be retrieving? What version of Solr are you using? Joel Bernstein http://joelsolr.blogspot.com/ On Wed, Feb 17, 2016 at 1:37 PM, Mark Robinson wrote: > Hi, > > I have a requirement where I need

SolrCloud sync issues under server failure

2016-02-17 Thread Håkon Hitland
Hi, We have been testing an installation of SolrCloud under some failure scenarios, and are seeing some issues we would like to fix before putting this into production. Our cluster is 6 servers running Solr 5.4.1, with config stored in our Zookeeper cluster. Our cores currently each have a

Re: Running solr as a service vs. Running it as a process

2016-02-17 Thread Binoy Dalal
Hi Dan, At my project, we aren't that big on directory and user set up but the fact that services can be started and stopped automatically on server reboots and ensuring single running copies of the service is of significance. Now currently we are running Solr 4.4 but pretty soon we're going to

Retrieving 1000 records at a time

2016-02-17 Thread Mark Robinson
Hi, I have a requirement where I need to retrieve 1 to 15000 records at a time from SOLR. With 20 or 100 records everything happens in milliseconds. When it goes to 1000, 1 it is taking more time... like even 30 seconds. Will Solr be able to return 1 records at a time in less than

Re: Negating multiple array fileds

2016-02-17 Thread Shawn Heisey
On 2/17/2016 12:34 AM, Salman Ansari wrote: > 2) "Behind the scenes, Solr will interpret this as "all possible values for > field" --which sounds like it would be exactly what you're looking for, > except that if there are ten million possible values in the field > you're searching, > the

Re: Running solr as a service vs. Running it as a process

2016-02-17 Thread Susheel Kumar
In addition, you get other advantages: you can start, stop, or restart Solr using "service solr stop|start|restart" as mentioned above, so you don't need to launch the solr script directly. Also, the install script takes care of setting up Solr nicely for a production environment. Even you

Re: join and NOT together

2016-02-17 Thread Mikhail Khludnev
Sergio, Please provide more debug output; I want to see how the query was parsed. On Tue, Feb 16, 2016 at 1:20 PM, Sergio García Maroto wrote: > My debugQuery=true returns related to the NOT: > > 0.06755901 = (MATCH) sum of: 0.06755901 = (MATCH) MatchAllDocsQuery, > product of:

Re: Negating multiple array fileds

2016-02-17 Thread Salman Ansari
Thanks, Shawn, for explaining in detail. Regarding the performance issue you mentioned, there are 2 points: 1) "The [* TO *] syntax is an all-inclusive range query, which will usually be much faster than a wildcard query." I will take your statement for granted and leave room for people to
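To make the two query forms discussed in this thread concrete, here is a small sketch that builds them as plain strings. The field names tags and authors and all three helper names are hypothetical; only the field:[* TO *] and field:* syntaxes come from the discussion, and the leading *:* is the usual match-all clause that the negated clauses subtract from.

```python
def has_value_range(field: str) -> str:
    """All-inclusive range query: matches documents with any value in the field."""
    return f"{field}:[* TO *]"


def has_value_wildcard(field: str) -> str:
    """Wildcard form of the same test, which the thread describes as usually slower."""
    return f"{field}:*"


def missing_all(fields) -> str:
    """Match documents that have no value in any of the given fields."""
    # Start from every document, then exclude those with a value in each field.
    negations = " ".join(f"-{has_value_range(name)}" for name in fields)
    return f"*:* {negations}"


print(has_value_range("tags"))
print(has_value_wildcard("tags"))
print(missing_all(["tags", "authors"]))
```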

Re: SOLR ranking

2016-02-17 Thread Nitin.K
Hi Binoy, We are searching for both phrases and individual words, but we want the documents that contain the phrases to come first in the order, and then the individual app. termPositions = true is also not working in my case. I have also removed the string type from copy fields.

Facet count with expand and collapse

2016-02-17 Thread Anil
Hi, will there be any change in the facet count in the case of expand and collapse? Please clarify. Regards, Anil

RE: Running solr as a service vs. Running it as a process

2016-02-17 Thread Davis, Daniel (NIH/NLM) [C]
So, running solr as a service also runs it as a process. In typical Linux environments, (based on initscripts), a service is a process installed to meet additional considerations: - Putting logs in predictable places where system operators and administrators expect to see logs - /var/logs -

Null Pointer Exception on distributed search

2016-02-17 Thread Lokesh Chhaparwal
Hi, We are facing NPE while using distributed search (Solr version 4.7.2) (using *shards* parameter in solr query) Exception Trace: ERROR - 2016-02-17 16:44:26.616; org.apache.solr.common.SolrException; null:java.lang.NullPointerException at

Running solr as a service vs. Running it as a process

2016-02-17 Thread Binoy Dalal
Hello everyone, I've read about running solr as a service but I don't understand what it really means. I went through the "Taking solr to production" documentation on the wiki which suggests that solr be installed using the script provided and run as a service. From what I could glean, the