Re: Highlight brings the content from the first pages of pdf

2016-02-17 Thread Anil
Thanks Philippe. I am using hl.fl=*. When a field is available in the highlight section, is it possible to skip that field in the main response? Please clarify. Regards, Anil On 18 February 2016 at 08:42, Philippe Soares wrote: > You can put fields that you want to

Re: Highlight brings the content from the first pages of pdf

2016-02-17 Thread Philippe Soares
You can put the fields that you want to retrieve without highlighting in the "fl" parameter, and the large fields in the "hl.fl" parameter. Those will go in the highlight section only. It may also be a good idea to add hl.requireFieldMatch=true. E.g.: fl=id&hl=true&hl.fl=field1,field2&hl.requireFieldMatch=true Note that you
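
A minimal sketch of the fl / hl.fl split described in this message, written in Python purely for illustration. The endpoint URL, the core name, and the helper name build_highlight_query are assumptions, not something from the thread; field1 and field2 are the placeholder field names used in the message.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port, and core name as needed.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_highlight_query(query, return_fields, highlight_fields):
    """Build a select URL that returns small fields via fl and
    large fields only as highlight snippets via hl.fl."""
    params = {
        "q": query,
        "fl": ",".join(return_fields),        # fields returned in the main response
        "hl": "true",                         # turn highlighting on
        "hl.fl": ",".join(highlight_fields),  # fields returned only as snippets
        "hl.requireFieldMatch": "true",       # highlight only fields the query matched
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_highlight_query("nietava", ["id"], ["field1", "field2"])
print(url)
```

Because the large fields are left out of fl, they should appear only in the highlighting section of the response, and only when they contain a match.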

Re: Highlight brings the content from the first pages of pdf

2016-02-17 Thread Anil
Thanks Binoy. But this may not help my use case. I am storing and indexing huge documents in Solr. When no search text matches that field's text, I should skip that field of the document. When a match exists, it should be part of the highlight section. fl may not be the right option in my case. Any

Re: Highlight brings the content from the first pages of pdf

2016-02-16 Thread Binoy Dalal
Yeah. Under an entry like so: <str name="fl">fields</str> On Tue, 16 Feb 2016, 13:00 Anil wrote: > you mean default fl ? > > On 16 February 2016 at 12:57, Binoy Dalal wrote: > > > Oh wait. We don't append the fl parameter to the query. > > We've configured it in the

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Anil
You mean the default fl? On 16 February 2016 at 12:57, Binoy Dalal wrote: > Oh wait. We don't append the fl parameter to the query. > We've configured it in the request handler in solrconfig.xml > Maybe that is something that you can do. > > On Tue, 16 Feb 2016, 12:39 Anil

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Binoy Dalal
Oh wait. We don't append the fl parameter to the query. We've configured it in the request handler in solrconfig.xml Maybe that is something that you can do. On Tue, 16 Feb 2016, 12:39 Anil wrote: > Thanks for your response Binoy. > > Yes.I am looking for any alternative to

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Anil
Thanks for your response Binoy. Yes. I am looking for any alternative to this. With a large number of fields, the URL will become long and might lead to a "url too long exception" when using an HTTP request. On 16 February 2016 at 11:01, Binoy Dalal wrote: > Filling in the fl

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Binoy Dalal
Filling in the fl parameter with all the required fields is what we do at my project as well, and I don't think there is any alternative to this. Maybe somebody else can advise on this? On Tue, 16 Feb 2016, 10:30 Anil wrote: > Any help on this ? Thanks. > > On 15 February

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Anil
Any help on this? Thanks. On 15 February 2016 at 19:06, Anil wrote: > Yes. But I have a long list of fields. > > I feel adding all the fields in fl is not good practice unless one is > interested in only a few fields. In my case, I am interested in all fields except > one. > > is

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Anil
Yes. But I have a long list of fields. I feel adding all the fields in fl is not good practice unless one is interested in only a few fields. In my case, I am interested in all fields except one. Is there any alternative approach? Thanks in advance. On 15 February 2016 at 17:27, Binoy Dalal

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Binoy Dalal
If I understand correctly, you have already highlighted the field and only want to return the highlights and not the field itself. Well in that case, simply remove the field name from your fl list. On Mon, 15 Feb 2016, 17:04 Anil wrote: > HOw can highlighted field excluded

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Anil
How can a highlighted field be excluded from the main result, given that it is available in the highlight section? In my scenario, one field (let's say commands) of each Solr document would be around 10 MB. I don't want to fetch that field in the response when its highlight snippets are available in the response.

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Evert R.
Hello Mark, Thanks for your reply. All text is indexed (1 pdf file). It works now. Best regards, *--Evert* 2016-02-14 23:47 GMT-02:00 Mark Ehle : > is all the text being indexed? Check to make sure that there's actually the > data you are looking for in the index. Is there

Re: Highlight brings the content from the first pages of pdf

2016-02-15 Thread Evert R.
Binoy, Thank you very much for your reply and explanation. Best regards, *--Evert* 2016-02-14 23:28 GMT-02:00 Binoy Dalal : > What you've done so far will highlight every instance of "nietava" found in > the field, and return it, i.e., your entire field will return

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Mark Ehle
Is all the text being indexed? Check to make sure that there's actually the data you are looking for in the index. Is there a setting in Tika that limits how much is indexed? I seem to remember confronting this problem myself once, and the data that I wanted just wasn't in the index because it was

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Binoy Dalal
What you've done so far will highlight every instance of "nietava" found in the field, and return it, i.e., your entire field will return with all the "nietava"s in tags. If you do not want the entire field, only portions of your field containing the matched terms, then use hl.snippets parameter
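
As a rough illustration of the hl.snippets suggestion, the Python sketch below builds request parameters that ask for several snippets per field and then reads them back out of the "highlighting" section of a JSON response. The helper names and the sample response are hypothetical; only the parameter names and the shape of the highlighting section come from Solr itself.

```python
def snippet_params(query, field, max_snippets):
    """Request parameters asking for up to max_snippets fragments per field."""
    return {
        "q": query,
        "hl": "true",
        "hl.fl": field,
        "hl.snippets": str(max_snippets),  # Solr returns only one snippet by default
    }


def collect_snippets(response, field):
    """Map each document id to the list of highlight snippets for one field."""
    highlighting = response.get("highlighting", {})
    return {doc_id: fields.get(field, []) for doc_id, fields in highlighting.items()}


# Hypothetical response, shaped like Solr's JSON output, for a document
# in which "nietava" occurs three times.
sample_response = {
    "highlighting": {
        "file_1.pdf": {
            "content": [
                "first <em>nietava</em> here",
                "second <em>nietava</em> here",
                "third <em>nietava</em> here",
            ]
        }
    }
}

params = snippet_params("nietava", "content", 3)
snippets = collect_snippets(sample_response, "content")
print(len(snippets["file_1.pdf"]))
```

With hl.snippets left at its default, only the first fragment would be expected back, which matches the behaviour asked about elsewhere in this thread.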

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Evert R.
Binoy, You are the man! =) Thank you very much! Would you by chance know how I could get the second highlight of the same word in the same file? Like: file_1.pdf (has three words "nietava") so..., how can I bring the highlights for the three occurrences? I am pretty new around, should I send

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Evert R.
Hi Binoy, thanks! Still not working, check the output: { "responseHeader":{ "status":0, "QTime":58, "params":{ "q":"nietava", "hl":"true", "hl.simple.post":"", "indent":"true", "fl":"id", "hl.flagsize":"0", "hl.fl":"content",

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Binoy Dalal
Are you sure you've typed in the parameters correctly? In your response it says flagsize instead of fragsize and maxanalzyedchars instead of maxanalyzedchars. Ohh wait, I see that I made the analyzed typo. Awfully sorry for that, I'm using my phone to send the mail out. On Sun, 14 Feb 2016,

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Binoy Dalal
From the Solr wiki: hl.maxAnalyzedChars: How many characters into a document to look for suitable snippets (Solr 1.3). This parameter makes sense for the original Highlighter only. The default value is "51200". You can assign a large value to this parameter and use hl.fragsize=0 to return
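
To make the effect of hl.maxAnalyzedChars concrete, here is a toy Python model (not Solr's actual implementation) of why a match late in a long document yields no highlight under the default limit. The function names and the simulated document are assumptions made up for this sketch; the default of 51200 is the value quoted from the wiki above, and 409600 is the value tried elsewhere in this thread.

```python
DEFAULT_MAX_ANALYZED_CHARS = 51200  # default quoted from the wiki text above


def term_in_analyzed_window(text, term, max_analyzed_chars=DEFAULT_MAX_ANALYZED_CHARS):
    """Toy model: True if the first occurrence of term starts within
    the first max_analyzed_chars characters of text."""
    position = text.find(term)
    return position != -1 and position < max_analyzed_chars


def highlight_params(query, field, max_analyzed_chars):
    """Request parameters that raise the analysis limit for one query."""
    return {
        "q": query,
        "hl": "true",
        "hl.fl": field,
        "hl.maxAnalyzedChars": str(max_analyzed_chars),
        "hl.fragsize": "0",  # the setting the wiki text pairs with a large limit
    }


# Simulated long document: the term appears after 60,000 characters of filler.
document = "x" * 60000 + " nietava"

print(term_in_analyzed_window(document, "nietava"))          # False
print(term_in_analyzed_window(document, "nietava", 409600))  # True
print(highlight_params("nietava", "content", 409600))
```

In this model the term on "page 15" lies beyond the first 51200 characters, so it is never considered for a snippet until the limit is raised.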

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Evert R.
Hi Binoy, I could not find this option in my solrconfig.xml file. I tried to add this setting and nothing changed... Here is the code, I might have misplaced it: 400 409600 200

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Binoy Dalal
Don't add this parameter to the searchComponent definition, because the components where you've added it, GapFragmenter and RegexFragmenter, simply don't use it. Instead, add it to your request handler (/select etc.) if you've configured highlighting in the handler or append it to your query: *=*.

Highlight brings the content from the first pages of pdf

2016-02-14 Thread Evert R.
Hi there, I have a situation where I started the techproducts example, without any modification, and posted a pdf file. When searching as: q=text:search_word hl=true hl.fl=content It shows the highlight accordingly! =) BUT... *if the "search_word" is after the first pages* in my pdf file, such as page 15... It

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Evert R.
Hi Paul, Sorry for my late reply. All the content is inside the docs. It brings the docs and the pdf file that has the search word in it. But the highlight is not showing if the search word is after a few pages. Evert *--Evert* 2016-02-14 8:36 GMT-02:00 Paul Libbrecht : > This

Re: Highlight brings the content from the first pages of pdf

2016-02-14 Thread Paul Libbrecht
This looks like the stored content is shortened. Can it be? Can you see that inside the docs? paul > Evert R. > 14 February 2016 at 11:26 > Hi There, > > I have a situation where started a techproducts, without any modification, > post a pdf file. When searching