Re: Solr NLS custom query parser

2017-06-15 Thread Michael Kuhlmann
Hi Arun,

Your question is too generic. What do you mean by NLP search? What do
you expect to happen?

The short answer is: No, there is no such parser because the individual
requirements will vary a lot.

-Michael

On 14.06.2017 at 16:32, aruninfo100 wrote:
> Hi,
>
> I am trying to configure NLP search with Solr, using OpenNLP. I am able to
> index the documents and extract named entities and POS tags using the
> OpenNLP-UIMA support and also a UIMA update request processor chain, but I
> am not able to write a query parser for this. Is there an existing query
> parser that supports these features (NLP search)?
>
> Thanks and Regards,
> Arun
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-NLS-custom-query-parser-tp4340511.html
> Sent from the Solr - User mailing list archive at Nabble.com.




Re: Solr NLS custom query parser

2017-06-15 Thread aruninfo100
Hi Michael,

I indexed the documents as follows: I used OpenNLP to extract named
entities and POS tags, and indexed this data into the respective fields. My
understanding from what I have read is that, for natural language search
with Solr, once the entities are extracted the next step is to create a
custom query parser that takes advantage of the entity fields.
I referred to these slides and the talk to do the same:
https://www.slideshare.net/lucenerevolution/teofilie-natural-languagesearchinsolreurocon2011

Thanks and Regards,
Arun



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-NLS-custom-query-parser-tp4340511p4340679.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Multiple hashJoin or innerJoin

2017-06-15 Thread Joel Bernstein
It looks like you are running into this bug:
https://issues.apache.org/jira/browse/SOLR-10512. This has not been resolved
yet, but I believe there is a workaround described in the ticket.

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Jun 14, 2017 at 10:09 PM, Zheng Lin Edwin Yeo 
wrote:

> I have found that this is possible, but currently I have problems if the
> field name to join in all the 3 collections are different.
>
> For example, if in "people" collection, it is called personId, and in
> "pets" collection, it is called petsId. But in "collectionId", it is called
> collectionName, but it won't work when I place it this way below. Any
> suggestions on how I can handle this?
>
> innerJoin(innerJoin(
>   search(people, q=*:*, fl="personId,name", sort="personId asc"),
>   search(pets, q=type:cat, fl="pertsId,petName", sort="personId asc"),
>   on="personId=petsId"
> ),
>   search(collection1, q=*:*, fl="collectionId,collectionName",
> sort="personId asc"),
> )on="personId=collectionId"
>
>
> Regards,
> Edwin
>
> On 14 June 2017 at 23:13, Zheng Lin Edwin Yeo 
> wrote:
>
> > Hi,
> >
> > I'm using Solr 6.5.1.
> >
> > Is it possible to have multiple hashJoin or innerJoin in the query?
> >
> > An example will be something like this for innerJoin:
> >
> > innerJoin(innerJoin(
> >   search(people, q=*:*, fl="personId,name", sort="personId asc"),
> >   search(pets, q=type:cat, fl="personId,petName", sort="personId asc"),
> >   on="personId"
> > ),
> >   search(collection1, q=*:*, fl="personId,personName", sort="personId
> > asc"),
> > )
> >
> > Regards,
> > Edwin
> >
>


cursor with sort value along with unique key

2017-06-15 Thread Preeti Chhabra

Hello,

With respect to cursors, using a computed sort value (like score)
in combination with the unique field sort (score desc, id asc) seems to
cause wildly inconsistent and incomplete results.  Can I get any
help with this?



Thanks & Regards

Preeti chhabra



Phrase Exact Match with Margin of Error

2017-06-15 Thread Max Bridgewater
Hi,

I am trying to do phrase exact match. For this, I use
KeywordTokenizerFactory. This basically does what I want to do. My field
type is defined as follows:


  


  
  


  



In addition to this, I want to tolerate typos of two or three letters. I
thought fuzzy search could allow me to accept this margin of error. But
this doesn't seem to work.

A typical query I would have is:

q=subjet:"Bridge the gap between your skills and your goals"

Now, in this query, if I replace gap with gat, I was hoping I could do
something such as:

q=subjet:"Bridge the gat between your skills and your goals"~0.8

But this doesn't quite do what I am trying to achieve.

Any suggestion?


Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread Susheel Kumar
The ComplexPhraseQuery parser is what you need to look at:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser
See the example below.

http://localhost:8983/solr/techproducts/select?debugQuery=on&indent=on&q=manu:%22Bridge%20the%20gat~1%20between%20your%20skills%20and%20your%20goals%22&defType=complexphrase

On Thu, Jun 15, 2017 at 5:59 AM, Max Bridgewater 
wrote:

> Hi,
>
> I am trying to do phrase exact match. For this, I use
> KeywordTokenizerFactory. This basically does what I want to do. My field
> type is defined as follows:
>
>  positionIncrementGap="100">
>   
> 
> 
>   
>   
> 
> 
>   
> 
>
>
> In addition to this, I want to tolerate typos of two or three letters. I
> thought fuzzy search could allow me to accept this margin of error. But
> this doesn't seem to work.
>
> A typical query I would have is:
>
> q=subjet:"Bridge the gap between your skills and your goals"
>
> Now, in this query, if I replace gap with gat, I was hoping I could do
> something such as:
>
> q=subjet:"Bridge the gat between your skills and your goals"~0.8
>
> But this doesn't quite do what I am trying to achieve.
>
> Any suggestion?
>


Re: cursor with sort value along with unique key

2017-06-15 Thread Mikhail Khludnev
Hello,
http://lucene.472066.n3.nabble.com/Pagination-bug-when-sorting-by-a-field-not-unique-field-tp4327408p4327524.html
might be relevant.
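
For readers following this thread, a minimal sketch of a cursor walk may help frame the question. It assumes the usual cursorMark contract: the sort ends with the unique key, the first request sends cursorMark=*, and the walk stops when Solr returns the same mark that was sent. The `fetch` callable and the function name are hypothetical stand-ins for whatever HTTP client is in use; they are not part of Solr.

```python
def walk_cursor(fetch, q="*:*", sort="score desc,id asc", rows=100):
    """Yield documents page by page using Solr's cursorMark parameter.

    `fetch` is a hypothetical callable: it takes a dict of request
    parameters and returns the decoded JSON response as a dict.
    """
    cursor = "*"  # "*" starts a new cursor
    while True:
        params = {"q": q, "sort": sort, "rows": rows, "cursorMark": cursor}
        resp = fetch(params)
        yield from resp["response"]["docs"]
        next_cursor = resp["nextCursorMark"]
        if next_cursor == cursor:
            # Solr repeats the mark once there is nothing left to return.
            break
        cursor = next_cursor
```

The sort and query must stay identical across all requests of one walk; only cursorMark changes.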

On Thu, Jun 15, 2017 at 12:40 PM, Preeti Chhabra <
preeti.chha...@karexpert.com> wrote:

> Hello,
>
> With respects to cursors, when using a computed sort value (like score)
> in combination with the unique field sort (score desc, id asc) seems to
> cause some wildly inconsistent and incomplete results.  Can I get any help
> out of it?
>
>
> Thanks & Regards
>
> Preeti chhabra
>
>


-- 
Sincerely yours
Mikhail Khludnev


Re: CSV output

2017-06-15 Thread Erik Hatcher
Is it the proxy affecting the output? What do you get going directly to 
Solr's endpoint?

   Erik

> On Jun 14, 2017, at 22:13, Phil Scadden  wrote:
> 
> If I try
> /getsolr? 
> fl=id,title,datasource,score=true=9000=unified=Wainui-1=AND=csv
> 
> The response I get is:
> id,title,datasource,scoreW:\PR_Reports\OCR\PR869.pdf,,Petroleum 
> Reports,8.233313W:\PR_Reports\OCR\PR3440.pdf,,Petroleum 
> Reports,8.217836W:\PR_Reports\OCR\PR4313.pdf,,Petroleum 
> Reports,8.206703W:\PR_Reports\OCR\PR3906.pdf,,Petroleum 
> Reports,8.185147W:\PR_Reports\OCR\PR1592.pdf,,Petroleum 
> Reports,8.167614W:\PR_Reports\OCR\PR998.pdf,,Petroleum 
> Reports,8.161142W:\PR_Reports\OCR\PR2457.pdf,,Petroleum 
> Reports,8.155497W:\PR_Reports\OCR\PR2433.pdf,,Petroleum 
> Reports,8.152924W:\PR_Reports\OCR\PR1184.pdf,,Petroleum 
> Reports,8.124402W:\PR_Reports\OCR\PR3551.pdf,,Petroleum Reports,8.124402
> 
> i.e. no newline separators at all (Solr 6.5.1). (/getsolr is an API that 
> proxies to the Solr server.)
> Changing it to
> /getsolr?csv.newline=%0A=id,title,datasource,score=true=9000=unified=Wainui-1=AND=csv
> 
> makes no difference. What am I doing wrong here? Is there another way to 
> specify CSV parameters? It says the default is \n, but I am not seeing that.
> 
> Notice: This email and any attachments are confidential and may not be used, 
> published or redistributed without the prior written consent of the Institute 
> of Geological and Nuclear Sciences Limited (GNS Science). If received in 
> error please destroy and immediately notify GNS Science. Do not copy or 
> disclose the contents.


Re: Multiple hashJoin or innerJoin

2017-06-15 Thread Zheng Lin Edwin Yeo
Hi Joel,

Yes, I got this error:

{"result-set":{"docs":[{"EXCEPTION":"Invalid JoinStream - all incoming
stream comparators (sort) must be a superset of this stream's
equalitor.","EOF":true}]}}


Ok, will try out the work around first.
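
The error message above says each incoming stream's sort must cover the field it is joined on. A minimal sketch of that rule for a single two-way join, written as a small Python helper that only renders the expression text: the helper names are hypothetical, the field names are taken from the thread, and whether this alone avoids SOLR-10512 for nested joins is not established here.

```python
def search(collection, q, fl, sort_key):
    """Render a search() stream sorted ascending on the given key."""
    return f'search({collection}, q={q}, fl="{fl}", sort="{sort_key} asc")'


def inner_join(left, right, left_key, right_key):
    """Render innerJoin(); use the left=right form when key names differ."""
    on = left_key if left_key == right_key else f"{left_key}={right_key}"
    return f'innerJoin({left}, {right}, on="{on}")'


# Each stream is sorted on its own join key, and that key is in its fl list.
people = search("people", "*:*", "personId,name", "personId")
pets = search("pets", "type:cat", "petsId,petName", "petsId")
expr = inner_join(people, pets, "personId", "petsId")
print(expr)
```

Note that in the expression quoted below, the pets stream is sorted on personId while being joined on petsId, which is the kind of mismatch the exception describes.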

Regards,
Edwin


On 15 June 2017 at 20:16, Joel Bernstein  wrote:

> It looks like you are running into this bug:
> https://issues.apache.org/jira/browse/SOLR-10512. This not been resolved
> yet, but I believe there is a work around which is described in the ticket.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Wed, Jun 14, 2017 at 10:09 PM, Zheng Lin Edwin Yeo <
> edwinye...@gmail.com>
> wrote:
>
> > I have found that this is possible, but currently I have problems if the
> > field name to join in all the 3 collections are different.
> >
> > For example, if in "people" collection, it is called personId, and in
> > "pets" collection, it is called petsId. But in "collectionId", it is
> called
> > collectionName, but it won't work when I place it this way below. Any
> > suggestions on how I can handle this?
> >
> > innerJoin(innerJoin(
> >   search(people, q=*:*, fl="personId,name", sort="personId asc"),
> >   search(pets, q=type:cat, fl="pertsId,petName", sort="personId asc"),
> >   on="personId=petsId"
> > ),
> >   search(collection1, q=*:*, fl="collectionId,collectionName",
> > sort="personId asc"),
> > )on="personId=collectionId"
> >
> >
> > Regards,
> > Edwin
> >
> > On 14 June 2017 at 23:13, Zheng Lin Edwin Yeo 
> > wrote:
> >
> > > Hi,
> > >
> > > I'm using Solr 6.5.1.
> > >
> > > Is it possible to have multiple hashJoin or innerJoin in the query?
> > >
> > > An example will be something like this for innerJoin:
> > >
> > > innerJoin(innerJoin(
> > >   search(people, q=*:*, fl="personId,name", sort="personId asc"),
> > >   search(pets, q=type:cat, fl="personId,petName", sort="personId asc"),
> > >   on="personId"
> > > ),
> > >   search(collection1, q=*:*, fl="personId,personName", sort="personId
> > > asc"),
> > > )
> > >
> > > Regards,
> > > Edwin
> > >
> >
>


Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread Susheel Kumar
Agreed, that's the challenge. The ComplexPhraseQuery parser needs the terms
to be analyzed/tokenized; if they aren't, it can't really operate on individual
tokens with fuzzy or wildcard matches.  The solution I can think of is to
execute the query against both the KeywordTokenized field and a
non-KeywordTokenized field, and then boost the KeywordTokenized field
higher...

On Thu, Jun 15, 2017 at 1:20 PM, Max Bridgewater 
wrote:

> Thanks Susheel. The challenge is that if I search for the word "between"
> alone, I still get plenty of results. In a way I want the query to  match
> the document title exactly (up to a few characters) and the document title
> match the query exactly (up to a few characters). KeywordTokenizer allows
> that. But complexphrase does not seem to work with KeywordTokenizer.
>
> On Thu, Jun 15, 2017 at 10:23 AM, Susheel Kumar 
> wrote:
>
> > CompledPhraseQuery parser is what you need to look
> > https://cwiki.apache.org/confluence/display/solr/Other+
> > Parsers#OtherParsers-ComplexPhraseQueryParser.
> > See below for e.g.
> >
> >
> >
> > http://localhost:8983/solr/techproducts/select?
> debugQuery=on=on=
> > manu:%22Bridge%20the%20gat~1%20between%20your%20skills%
> > 20and%20your%20goals%22=complexphrase
> >
> > On Thu, Jun 15, 2017 at 5:59 AM, Max Bridgewater <
> > max.bridgewa...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I am trying to do phrase exact match. For this, I use
> > > KeywordTokenizerFactory. This basically does what I want to do. My
> field
> > > type is defined as follows:
> > >
> > >  > > positionIncrementGap="100">
> > >   
> > > 
> > > 
> > >   
> > >   
> > > 
> > > 
> > >   
> > > 
> > >
> > >
> > > In addition to this, I want to tolerate typos of two or three letters.
> I
> > > thought fuzzy search could allow me to accept this margin of error. But
> > > this doesn't seem to work.
> > >
> > > A typical query I would have is:
> > >
> > > q=subjet:"Bridge the gap between your skills and your goals"
> > >
> > > Now, in this query, if I replace gap with gat, I was hoping I could do
> > > something such as:
> > >
> > > q=subjet:"Bridge the gat between your skills and your goals"~0.8
> > >
> > > But this doesn't quite do what I am trying to achieve.
> > >
> > > Any suggestion?
> > >
> >
>


Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread Max Bridgewater
Thanks Susheel. The challenge is that if I search for the word "between"
alone, I still get plenty of results. In a way, I want the query to match
the document title exactly (up to a few characters) and the document title
to match the query exactly (up to a few characters). KeywordTokenizer allows
that. But complexphrase does not seem to work with KeywordTokenizer.

On Thu, Jun 15, 2017 at 10:23 AM, Susheel Kumar 
wrote:

> CompledPhraseQuery parser is what you need to look
> https://cwiki.apache.org/confluence/display/solr/Other+
> Parsers#OtherParsers-ComplexPhraseQueryParser.
> See below for e.g.
>
>
>
> http://localhost:8983/solr/techproducts/select?debugQuery=on=on=
> manu:%22Bridge%20the%20gat~1%20between%20your%20skills%
> 20and%20your%20goals%22=complexphrase
>
> On Thu, Jun 15, 2017 at 5:59 AM, Max Bridgewater <
> max.bridgewa...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I am trying to do phrase exact match. For this, I use
> > KeywordTokenizerFactory. This basically does what I want to do. My field
> > type is defined as follows:
> >
> >  > positionIncrementGap="100">
> >   
> > 
> > 
> >   
> >   
> > 
> > 
> >   
> > 
> >
> >
> > In addition to this, I want to tolerate typos of two or three letters. I
> > thought fuzzy search could allow me to accept this margin of error. But
> > this doesn't seem to work.
> >
> > A typical query I would have is:
> >
> > q=subjet:"Bridge the gap between your skills and your goals"
> >
> > Now, in this query, if I replace gap with gat, I was hoping I could do
> > something such as:
> >
> > q=subjet:"Bridge the gat between your skills and your goals"~0.8
> >
> > But this doesn't quite do what I am trying to achieve.
> >
> > Any suggestion?
> >
>


Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread simon
I think that's because the KeywordTokenizer by definition produces a single
token (not a phrase).

Perhaps you could create two fields with a copyField: the one you already
have (field1), and one tokenized using StandardTokenizer or
WhitespaceTokenizer (field2), which will produce a phrase with multiple
tokens. Then construct a query which searches both field1 for an exact
match and field2 using the ComplexPhraseQueryParser, using the localparams
syntax to combine them. Boost field1 (the exact match).
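
A minimal sketch of how such a combined query string might be assembled, assuming two hypothetical fields, subject_exact (KeywordTokenizer) and subject_tokens (a tokenized copy), and using the nested `_query_` form with `{!complexphrase}` local params. The field names, the boost value, and the helper itself are illustrative assumptions, not something taken from the poster's schema.

```python
from urllib.parse import urlencode


def build_query(phrase, fuzzy_terms, exact_field="subject_exact",
                token_field="subject_tokens", boost=10):
    """Combine a boosted exact-match clause with a complexphrase clause.

    Words listed in `fuzzy_terms` get a ~1 fuzzy suffix inside the phrase.
    """
    exact = f'{exact_field}:"{phrase}"^{boost}'
    words = [w + "~1" if w in fuzzy_terms else w for w in phrase.split()]
    inner = f'{token_field}:"{" ".join(words)}"'
    # The nested query is itself quoted, so its inner quotes are escaped.
    nested = ('_query_:"{!complexphrase inOrder=true}'
              + inner.replace('"', '\\"') + '"')
    return f"{exact} OR {nested}"


q = build_query("Bridge the gat between your skills and your goals", {"gat"})
print(q)
print(urlencode({"q": q}))  # URL-encoded form for the request
```

The exact clause only contributes when the title matches character for character; the complexphrase clause is what tolerates the typo.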

HTH

-Simon

On Thu, Jun 15, 2017 at 1:20 PM, Max Bridgewater 
wrote:

> Thanks Susheel. The challenge is that if I search for the word "between"
> alone, I still get plenty of results. In a way I want the query to  match
> the document title exactly (up to a few characters) and the document title
> match the query exactly (up to a few characters). KeywordTokenizer allows
> that. But complexphrase does not seem to work with KeywordTokenizer.
>
> On Thu, Jun 15, 2017 at 10:23 AM, Susheel Kumar 
> wrote:
>
> > CompledPhraseQuery parser is what you need to look
> > https://cwiki.apache.org/confluence/display/solr/Other+
> > Parsers#OtherParsers-ComplexPhraseQueryParser.
> > See below for e.g.
> >
> >
> >
> > http://localhost:8983/solr/techproducts/select?
> debugQuery=on=on=
> > manu:%22Bridge%20the%20gat~1%20between%20your%20skills%
> > 20and%20your%20goals%22=complexphrase
> >
> > On Thu, Jun 15, 2017 at 5:59 AM, Max Bridgewater <
> > max.bridgewa...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I am trying to do phrase exact match. For this, I use
> > > KeywordTokenizerFactory. This basically does what I want to do. My
> field
> > > type is defined as follows:
> > >
> > >  > > positionIncrementGap="100">
> > >   
> > > 
> > > 
> > >   
> > >   
> > > 
> > > 
> > >   
> > > 
> > >
> > >
> > > In addition to this, I want to tolerate typos of two or three letters.
> I
> > > thought fuzzy search could allow me to accept this margin of error. But
> > > this doesn't seem to work.
> > >
> > > A typical query I would have is:
> > >
> > > q=subjet:"Bridge the gap between your skills and your goals"
> > >
> > > Now, in this query, if I replace gap with gat, I was hoping I could do
> > > something such as:
> > >
> > > q=subjet:"Bridge the gat between your skills and your goals"~0.8
> > >
> > > But this doesn't quite do what I am trying to achieve.
> > >
> > > Any suggestion?
> > >
> >
>


Re: Issue with highlighter

2017-06-15 Thread Ali Husain
Thanks for the replies. Let me try and explain this a little better.


I haven't modified anything in solrconfig. All I did was get a fresh instance 
of solr 6.4.1 and create a core testHighlight. I then created a content field 
of type text_en via the Solr Admin UI. id was already there, and that is of 
type string.


I then use the UI, once again to check the hl checkbox, hl.fl is set to * 
because I want any and every match.


I push the following content into this new solr instance:

id:91101

content:'I am adding something to the core field and we will try and find it. 
We want to make sure the highlighter works!

This is short so fragsize and max characters shouldn\'t be an issue.'

As you can see, very few characters, fragsize, maxAnalyzedChars, all that 
should not be an issue.


I then send this query:

http://localhost:8983/solr/testHighlight/select?hl.fl=*&hl=on&indent=on&q=something&wt=json


My results:


"response":{"numFound":1,"start":0,"docs":[

{"id":"91101",

"content":"I am adding something to the core field and we will try and 
find it. We want to make sure the highlighter works! This is short so fragsize 
and max characters shouldn't be an issue.",
"_version_":1570302668841156608}]


},


"highlighting":{
"91101":{}}


I change q to be core instead of something.


http://localhost:8983/solr/testHighlight/select?hl.fl=*&hl=on&indent=on&q=core&wt=json


{
"id":"91101",
"content":"I am adding something to the core field and we will try and 
find it. We want to make sure the highlighter works! This is short so fragsize 
and max characters shouldn't be an issue.",
"_version_":1570302668841156608},



"highlighting":{
"91101":{
  "content":["I am adding something to the core field and we will 
try and find it. We want to make sure"]}}

I've tried a bunch of queries. 'adding', 'something' both don't return any 
highlights. 'core' 'am' 'field' all work.

Am I doing a better job of explaining this? Quite puzzling why this would be 
happening. My guess is there is some file/config somewhere that is ignoring 
some words? It isn't stopwords.txt in my case though. If that isn't the case 
then it definitely seems like a bug to me.

Thanks, Ali



From: David Smiley 
Sent: Thursday, June 15, 2017 12:33:39 AM
To: solr-user@lucene.apache.org
Subject: Re: Issue with highlighter

> Beware of NOT plus OR in a search. That will certainly produce no
highlights. (eg test -results when default op is OR)

Seems like a bug to me; the default operator shouldn't matter in that case
I think since there is only one clause that has no BooleanQuery.Occur
operator and thus the OR/AND shouldn't matter.  The end effect is "test" is
effectively required and should definitely be highlighted.

Note to Ali: Phil's comment implies use of hl.method=unified which is not
the default.

On Wed, Jun 14, 2017 at 10:22 PM Phil Scadden  wrote:

> Just had a similar issue - works for some, not others. The first thing to look
> at is hl.maxAnalyzedChars in the query. The default is quite small.
> Since many of my documents are large PDF files, I opted to use
> storeOffsetsWithPositions="true" termVectors="true" on the field I was
> searching on.
> This certainly did increase my index size but not too bad and certainly
> fast.
> https://cwiki.apache.org/confluence/display/solr/Highlighting
>
> Beware of NOT plus OR in a search. That will certainly produce no
> highlights. (eg test -results when default op is OR)
>
>
> -Original Message-
> From: Ali Husain [mailto:alihus...@outlook.com]
> Sent: Thursday, 15 June 2017 11:11 a.m.
> To: solr-user@lucene.apache.org
> Subject: Issue with highlighter
>
> Hi,
>
>
> I think I've found a bug with the highlighter. I search for the word
> "something" and I get an empty highlighting response for all the documents
> that are returned shown below. The fields that I am searching over are
> text_en, the highlighter works for a lot of queries. I have no
> stopwords.txt list that could be messing this up either.
>
>
>  "highlighting":{
> "310":{},
> "103":{},
> "406":{},
> "1189":{},
> "54":{},
> "292":{},
> "309":{}}}
>
>
> Just changing the search term to "something like" I get back this:
>
>
> "highlighting":{
> "310":{},
> "309":{
>   "content":["1949 Convention, like those"]},
> "103":{},
> "406":{},
> "1189":{},
> "54":{},
> "292":{},
> "286":{
>   "content":["persons in these classes are treated like
> combatants, but in other respects"]},
> "336":{
>   "content":["   be treated like engagement"]}}}
>
>
> So I know that I have it setup correctly, but I can't figure this out.
> I've searched through JIRA/Google and haven't been able to find a similar
> issue.
>
>
> Any ideas?
>
>
> Thanks,
>
> Ali

Query not working with DatePointField

2017-06-15 Thread Saurabh Sethi
Hi,

We have a fieldType specified for date. Earlier it was using TrieDateField
and we changed it to DatePointField.





Here are the fields used in the query and one of them uses the dateType:





The following query was returning correct results when the field type was
Trie but not with Point:

field1:value1 AND ((*:* NOT field2:*) AND field3:value3)

Any idea why field2:* does not return results anymore?

Thanks,
Saurabh


Re: Solr 6: how to get SortedSetDocValues from index by field name

2017-06-15 Thread Chris Hostetter

https://people.apache.org/~hossman/#xyproblem
XY Problem

Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue.  Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341




: How do I get SortedSetDocValues from index by field name?
: 
: I tried the following and it works for me, but I don't understand why
: leaves().get(0) is used. What does it mean? (I saw this usage in
: TestUninvertedReader.java of SOLR-6.5.1):
: 
: Map<String, UninvertingReader.Type> mapping = new HashMap<>();
: mapping.put(fieldName, UninvertingReader.Type.SORTED);
: 
: SolrIndexSearcher searcher = req.getSearcher();
: 
: DirectoryReader dReader = searcher.getIndexReader();
: 
: // Bail out when the index has no segments.
: if (dReader.leaves().isEmpty()) {
:   return null;
: }
: LeafReader reader = dReader.leaves().get(0).reader();
: 
: SortedSetDocValues sourceIndex = reader.getSortedSetDocValues(fieldName);
: 
: Or do I need to use SlowAtomicReader, like this:
: 
: UninvertingReader reader = new UninvertingReader(searcher.getSlowAtomicReader(), mapping);
: 
: What is the right way to get SortedSetDocValues, and why?
: 
: 
: 
: --
: View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-6-how-to-get-SortedSetDocValues-from-index-by-field-name-tp4340388.html
: Sent from the Solr - User mailing list archive at Nabble.com.
: 

-Hoss
http://www.lucidworks.com/


RE: CSV output

2017-06-15 Thread Phil Scadden
Embarrassing. Yes, it was the proxy. Very old code that has now had a 
considerable refresh.

-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: Thursday, 15 June 2017 7:13 p.m.
To: solr-user@lucene.apache.org
Subject: Re: CSV output

Is it the proxy affecting the output? What do you get going directly to 
Solr's endpoint?

   Erik

> On Jun 14, 2017, at 22:13, Phil Scadden  wrote:
>
> If I try
> /getsolr? 
> fl=id,title,datasource,score=true=9000=unified=Wainui-1=AND=csv
>
> The response I get is:
> id,title,datasource,scoreW:\PR_Reports\OCR\PR869.pdf,,Petroleum 
> Reports,8.233313W:\PR_Reports\OCR\PR3440.pdf,,Petroleum 
> Reports,8.217836W:\PR_Reports\OCR\PR4313.pdf,,Petroleum 
> Reports,8.206703W:\PR_Reports\OCR\PR3906.pdf,,Petroleum 
> Reports,8.185147W:\PR_Reports\OCR\PR1592.pdf,,Petroleum 
> Reports,8.167614W:\PR_Reports\OCR\PR998.pdf,,Petroleum 
> Reports,8.161142W:\PR_Reports\OCR\PR2457.pdf,,Petroleum 
> Reports,8.155497W:\PR_Reports\OCR\PR2433.pdf,,Petroleum 
> Reports,8.152924W:\PR_Reports\OCR\PR1184.pdf,,Petroleum 
> Reports,8.124402W:\PR_Reports\OCR\PR3551.pdf,,Petroleum Reports,8.124402
>
> ie no newline separators at all (Solr 6.5.1) (/getsolr is api that proxy to 
> the solr server).
> Changing it to
> /getsolr?csv.newline=%0A=id,title,datasource,score=true=9000=unified=Wainui-1=AND=csv
>
> Makes no difference. What I am doing wrong here? Is there another way to 
> specify csv parameters? It says default is \n but I am not seeing that.
>


Re: Query not working with DatePointField

2017-06-15 Thread Tomas Fernandez Lobbe
The query field:* doesn't work with point fields (numerics or dates); only 
exact or range queries are supported, so an equivalent query would be 
field:[* TO *].
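
A minimal sketch of applying that substitution to an existing query string, in case many queries need the same change. The helper name and the regular expression are illustrative assumptions; the pattern only targets a bare field:* clause and does not understand quoted phrases, so treat it as a starting point rather than a full query rewriter.

```python
import re

# A field name followed by ":*" where the "*" is the whole value.
# "*:*" is left alone because "*" is not a valid field-name character here.
_EXISTENCE = re.compile(r"(\b[\w.]+):\*(?![\w*?:])")


def rewrite_existence(query: str) -> str:
    """Replace field:* with the range form field:[* TO *]."""
    return _EXISTENCE.sub(r"\1:[* TO *]", query)


original = "field1:value1 AND ((*:* NOT field2:*) AND field3:value3)"
print(rewrite_existence(original))
```

Applied to the query from the original post, only the field2:* clause changes; the match-all *:* and the other clauses are untouched.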


Sent from my iPhone

> On Jun 15, 2017, at 5:24 PM, Saurabh Sethi  wrote:
> 
> Hi,
> 
> We have a fieldType specified for date. Earlier it was using TrieDateField
> and we changed it to DatePointField.
> 
>  sortMissingLast="true" precisionStep="6"/>
> 
> 
> 
> Here are the fields used in the query and one of them uses the dateType:
> 
>  stored="false" required="true" multiValued="false"/>
>  stored="false" docValues="false" />
>  stored="false" multiValued="true" />
> 
> The following query was returning correct results when the field type was
> Trie but not with Point:
> 
> field1:value1 AND ((*:* NOT field2:*) AND field3:value3)
> 
> Any idea why field2:* does not return results anymore?
> 
> Thanks,
> Saurabh


Does Solr support multiple collapse filters - throws NullPointerException

2017-06-15 Thread Jeffery Yuan
I am trying to use the Solr collapse function, but it seems that when we use
two or more collapse filters, it throws a NullPointerException.

Does anyone know whether Solr supports multiple collapse filters?

The documentation does not mention using two or more collapse filters at all:
https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results

One example: 
When I run
q=docType:channel&fq={!collapse field=seriesId
nullPolicy=collapse}&fq={!collapse field=programId nullPolicy=collapse}

Channel docs don't have seriesId or programId at all, so it throws an NPE.
The query may match different kinds of docs, which may not have these
fields.

Exception from log:
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/thecollection_shard1_replica3: java.lang.NullPointerException
    at org.apache.solr.search.CollapsingQParserPlugin$OrdScoreCollector.finish(CollapsingQParserPlugin.java:617)
    at org.apache.solr.search.CollapsingQParserPlugin$OrdScoreCollector.finish(CollapsingQParserPlugin.java:667)
    at org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:256)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1823)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1640)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:611)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:533)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2299)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Does-Solr-support-multiple-collapse-filters-throws-NullPointerException-tp4340812.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Issue with highlighter

2017-06-15 Thread Damien Kamerman
Ali, does adding a 'hl.q' param help?  q=something&hl.q=something&...
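
A minimal sketch of building such a request URL, assuming the core name and host from the quoted message below. The helper name is hypothetical; the parameters used (q, hl, hl.fl, hl.q, wt) are standard Solr request parameters, and hl.q simply overrides the query used for highlighting.

```python
from urllib.parse import urlencode


def highlight_url(base, term, fields="*"):
    """Build a select URL that passes the same term as q and hl.q."""
    params = [
        ("q", term),
        ("hl", "on"),
        ("hl.fl", fields),
        ("hl.q", term),  # query used only for highlighting
        ("wt", "json"),
    ]
    return base + "?" + urlencode(params)


url = highlight_url("http://localhost:8983/solr/testHighlight/select", "something")
print(url)
```

urlencode percent-encodes the `*` in hl.fl, which Solr decodes back on receipt.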

On 16 June 2017 at 06:21, Ali Husain  wrote:

> Thanks for the replies. Let me try and explain this a little better.
>
>
> I haven't modified anything in solrconfig. All I did was get a fresh
> instance of solr 6.4.1 and create a core testHighlight. I then created a
> content field of type text_en via the Solr Admin UI. id was already there,
> and that is of type string.
>
>
> I then use the UI, once again to check the hl checkbox, hl.fl is set to *
> because I want any and every match.
>
>
> I push the following content into this new solr instance:
>
> id:91101
>
> content:'I am adding something to the core field and we will try and find
> it. We want to make sure the highlighter works!
>
> This is short so fragsize and max characters shouldn\'t be an issue.'
>
> As you can see, very few characters, fragsize, maxAnalyzedChars, all that
> should not be an issue.
>
>
> I then send this query:
>
> http://localhost:8983/solr/testHighlight/select?hl.fl=*;
> hl=on=on=something=json
>
>
> My results:
>
>
> "response":{"numFound":1,"start":0,"docs":[
>
> {"id":"91101",
>
> "content":"I am adding something to the core field and we will try
> and find it. We want to make sure the highlighter works! This is short so
> fragsize and max characters shouldn't be an issue.",
> "_version_":1570302668841156608}]
>
>
> },
>
>
> "highlighting":{
> "91101":{}}
>
>
> I change q to be core instead of something.
>
>
> http://localhost:8983/solr/testHighlight/select?hl.fl=*;
> hl=on=on=core=json
>
>
> {
> "id":"91101",
> "content":"I am adding something to the core field and we will try
> and find it. We want to make sure the highlighter works! This is short so
> fragsize and max characters shouldn't be an issue.",
> "_version_":1570302668841156608},
>
>
>
> "highlighting":{
> "91101":{
>   "content":["I am adding something to the core field and we
> will try and find it. We want to make sure"]}}
>
> I've tried a bunch of queries. 'adding', 'something' both don't return any
> highlights. 'core' 'am' 'field' all work.
>
> Am I doing a better job of explaining this? Quite puzzling why this would
> be happening. My guess is there is some file/config somewhere that is
> ignoring some words? It isn't stopwords.txt in my case though. If that
> isn't the case then it definitely seems like a bug to me.
>
> Thanks, Ali
>
>
> 
> From: David Smiley 
> Sent: Thursday, June 15, 2017 12:33:39 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Issue with highlighter
>
> > Beware of NOT plus OR in a search. That will certainly produce no
> highlights. (eg test -results when default op is OR)
>
> Seems like a bug to me; the default operator shouldn't matter in that case
> I think since there is only one clause that has no BooleanQuery.Occur
> operator and thus the OR/AND shouldn't matter.  The end effect is "test" is
> effectively required and should definitely be highlighted.
>
> Note to Ali: Phil's comment implies use of hl.method=unified which is not
> the default.
>
> On Wed, Jun 14, 2017 at 10:22 PM Phil Scadden 
> wrote:
>
> > Just had similar issue - works for some, not others. First thing to look
> > at is hl.maxAnalyzedChars is the query. The default is quite small.
> > Since many of my documents are large PDF files, I opted to use
> > storeOffsetsWithPositions="true" termVectors="true" on the field I was
> > searching on.
> > This certainly did increase my index size but not too bad and certainly
> > fast.
> > https://cwiki.apache.org/confluence/display/solr/Highlighting
> >
> > Beware of NOT plus OR in a search. That will certainly produce no
> > highlights. (eg test -results when default op is OR)
> >
> >
> > -Original Message-
> > From: Ali Husain [mailto:alihus...@outlook.com]
> > Sent: Thursday, 15 June 2017 11:11 a.m.
> > To: solr-user@lucene.apache.org
> > Subject: Issue with highlighter
> >
> > Hi,
> >
> >
> > I think I've found a bug with the highlighter. I search for the word
> > "something" and I get an empty highlighting response for all the
> documents
> > that are returned shown below. The fields that I am searching over are
> > text_en, the highlighter works for a lot of queries. I have no
> > stopwords.txt list that could be messing this up either.
> >
> >
> >  "highlighting":{
> > "310":{},
> > "103":{},
> > "406":{},
> > "1189":{},
> > "54":{},
> > "292":{},
> > "309":{}}}
> >
> >
> > Just changing the search term to "something like" I get back this:
> >
> >
> > "highlighting":{
> > "310":{},
> > "309":{
> >   "content":["1949 Convention, like those"]},
> > "103":{},
> > "406":{},
> > "1189":{},
> > "54":{},
> > "292":{},
> > "286":{
> >   "content":["persons in these classes are treated like
> > combatants, but in other