Solr Index to Helio Search

2014-10-08 Thread Norgorn
When I try to simply copy an index from native Solr to Heliosearch, I get this
exception:

Caused by: java.lang.IllegalArgumentException: A SPI class of type
org.apache.lucene.codecs.Codec with name 'Lucene410' does not exist. You need
to add the corresponding JAR file supporting this SPI to your classpath. The
current classpath supports the following names: [Lucene40, Lucene3x, Lucene41,
Lucene42, Lucene45, Lucene46, Lucene49]

Is there any proper way to move an index from native Solr to Heliosearch?

The problem with native Solr is that we get a lot of OOM exceptions (because
of the large index).



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Index-to-Helio-Search-tp4163446.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Having an issue with pivot faceting

2014-10-08 Thread Chris Hostetter

: Subject: Having an issue with pivot faceting

Ok - first off -- your example request doesn't include any "facet.pivot" 
params, so you aren't using pivot faceting .. which makes me concerned 
that you aren't using the feature you think you are, or don't understand 
the feature you are using.

: I'm having an issue getting pivot faceting working as expected.  I'm trying
: to filter by a specific criteria, and then first facet by one of my document
: attributes called item_generator, then facet those results into 2 sets each:
: the first set is the count of documents satisfying that facet with
: number_of_items_generated set to 0, the other set counting the documents
: satisfying that facet with number_of_items_generated greater than 0.  Seems

Second: interval faceting is just a fancy, more efficient way of 
using "facet.query" when your queries are always over ranges. There's 
nothing about interval faceting that is directly related to pivot 
faceting.

Third: there isn't currently any generic support for faceting by a field 
and then faceting "those results" by some other field/criteria. This is 
actively being worked on in issues like SOLR-6348 - but it doesn't exist 
yet.

Fourth: because you ultimately have specific criteria for how 
you want to divide the facets, something similar to the behavior you are 
asking for is available using "tagged exclusions" on facets:

https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-LocalParametersforFaceting

...the basic idea you could follow is that you send additional fq params 
for each of the 2 criteria you want to lump things into 
(number_of_items_generated=0 and number_of_items_generated>0), but you tag 
those filters so they can individually be "excluded" from facets -- then 
you use facet.field on your item_generator field twice (with different 
keys), and in each case you exclude only one of those filters.

Here's a similar example to what you describe using the sample data that 
comes with solr...

http://localhost:8983/solr/select?rows=0&debug=query&q=inStock:true&fq={!tag=pricey}price:[100%20TO%20*]&fq={!tag=cheap}price:[*%20TO%20100}&facet=true&facet.field={!key=cheap_cats%20ex=pricey}cat&facet.field={!key=pricey_cats%20ex=cheap}cat

so "cheap_cats" gives you facet counts on the "cat" field but only for the 
"cheap" products (because it excludes the "pricey" fq) and "pricey_cats" 
gives you facet counts on the "cat" field for the "pricey" products by 
excluding the "cheap" fq.

note however that the numFound is 0 -- this works fine for getting the 
facet counts you want, but you'd need a second query w/ the filters to get 
the main result set, since (I'm pretty sure) it's not possible to use "ex" 
on the main query to exclude filters from affecting the main result set.
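
Applied to the fields you describe, the request might look roughly like this 
(shown unencoded for readability; adjust the second range to whatever 
"greater than 0" means for your field type):

q=*:*&rows=0&facet=true
  &fq={!tag=zero}number_of_items_generated:0
  &fq={!tag=nonzero}number_of_items_generated:[1 TO *]
  &facet.field={!key=zero_generators ex=nonzero}item_generator
  &facet.field={!key=nonzero_generators ex=zero}item_generator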


-Hoss
http://www.lucidworks.com/


Re: Add multiple JSON documents with boost

2014-10-08 Thread Chris Hostetter

: I try to add documents to the index and boost them (whole document) but I
: get this error message:
: 
: ERROR org.apache.solr.core.SolrCore  –
: org.apache.solr.common.SolrException: Error parsing JSON field value.
: Unexpected OBJECT_START
: 
: Any ideas?

The top level structure you are sending is a JSON array (because you start 
with "[") which is how you tell solr you want to send a simple list of 
documents to add.

In order to send explicit commands (like "add"), your top-level JSON 
structure needs to be a JSON Object (aka a Map), which contains "add" as a 
key.

there are examples of this in the ref guide...

https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-SendingArbitraryJSONUpdateCommands

so basically, just take your list containing 2 objects that each have 1 
key of "add" and replace it with a single object that has 2 "add" keys...

: {
: "add": {
: "boost": 1,
: "doc": {
: "store_id": 1,
: "created_at": "2007-08-23T01:03:05Z",
: "sku": {"boost": 10, "value": "n2610"},
: "status": "1",
: "tax_class_id_t": "2",
: "color_t": "Black",
: "visibility": "4",
: "name": {"boost": -60, "value": "Nokia 2610 Phone"},
: "url_key": "nokia-2610-phone",
: "image": "\/n\/o\/nokia-2610-phone-2.jpg",
: "small_image": "\/n\/o\/nokia-2610-phone-2.jpg",
: "thumbnail": "\/n\/o\/nokia-2610-phone-2.jpg",
: "msrp_enabled_t": "2",
: "msrp_display_actual_price_type_t": "4",
: "model_t": "2610",
: "dimension_t": "4.1 x 1.7 x 0.7 inches",
: "meta_keyword_t": "Nokia 2610, cell, phone,",
: "short_description": "The words \"entry level\" no longer
: mean \"low-end,\" especially when it comes to the Nokia 2610. Offering
: advanced media and calling features without breaking the bank",
: "price": "149.99",
: "in_stock": "1",
: "id": "16_1",
: "product_id": "16",
: "content_type": "product",
: "attribute_set_id": "38",
: "type_id": "simple",
: "has_options": "0",
: "required_options": "0",
: "entity_type_id": "10",
: "category": [
: 8,
: 13
: ]
: }
: }
  ,
: "add": {
: "boost": 1,
: "doc": {
: "store_id": 1,
: "created_at": "2007-08-23T03:40:26Z",
: "sku": {"boost": 10, "value": "bb8100"},
: "color_t": "Silver",
: "status": "1",
: "tax_class_id_t": "2",
: "visibility": "4",
: "name": {"boost": -60, "value": "BlackBerry 8100 Pearl"},
: "url_key": "blackberry-8100-pearl",
: "thumbnail": "\/b\/l\/blackberry-8100-pearl-2.jpg",
: "small_image": "\/b\/l\/blackberry-8100-pearl-2.jpg",
: "image": "\/b\/l\/blackberry-8100-pearl-2.jpg",
: "model_t": "8100",
: "dimension_t": "4.2 x 2 x 0.6 inches",
: "meta_keyword_t": "Blackberry, 8100, pearl, cell, phone",
: "short_description": "The BlackBerry 8100 Pearl is a
: departure from the form factor of previous BlackBerry devices. This
: BlackBerry handset is far more phone-like, and RIM's engineers have managed
: to fit a QWERTY keyboard onto the handset's slim frame.",
: "price": "349.99",
: "in_stock": "1",
: "id": "17_1",
: "product_id": "17",
: "content_type": "product",
: "attribute_set_id": "38",
: "type_id": "simple",
: "has_options": "0",
: "required_options": "0",
: "entity_type_id": "10",
: "category": [
: 8,
: 13
: ]
: }
: }
: }


-Hoss
http://www.lucidworks.com/

Re: Custom Solr Query Post Filter

2014-10-08 Thread Joel Bernstein
Also just took a quick look at the code. This will likely be a performance
problem if you have a large result set:

String classif = context.reader().document(docId).get("classification");

Instead of using the stored field, you'll want to get the BytesRef for the
field using either the FieldCache or DocValues. In recent releases, DocValues
will likely be the fastest docID->BytesRef lookup.
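
A rough sketch of the per-segment lookup, assuming "classification" is a 
single-valued field with docValues enabled (or indexed, so the FieldCache 
can uninvert it), using the 4.x SortedDocValues/BytesRef APIs:

// once per segment, where your collector currently grabs the reader:
SortedDocValues classifValues =
    context.reader().getSortedDocValues("classification");
// (or FieldCache.DEFAULT.getTermsIndex(context.reader(), "classification")
//  if the field doesn't have docValues)

// then per matching doc, instead of context.reader().document(docId):
BytesRef classif = new BytesRef();
int ord = classifValues.getOrd(docId);
if (ord >= 0) {
  classifValues.lookupOrd(ord, classif);  // docID -> BytesRef, no stored-field load
}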



Joel Bernstein
Search Engineer at Heliosearch

On Wed, Oct 8, 2014 at 2:20 PM, Christopher Gross  wrote:

> That did the trick!  Thanks Joel.
>
> -- Chris
>
> On Wed, Oct 8, 2014 at 2:05 PM, Joel Bernstein  wrote:
>
> > The results are being cached in the QueryResultCache most likely. You
> need
> > to implement equals() and hashCode() on the query object, which is part
> of
> > the cache key. In your case the creds param must be included in the
> > hashCode and equals logic.
> >
> > Joel Bernstein
> > Search Engineer at Heliosearch
> >
> > On Wed, Oct 8, 2014 at 1:17 PM, Christopher Gross 
> > wrote:
> >
> > > Code:
> > > http://pastebin.com/tNjzDbmy
> > >
> > > Solr 4.9.0
> > > Tomcat 7
> > > Java 7
> > >
> > > I took Erik Hatcher's example for creating a PostFilter and have
> modified
> > > it so it would work with Solr 4.x.  Right now it works...the first
> time.
> > > If I were to run this query it would work right:
> > >
> > >
> >
> http://localhost:8080/solr/plugintest/select?q=*:*&sort=uniqueId%20desc&fq={!classif%20creds=ABC}
> > > However, if I ran this one:
> > >
> > >
> >
> http://localhost:8080/solr/plugintest/select?q=*:*&sort=uniqueId%20desc&fq={!classif%20creds=XYZ}
> > > I would get the results from the first query.  I could do a different
> > > query, like:
> > > http://localhost:8080/solr/plugintest/select?q=uniqueId[* TO
> > > *]&sort=uniqueId%20desc&fq={!classif%20creds=XYZ}
> > > and I'd get the XYZ tagged items.  But if I tried to find ABC with that
> > > one:
> > > http://localhost:8080/solr/plugintest/select?q=uniqueId[* TO
> > > *]&sort=uniqueId%20desc&fq={!classif%20creds=ABC}
> > > it would just list the XYZ items.
> > >
> > > I'm not sure what is persisting where to cause this to happen.  Anybody
> > > have some tips/pointers for building filters like this for Solr 4.x?
> > >
> > > Thanks!
> > >
> > > -- Chris
> > >
> >
>


Re: Best way to index wordpress blogs in solr

2014-10-08 Thread Jack Krupansky
The LucidWorks product has built-in crawler support, so you could crawl one or 
more web sites.


http://lucidworks.com/product/fusion/

-- Jack Krupansky

-Original Message- 
From: Vishal Sharma

Sent: Tuesday, October 7, 2014 2:08 PM
To: solr-user@lucene.apache.org
Subject: Best way to index wordpress blogs in solr

Hi,

I am trying to get some help on finding out whether there is any best practice
for indexing WordPress blogs in a Solr index. Can someone help with the
architecture I should be setting up?

Do I need to write separate scripts to crawl WordPress and then push the posts
back to Solr using its API?




*Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
E: vish...@grazitti.com
www.grazitti.com



Re: Using Velocity with Child Documents?

2014-10-08 Thread Chris Hostetter

: I am trying to index a collection that has child documents.  I have 
: successfully loaded the data into my index using SolrJ, and I have 
: verified that I can search correctly using the "child of" method in my 
: fq variable.  Now, I would like to use Velocity (Solritas) to display 
: the parent records with some details of the child records underneath.  
: Is there an easy way to do this?  Is there an example somewhere that I 
: can look at?

Step #1 is to forget about velocity and focus on getting the data you want 
about the children into the response.  

To do that you'll need to use the [child] DocTransformer...

https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents

ala...

fl=id,[child parentFilter=doc_type:book childFilter=doc_type:chapter limit=100]

If you are using this in conjunction with a block join query, you can use 
local params to eliminate some redundancy...

q=some_parent_field:foo
parents=content_type:parentDoc
fq={!parent which=$parents}child_field:bar
fl=id,[child parentFilter=$parents childFilter=content_type:childDoc limit=100]


Step #2: once you have the children in the response data, then you can use 
velocity to access each of the children of the docs that match your query 
via SolrDocument.getChildDocuments()
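
A very rough, untested sketch of what that loop might look like in one of the 
templates (the exact variable you iterate over depends on which .vm file you 
are editing; the example results_list.vm loops over $response.results):

#foreach($doc in $response.results)
  Parent: $doc.getFieldValue('id')
  #if($doc.getChildDocuments())
    #foreach($child in $doc.getChildDocuments())
      Child: $child.getFieldValue('id')
    #end
  #end
#end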



-Hoss
http://www.lucidworks.com/


Re: Edismax parser and boosts

2014-10-08 Thread Jack Krupansky
Definitely sounds like a bug! File a Jira. Thanks for reporting this. What 
release of Solr?




-- Jack Krupansky
-Original Message- 
From: Pawel Rog

Sent: Wednesday, October 8, 2014 3:57 PM
To: solr-user@lucene.apache.org
Subject: Edismax parser and boosts

Hi,
I use edismax query with q parameter set as below:

q=foo^1.0+AND+bar

For such a query for the same document I see different (lower) scoring
value than for

q=foo+AND+bar

By default the boost of a term is 1 as far as I know, so why does the scoring
differ?

When I check the debugQuery parameter, in parsedQuery for "foo^1.0+AND+bar" I
see a Boolean query in which one of the clauses is a phrase query "foo 1.0
bar". It seems that the edismax parser takes the whole q parameter as a phrase
without removing the boost value and adds it as a boolean clause. Is this a
bug, or should it work like that?

--
Paweł Róg 



Re: eDisMax parser and special characters

2014-10-08 Thread Jack Krupansky
Hyphen is a "prefix operator" and is normally followed by a term to indicate 
that the term "must not" be present. So, your query has a syntax error. The 
two query parsers differ in how they handle various errors. In the case of 
edismax, it quotes operators and then tries again, so the hyphen gets 
quoted, and then analyzed to nothing for text fields but is still a string 
for string fields.


-- Jack Krupansky

-Original Message- 
From: Lanke,Aniruddha

Sent: Wednesday, October 8, 2014 4:38 PM
To: solr-user@lucene.apache.org
Subject: Re: eDisMax parser and special characters

Sorry for the delayed reply; here is more information -

Schema that we are using - http://pastebin.com/WQAJCCph
Request Handler in config - http://pastebin.com/Y0kP40WF

Some analysis -

Search term: red -
Parser eDismax
No results show up
(+((DisjunctionMaxQuery((name_starts_with:red^9.0 | 
name_parts_starts_with:red^6.0 | s_detail:red | name:red^12.0 | 
s_detail_starts_with:red^3.0 | s_detail_parts_starts_with:red^2.0)) 
DisjunctionMaxQuery((name_starts_with:-^9.0 | 
s_detail_starts_with:-^3.0)))~2))/no_coord


Search term: red -
Parser dismax
Results are returned
(+DisjunctionMaxQuery((name_starts_with:red^9.0 | 
name_parts_starts_with:red^6.0 | s_detail:red | name:red^12.0 | 
s_detail_starts_with:red^3.0 | s_detail_parts_starts_with:red^2.0)) 
())/no_coord


Why do we see the variation in the results between dismax and eDismax?


On Oct 8, 2014, at 8:59 AM, Erick Erickson 
mailto:erickerick...@gmail.com>> wrote:


There's not much information here.
What's the doc look like?
What is the analyzer chain for it?
What is the output when you add &debug=query?

Details matter. A lot ;)

Best,
Erick

On Wed, Oct 8, 2014 at 6:26 AM, Michael Joyner 
mailto:mich...@newsrx.com>> wrote:

Try escaping special chars with a "\"


On 10/08/2014 01:39 AM, Lanke,Aniruddha wrote:

We are using a eDisMax parser in our configuration. When we search using
the query term that has a ‘-‘ we don’t get any results back.

Search term: red - yellow
This doesn’t return any data back but







RE: Solr configuration, memory usage and MMapDirectory

2014-10-08 Thread Simon Fairey
Hi

Thanks for this. I will investigate further after reading a number of your 
points in more detail; I do have a feeling they've set up too many entries in 
the filter cache (1000s), so I will revisit that.

Just a note on the numbers: those were valid when I made the post, but they 
obviously change as the week progresses before a regular clean-up of content. 
The current numbers (if at all relevant) from the index admin view on one of 
the 2 nodes are:

Last Modified:  18 minutes ago
Num Docs:   24590368
Max Doc:29139255
Deleted Docs:   4548887
Version:1297982
Segment Count:  28

   Version  Gen Size
Master: 1412798583558 402364 52.98 GB

Top:
2996 tomcat6   20   0  189g  73g 1.5g S   15 58.7  58034:04 java

And the only GC option I can see that is on is "-XX:+UseConcMarkSweepGC"

Regarding the XY problem, you are very likely correct. Unfortunately I wasn't 
involved in the config, and I very much suspect that when it was done many of 
the defaults were used, and then if it didn't work, or there was say an 
out-of-memory error, they just upped the heap to solve the symptom without 
investigating the cause. The luxury of having more than enough RAM, I guess!

I'm going to get some late-night downtime soon, at which point I'm hoping to 
change the heap size and GC settings and add the JMX; it's not exposed to the 
internet so no security is fine.

Right off to do some reading!

Cheers

Si

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: 08 October 2014 21:09
To: solr-user@lucene.apache.org
Subject: Re: Solr configuration, memory usage and MMapDirectory

On 10/8/2014 4:02 AM, Simon Fairey wrote:
> I'm currently setting up jconsole but as I have to remotely monitor (no gui 
> capability on the server) I have to wait before I can restart solr with a JMX 
> port setup. In the meantime I looked at top and given the calculations you 
> said based on your top output and this top of my java process from the node 
> that handles the querying, the indexing node has a similar memory profile:
> 
> https://www.dropbox.com/s/pz85dm4e7qpepco/SolrTop.png?dl=0
> 
> It would seem I need a monstrously large heap in the 60GB region?
> 
> We do use a lot of navigators/filters so I have set the caches to be quite 
> large for these, are these what are using up the memory?

With a VIRT size of 189GB and a RES size of 73GB, I believe you probably have 
more than 45GB of index data.  This might be a combination of old indexes and 
the active index.  Only the indexes (cores) that are being actively used need 
to be considered when trying to calculate the total RAM needed.  Other indexes 
will not affect performance, even though they increase your virtual memory size.

With MMap, part of the virtual memory size is the size of the index data that 
has been opened on the disk.  This is not memory that's actually allocated.  
There's a very good reason that mmap has been the default in Lucene and Solr 
for more than two years.

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

You stated originally that you have 25 million document and 45GB of index data 
on each node.  With those numbers and a conservative configuration, I would 
expect that you need about 4GB of heap, maybe as much as 8GB.  I cannot think 
of any reason that you would NEED a heap 60GB or larger.

Each field that you sort on, each field that you facet on with the default 
facet.method of fc, and each filter that you cache will use a large block of 
memory.  The size of that block of memory is almost exclusively determined by 
the number of documents in the index.

With 25 million documents, each filterCache entry will be approximately 3MB -- 
one bit for every document.  I do not know how big each FieldCache entry is for 
a sort field and a facet field, but assume that they are probably larger than 
the 3MB entries on the filterCache.

I've got a filterCache sized at 64, with an autowarmCount of 4.  With larger 
autowarmCount values, I was seeing commits take 30 seconds or more, because 
each of those filters can take a few seconds to execute.
Cache sizes in the thousands are rarely necessary, and just chew up a lot of 
memory with no benefit.  Large autowarmCount values are also rarely necessary.  
Every time a new searcher is opened by a commit, add up all your autowarmCount 
values and realize that the searcher likely needs to execute that many queries 
before it is available.

If you need to set up remote JMX so you can remotely connect jconsole, I have 
done this in the redhat init script I've built -- see JMX_OPTS here:

http://wiki.apache.org/solr/ShawnHeisey#Init_script

It's never a good idea to expose Solr directly to the internet, but if you use 
that JMX config, *definitely* don't expose it to the Internet.
It doesn't use any authentication.

We might need to back up a little bit and start with the problem that you are 
trying to figure 

Re: eDisMax parser and special characters

2014-10-08 Thread Lanke,Aniruddha
Sorry for the delayed reply; here is more information -

Schema that we are using - http://pastebin.com/WQAJCCph
Request Handler in config - http://pastebin.com/Y0kP40WF

Some analysis -

Search term: red -
Parser eDismax
No results show up
(+((DisjunctionMaxQuery((name_starts_with:red^9.0 | 
name_parts_starts_with:red^6.0 | s_detail:red | name:red^12.0 | 
s_detail_starts_with:red^3.0 | s_detail_parts_starts_with:red^2.0)) 
DisjunctionMaxQuery((name_starts_with:-^9.0 | 
s_detail_starts_with:-^3.0)))~2))/no_coord

Search term: red -
Parser dismax
Results are returned
(+DisjunctionMaxQuery((name_starts_with:red^9.0 | 
name_parts_starts_with:red^6.0 | s_detail:red | name:red^12.0 | 
s_detail_starts_with:red^3.0 | s_detail_parts_starts_with:red^2.0)) 
())/no_coord

Why do we see the variation in the results between dismax and eDismax?


On Oct 8, 2014, at 8:59 AM, Erick Erickson 
mailto:erickerick...@gmail.com>> wrote:

There's not much information here.
What's the doc look like?
What is the analyzer chain for it?
What is the output when you add &debug=query?

Details matter. A lot ;)

Best,
Erick

On Wed, Oct 8, 2014 at 6:26 AM, Michael Joyner 
mailto:mich...@newsrx.com>> wrote:
Try escaping special chars with a "\"


On 10/08/2014 01:39 AM, Lanke,Aniruddha wrote:

We are using a eDisMax parser in our configuration. When we search using
the query term that has a ‘-‘ we don’t get any results back.

Search term: red - yellow
This doesn’t return any data back but






Re: Solr configuration, memory usage and MMapDirectory

2014-10-08 Thread Shawn Heisey
On 10/8/2014 4:02 AM, Simon Fairey wrote:
> I'm currently setting up jconsole but as I have to remotely monitor (no gui 
> capability on the server) I have to wait before I can restart solr with a JMX 
> port setup. In the meantime I looked at top and given the calculations you 
> said based on your top output and this top of my java process from the node 
> that handles the querying, the indexing node has a similar memory profile:
> 
> https://www.dropbox.com/s/pz85dm4e7qpepco/SolrTop.png?dl=0
> 
> It would seem I need a monstrously large heap in the 60GB region?
> 
> We do use a lot of navigators/filters so I have set the caches to be quite 
> large for these, are these what are using up the memory?

With a VIRT size of 189GB and a RES size of 73GB, I believe you probably
have more than 45GB of index data.  This might be a combination of old
indexes and the active index.  Only the indexes (cores) that are being
actively used need to be considered when trying to calculate the total
RAM needed.  Other indexes will not affect performance, even though they
increase your virtual memory size.

With MMap, part of the virtual memory size is the size of the index data
that has been opened on the disk.  This is not memory that's actually
allocated.  There's a very good reason that mmap has been the default in
Lucene and Solr for more than two years.

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

You stated originally that you have 25 million documents and 45GB of
index data on each node.  With those numbers and a conservative
configuration, I would expect that you need about 4GB of heap, maybe as
much as 8GB.  I cannot think of any reason that you would NEED a heap
60GB or larger.

Each field that you sort on, each field that you facet on with the
default facet.method of fc, and each filter that you cache will use a
large block of memory.  The size of that block of memory is almost
exclusively determined by the number of documents in the index.

With 25 million documents, each filterCache entry will be approximately
3MB -- one bit for every document.  I do not know how big each
FieldCache entry is for a sort field and a facet field, but assume that
they are probably larger than the 3MB entries on the filterCache.

I've got a filterCache sized at 64, with an autowarmCount of 4.  With
larger autowarmCount values, I was seeing commits take 30 seconds or
more, because each of those filters can take a few seconds to execute.
Cache sizes in the thousands are rarely necessary, and just chew up a
lot of memory with no benefit.  Large autowarmCount values are also
rarely necessary.  Every time a new searcher is opened by a commit, add
up all your autowarmCount values and realize that the searcher likely
needs to execute that many queries before it is available.
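
For reference, that corresponds to a solrconfig.xml entry along these lines 
(the class and initialSize shown are just common choices, not requirements):

<filterCache class="solr.FastLRUCache"
             size="64"
             initialSize="64"
             autowarmCount="4"/>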

If you need to set up remote JMX so you can remotely connect jconsole, I
have done this in the redhat init script I've built -- see JMX_OPTS here:

http://wiki.apache.org/solr/ShawnHeisey#Init_script

It's never a good idea to expose Solr directly to the internet, but if
you use that JMX config, *definitely* don't expose it to the Internet.
It doesn't use any authentication.
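
If you'd rather add the options by hand, they are the standard JVM flags, 
roughly like this (the port number is arbitrary -- pick one your monitoring 
host can reach, and note again that this setup is unauthenticated):

-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=18983
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false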

We might need to back up a little bit and start with the problem that
you are trying to figure out, not the numbers that are being reported.

http://people.apache.org/~hossman/#xyproblem

Your original note said that you're sanity checking.  Toward that end,
the only insane thing that jumps out at me is that your max heap is
*VERY* large, and you probably don't have the proper GC tuning.

My recommendations for initial action are to use -Xmx8g on the servlet
container startup and include the GC settings you can find on the wiki
pages I've given you.  It would be a very good idea to set up remote JMX
so you can use jconsole or jvisualvm remotely.

Thanks,
Shawn



Edismax parser and boosts

2014-10-08 Thread Pawel Rog
Hi,
I use edismax query with q parameter set as below:

q=foo^1.0+AND+bar

For such a query for the same document I see different (lower) scoring
value than for

q=foo+AND+bar

By default the boost of a term is 1 as far as I know, so why does the scoring
differ?

When I check the debugQuery parameter, in parsedQuery for "foo^1.0+AND+bar" I
see a Boolean query in which one of the clauses is a phrase query "foo 1.0
bar". It seems that the edismax parser takes the whole q parameter as a phrase
without removing the boost value and adds it as a boolean clause. Is this a
bug, or should it work like that?

--
Paweł Róg


Re: Using Velocity with Child Documents?

2014-10-08 Thread Erick Erickson
Velocity is just taking the Solr response and displaying selected bits
in HTML. So assuming the information you want is in the response packet
(which you can tell just by doing the query from the browser) it's
"just" a matter of pulling it out of the response and displaying it.

Mostly, when I started down this path I poked around the velocity
directory; it was just a bit of hunting to figure things out, with
some help from the Apache Velocity page.

Not much help, but the short form is that there isn't much of an example
that I know of for your specific problem.

Erick

On Wed, Oct 8, 2014 at 8:54 AM, Edwards, Joshua
 wrote:
> Hi -
>
> I am trying to index a collection that has child documents.  I have 
> successfully loaded the data into my index using SolrJ, and I have verified 
> that I can search correctly using the "child of" method in my fq variable.  
> Now, I would like to use Velocity (Solritas) to display the parent records 
> with some details of the child records underneath.  Is there an easy way to 
> do this?  Is there an example somewhere that I can look at?
>
> Thanks,
> Josh Edwards
> 
>


Re: Custom Solr Query Post Filter

2014-10-08 Thread Christopher Gross
That did the trick!  Thanks Joel.

-- Chris

On Wed, Oct 8, 2014 at 2:05 PM, Joel Bernstein  wrote:

> The results are being cached in the QueryResultCache most likely. You need
> to implement equals() and hashCode() on the query object, which is part of
> the cache key. In your case the creds param must be included in the
> hashCode and equals logic.
>
> Joel Bernstein
> Search Engineer at Heliosearch
>
> On Wed, Oct 8, 2014 at 1:17 PM, Christopher Gross 
> wrote:
>
> > Code:
> > http://pastebin.com/tNjzDbmy
> >
> > Solr 4.9.0
> > Tomcat 7
> > Java 7
> >
> > I took Erik Hatcher's example for creating a PostFilter and have modified
> > it so it would work with Solr 4.x.  Right now it works...the first time.
> > If I were to run this query it would work right:
> >
> >
> http://localhost:8080/solr/plugintest/select?q=*:*&sort=uniqueId%20desc&fq={!classif%20creds=ABC}
> > However, if I ran this one:
> >
> >
> http://localhost:8080/solr/plugintest/select?q=*:*&sort=uniqueId%20desc&fq={!classif%20creds=XYZ}
> > I would get the results from the first query.  I could do a different
> > query, like:
> > http://localhost:8080/solr/plugintest/select?q=uniqueId[* TO
> > *]&sort=uniqueId%20desc&fq={!classif%20creds=XYZ}
> > and I'd get the XYZ tagged items.  But if I tried to find ABC with that
> > one:
> > http://localhost:8080/solr/plugintest/select?q=uniqueId[* TO
> > *]&sort=uniqueId%20desc&fq={!classif%20creds=ABC}
> > it would just list the XYZ items.
> >
> > I'm not sure what is persisting where to cause this to happen.  Anybody
> > have some tips/pointers for building filters like this for Solr 4.x?
> >
> > Thanks!
> >
> > -- Chris
> >
>


Re: Custom Solr Query Post Filter

2014-10-08 Thread Joel Bernstein
The results are being cached in the QueryResultCache most likely. You need
to implement equals() and hashCode() on the query object, which is part of
the cache key. In your case the creds param must be included in the
hashCode and equals logic.

Joel Bernstein
Search Engineer at Heliosearch

On Wed, Oct 8, 2014 at 1:17 PM, Christopher Gross  wrote:

> Code:
> http://pastebin.com/tNjzDbmy
>
> Solr 4.9.0
> Tomcat 7
> Java 7
>
> I took Erik Hatcher's example for creating a PostFilter and have modified
> it so it would work with Solr 4.x.  Right now it works...the first time.
> If I were to run this query it would work right:
>
> http://localhost:8080/solr/plugintest/select?q=*:*&sort=uniqueId%20desc&fq={!classif%20creds=ABC}
> However, if I ran this one:
>
> http://localhost:8080/solr/plugintest/select?q=*:*&sort=uniqueId%20desc&fq={!classif%20creds=XYZ}
> I would get the results from the first query.  I could do a different
> query, like:
> http://localhost:8080/solr/plugintest/select?q=uniqueId[* TO
> *]&sort=uniqueId%20desc&fq={!classif%20creds=XYZ}
> and I'd get the XYZ tagged items.  But if I tried to find ABC with that
> one:
> http://localhost:8080/solr/plugintest/select?q=uniqueId[* TO
> *]&sort=uniqueId%20desc&fq={!classif%20creds=ABC}
> it would just list the XYZ items.
>
> I'm not sure what is persisting where to cause this to happen.  Anybody
> have some tips/pointers for building filters like this for Solr 4.x?
>
> Thanks!
>
> -- Chris
>


Custom Solr Query Post Filter

2014-10-08 Thread Christopher Gross
Code:
http://pastebin.com/tNjzDbmy

Solr 4.9.0
Tomcat 7
Java 7

I took Erik Hatcher's example for creating a PostFilter and have modified
it so it would work with Solr 4.x.  Right now it works...the first time.
If I were to run this query it would work right:
http://localhost:8080/solr/plugintest/select?q=*:*&sort=uniqueId%20desc&fq={!classif%20creds=ABC}
However, if I ran this one:
http://localhost:8080/solr/plugintest/select?q=*:*&sort=uniqueId%20desc&fq={!classif%20creds=XYZ}
I would get the results from the first query.  I could do a different
query, like:
http://localhost:8080/solr/plugintest/select?q=uniqueId[* TO
*]&sort=uniqueId%20desc&fq={!classif%20creds=XYZ}
and I'd get the XYZ tagged items.  But if I tried to find ABC with that one:
http://localhost:8080/solr/plugintest/select?q=uniqueId[* TO
*]&sort=uniqueId%20desc&fq={!classif%20creds=ABC}
it would just list the XYZ items.

I'm not sure what is persisting where to cause this to happen.  Anybody
have some tips/pointers for building filters like this for Solr 4.x?

Thanks!

-- Chris


Re: solr suggester not working with shards

2014-10-08 Thread Varun Thacker
Hi,

You have defined the suggester in the old way of implementing it but you do
mention the SuggestComponent. Can you try it out using the documentation
given here - https://cwiki.apache.org/confluence/display/solr/Suggester
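
For reference, the new-style configuration from that page looks roughly like 
this (the suggester name, lookup/dictionary implementations and field are just 
placeholders -- adjust them to your schema):

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="suggestAnalyzerFieldType">string</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>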

Secondly how are you firing your queries?

On Wed, Oct 8, 2014 at 12:39 PM, rsi...@ambrac.nl  wrote:

> One more thing :
>
> suggest is not working  with multiple cores using  shard but  'did you
> mean'
> (spell check ) is working fine with multiple cores.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/solr-suggester-not-working-with-shards-tp4163261p4163265.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 


Regards,
Varun Thacker
http://www.vthacker.in/


Re: WhitespaceTokenizer to consider incorrectly encoded c2a0?

2014-10-08 Thread Jack Krupansky
The source code uses the Java Character.isWhitespace method, which 
specifically excludes the non-breaking white space characters.
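
A quick way to see it, e.g. from a scratch main():

System.out.println(Character.isWhitespace('\u00A0'));  // false -- NBSP is explicitly excluded
System.out.println(Character.isSpaceChar('\u00A0'));   // true  -- but it is still a Unicode space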


The Javadoc contract for WhitespaceTokenizer is too vague, especially since 
Unicode has so many... subtleties.


Personally, I'd go along with treating non-breaking white space as white 
space here.


And update the Lucene Javadoc contract to be more explicit.

-- Jack Krupansky

-Original Message- 
From: Markus Jelsma

Sent: Wednesday, October 8, 2014 10:16 AM
To: solr-user@lucene.apache.org ; solr-user
Subject: RE: WhitespaceTokenizer to consider incorrectly encoded c2a0?

Alexandre - i am sorry if i was not clear, this is about queries, this all 
happens at query time. Yes we can do the substitution in with the regex 
replace filter, but i would propose this weird exception to be added to 
WhitespaceTokenizer so Lucene deals with this by itself.


Markus

-Original message-

From:Alexandre Rafalovitch 
Sent: Wednesday 8th October 2014 16:12
To: solr-user 
Subject: Re: WhitespaceTokenizer to consider incorrectly encoded c2a0?

Is this a suggestion for JIRA ticket? Or a question on how to solve
it? If the later, you could probably stick a RegEx replacement in the
UpdateRequestProcessor chain and be done with it.

As to why? I would look for the rest of the MSWord-generated
artifacts, such as "smart" quotes, extra-long dashes, etc.

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 8 October 2014 09:59, Markus Jelsma  wrote:
> Hi,
>
> For some crazy reason, some users somehow manage to substitute a 
> perfectly normal space with a badly encoded non-breaking space, properly 
> URL encoded this then becomes %c2a0 and depending on the encoding you 
> use to view you probably see  followed by a space. For example:

>
> Because c2a0 is not considered whitespace (indeed, it is not real 
> whitespace, that is 00a0) by the Java Character class, the 
> WhitespaceTokenizer won't split on it, but the WordDelimiterFilter still 
> does, somehow mitigating the problem as it becomes:

>
> HTMLSCF een abonnement
> WT een abonnement
> WDF een eenabonnement abonnement
>
> Should the WhitespaceTokenizer not include this weird edge case?
>
> Cheers,
> Markus





Using Velocity with Child Documents?

2014-10-08 Thread Edwards, Joshua
Hi -

I am trying to index a collection that has child documents.  I have 
successfully loaded the data into my index using SolrJ, and I have verified 
that I can search correctly using the "child of" method in my fq variable.  
Now, I would like to use Velocity (Solritas) to display the parent records with 
some details of the child records underneath.  Is there an easy way to do this? 
 Is there an example somewhere that I can look at?

Thanks,
Josh Edwards




RE: WhitespaceTokenizer to consider incorrectly encoded c2a0?

2014-10-08 Thread Markus Jelsma
Alexandre - I am sorry if I was not clear; this is about queries, this all 
happens at query time. Yes, we can do the substitution with the regex replace 
filter, but I would propose this weird exception be added to 
WhitespaceTokenizer so Lucene deals with it by itself.
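
Until then, a query-time workaround is a char filter ahead of the tokenizer in 
the query analyzer of the affected field type -- a rough sketch (your existing 
filters would follow the tokenizer as usual):

<analyzer type="query">
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\u00A0" replacement=" "/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
</analyzer>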

Markus
 
-Original message-
> From:Alexandre Rafalovitch 
> Sent: Wednesday 8th October 2014 16:12
> To: solr-user 
> Subject: Re: WhitespaceTokenizer to consider incorrectly encoded c2a0?
> 
> Is this a suggestion for JIRA ticket? Or a question on how to solve
> it? If the later, you could probably stick a RegEx replacement in the
> UpdateRequestProcessor chain and be done with it.
> 
> As to why? I would look for the rest of the MSWord-generated
> artifacts, such as "smart" quotes, extra-long dashes, etc.
> 
> Regards,
>Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
> 
> 
> On 8 October 2014 09:59, Markus Jelsma  wrote:
> > Hi,
> >
> > For some crazy reason, some users somehow manage to substitute a perfectly 
> > normal space with a badly encoded non-breaking space, properly URL encoded 
> > this then becomes %c2a0 and depending on the encoding you use to view you 
> > probably see  followed by a space. For example:
> >
> > Because c2a0 is not considered whitespace (indeed, it is not real 
> > whitespace, that is 00a0) by the Java Character class, the 
> > WhitespaceTokenizer won't split on it, but the WordDelimiterFilter still 
> > does, somehow mitigating the problem as it becomes:
> >
> > HTMLSCF een abonnement
> > WT een abonnement
> > WDF een eenabonnement abonnement
> >
> > Should the WhitespaceTokenizer not include this weird edge case?
> >
> > Cheers,
> > Markus
> 


Re: WhitespaceTokenizer to consider incorrectly encoded c2a0?

2014-10-08 Thread Alexandre Rafalovitch
Is this a suggestion for JIRA ticket? Or a question on how to solve
it? If the later, you could probably stick a RegEx replacement in the
UpdateRequestProcessor chain and be done with it.

As to why? I would look for the rest of the MSWord-generated
artifacts, such as "smart" quotes, extra-long dashes, etc.

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 8 October 2014 09:59, Markus Jelsma  wrote:
> Hi,
>
> For some crazy reason, some users somehow manage to substitute a perfectly 
> normal space with a badly encoded non-breaking space, properly URL encoded 
> this then becomes %c2a0 and depending on the encoding you use to view you 
> probably see  followed by a space. For example:
>
> Because c2a0 is not considered whitespace (indeed, it is not real whitespace, 
> that is 00a0) by the Java Character class, the WhitespaceTokenizer won't 
> split on it, but the WordDelimiterFilter still does, somehow mitigating the 
> problem as it becomes:
>
> HTMLSCF een abonnement
> WT een abonnement
> WDF een eenabonnement abonnement
>
> Should the WhitespaceTokenizer not include this weird edge case?
>
> Cheers,
> Markus


Re: SolrCloud with client ssl

2014-10-08 Thread Sindre Fiskaa
Yes, running SolrCloud without SSL works fine with the createNodeSet param. I 
run this with the Tomcat application server and 443 enabled. Although I 
receive this error message, the collection and the shards get created and the 
clusterstate.json is updated, but the cores are missing. I manually add them 
one by one in the admin console so I get my cloud up and running, and the 
solr-nodes are able to talk to each other - no certificate issues or SSL 
handshake errors between the nodes.

curl -E solr-ssl.pem:secret12 -k
'https://vt-searchln03:443/solr/admin/collections?action=CREATE&numShards=3
&replicationFactor=2&name=multisharding&createNodeSet=vt-searchln03:443_sol
r,vt-searchln04:443_solr,vt-searchln01:443_solr,vt-searchln02:443_solr,vt-s
earchln05:443_solr,vt-searchln06:443_solr'



(status=0, QTime=206)
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://vt-searchln03:443/solr
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://vt-searchln04:443/solr
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://vt-searchln06:443/solr
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://vt-searchln05:443/solr
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://vt-searchln01:443/solr
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://vt-searchln02:443/solr



-Sindre

On 08.10.14 15:14, "Jan Høydahl"  wrote:

>Hi,
>
>I answered at https://issues.apache.org/jira/browse/SOLR-6595:
>
>* Does it work with createNodeSet when using plain SolrCloud without SSL?
>* Please provide the exact CollectionApi request you used when it failed,
>so we can see if the syntax is correct. Also, is 443 your secure port
>number in Jetty/Tomcat?
>
>...but perhaps keep the conversation going here until it is a confirmed
>bug :)
>
>--
>Jan Høydahl, search solution architect
>Cominvent AS - www.cominvent.com
>
>7. okt. 2014 kl. 06:57 skrev Sindre Fiskaa :
>
>> Followed the description
>>https://cwiki.apache.org/confluence/display/solr/Enabling+SSL and
>>generated a self signed key pair. Configured a few solr-nodes and used
>>the collection api to crate a new collection. I get error message when
>>specify the nodes with the createNodeSet param. When I don't use
>>createNodeSet param the collection gets created without error on random
>>nodes. Could this be a bug related to the createNodeSet param?
>> 
>> 
>> 
>> 0>name="QTime">185>name="failure">org.apache.solr.client.solrj.SolrServerException:IOEx
>>ception occured when talking to server
>>at:https://vt-searchln04:443/solr>3C/str%3E%3C/lst%3E>
>> 
>



WhitespaceTokenizer to consider incorrectly encoded c2a0?

2014-10-08 Thread Markus Jelsma
Hi,

For some crazy reason, some users somehow manage to substitute a perfectly 
normal space with a badly encoded non-breaking space. Properly URL-encoded, 
this then becomes %c2a0, and depending on the encoding you use to view it you 
probably see Â followed by a space. For example:

Because c2a0 is not considered whitespace (indeed, it is not real whitespace, 
that is 00a0) by the Java Character class, the WhitespaceTokenizer won't split 
on it, but the WordDelimiterFilter still does, somehow mitigating the problem 
as it becomes:

HTMLSCF een abonnement
WT een abonnement
WDF een eenabonnement abonnement

Should the WhitespaceTokenizer not include this weird edge case? 

Cheers,
Markus


Re: eDisMax parser and special characters

2014-10-08 Thread Erick Erickson
There's not much information here.
What's the doc look like?
What is the analyzer chain for it?
What is the output when you add &debug=query?

Details matter. A lot ;)

Best,
Erick

On Wed, Oct 8, 2014 at 6:26 AM, Michael Joyner  wrote:
> Try escaping special chars with a "\"
>
>
> On 10/08/2014 01:39 AM, Lanke,Aniruddha wrote:
>>
>> We are using a eDisMax parser in our configuration. When we search using
>> the query term that has a ‘-‘ we don’t get any results back.
>>
>> Search term: red - yellow
>> This doesn’t return any data back but
>>
>>
>


Re: Filter cache pollution during sharded edismax queries

2014-10-08 Thread Charlie Hull

On 01/10/2014 09:55, jim ferenczi wrote:

I think you should test with facet.shard.limit=-1 this will disallow the
limit for the facet on the shards and remove the needs for facet
refinements. I bet that returning every facet with a count greater than 0
on internal queries is cheaper than using the filter cache to handle a lot
of refinements.


I'm happy to report that in our case setting facet.limit=-1 has had a 
significant impact on performance and cache hit ratios, and has reduced CPU 
load. Thanks to all who replied!


Cheers

Charlie
Flax


Jim

2014-10-01 10:24 GMT+02:00 Charlie Hull :


On 30/09/2014 22:25, Erick Erickson wrote:


Just from a 20,000 ft. view, using the filterCache this way seems...odd.

+1 for using a different cache, but that's being quite unfamiliar with the
code.



Here's a quick update:

1. LFUCache performs worse so we returned to LRUCache
2. Making the cache smaller than the default 512 reduced performance.
3. Raising the cache size to 2048 didn't seem to have a significant effect
on performance but did reduce CPU load significantly. This may help our
client as they can reduce their system spec considerably.

We're continuing to test with our client, but the upshot is that even if
you think you don't need the filter cache, if you're doing distributed
faceting you probably do, and you should size it based on experimentation.
In our case there is a single filter but the cache needs to be considerably
larger than that!

Cheers

Charlie




On Tue, Sep 30, 2014 at 1:53 PM, Alan Woodward  wrote:





  Once all the facets have been gathered, the co-ordinating node then

asks
the subnodes for an exact count for the final top-N facets,




What's the point to refine these counts? I've thought that it make sense
only for facet.limit ed requests. Is it correct statement? can those who
suffer from the low performance, just unlimit  facet.limit to avoid that
distributed hop?



Presumably yes, but if you've got a sufficiently high cardinality field
then any gains made by missing out the hop will probably be offset by
having to stream all the return values back again.

Alan


  --

Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics












--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk






--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


Re: eDisMax parser and special characters

2014-10-08 Thread Michael Joyner

Try escaping special chars with a "\"

On 10/08/2014 01:39 AM, Lanke,Aniruddha wrote:

We are using a eDisMax parser in our configuration. When we search using the 
query term that has a ‘-‘ we don’t get any results back.

Search term: red - yellow
This doesn’t return any data back but






Re: NullPointerException for ExternalFileField when key field has no terms

2014-10-08 Thread Matthew Nigl
Thanks Markus. I initially interpreted the line "It's OK to have a keyField
value that can't be found in the index" as meaning that the key field value
in the external file does not have to exist as a term in the index.





On 8 October 2014 23:56, Markus Jelsma  wrote:

> Hi - yes it is worth a ticket as the javadoc says it is ok:
>
> http://lucene.apache.org/solr/4_10_1/solr-core/org/apache/solr/schema/ExternalFileField.html
>
>
> -Original message-
> > From:Matthew Nigl 
> > Sent: Wednesday 8th October 2014 14:48
> > To: solr-user@lucene.apache.org
> > Subject: NullPointerException for ExternalFileField when key field has
> no terms
> >
> > Hi,
> >
> > I use various ID fields as the keys for various ExternalFileField fields,
> > and I have noticed that I will sometimes get the following error:
> >
> > ERROR org.apache.solr.servlet.SolrDispatchFilter  û
> > null:java.lang.NullPointerException
> > at
> >
> org.apache.solr.search.function.FileFloatSource.getFloats(FileFloatSource.java:273)
> > at
> >
> org.apache.solr.search.function.FileFloatSource.access$000(FileFloatSource.java:51)
> > at
> >
> org.apache.solr.search.function.FileFloatSource$2.createValue(FileFloatSource.java:147)
> > at
> >
> org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:190)
> > at
> >
> org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:141)
> > at
> >
> org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:84)
> > at
> >
> org.apache.solr.response.transform.ValueSourceAugmenter.transform(ValueSourceAugmenter.java:95)
> > at
> >
> org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:252)
> > at
> >
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:170)
> > at
> >
> org.apache.solr.response.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:184)
> > at
> >
> org.apache.solr.response.JSONWriter.writeNamedList(JSONResponseWriter.java:300)
> > at
> >
> org.apache.solr.response.JSONWriter.writeResponse(JSONResponseWriter.java:96)
> > at
> >
> org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:61)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:765)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:426)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
> > at
> >
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> > at
> >
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> > at
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> > at
> >
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> > at
> >
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> > at
> >
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> > at
> > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> > at
> >
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> > at
> >
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> > at
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> > at
> >
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> > at
> >
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> > at
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> > at org.eclipse.jetty.server.Server.handle(Server.java:368)
> > at
> >
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> > at
> >
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
> > at
> >
> org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
> > at
> >
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
> > at
> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
> > at
> > org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
> > at
> >
> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
> > at
> >
> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
> > at
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> >  

Re: SolrCloud with client ssl

2014-10-08 Thread Jan Høydahl
Hi,

I answered at https://issues.apache.org/jira/browse/SOLR-6595:

* Does it work with createNodeSet when using plain SolrCloud without SSL?
* Please provide the exact CollectionApi request you used when it failed, so we 
can see if the syntax is correct. Also, is 443 your secure port number in 
Jetty/Tomcat?

...but perhaps keep the conversation going here until it is a confirmed bug :)

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

7. okt. 2014 kl. 06:57 skrev Sindre Fiskaa :

> Followed the description 
> https://cwiki.apache.org/confluence/display/solr/Enabling+SSL and generated a 
> self signed key pair. Configured a few solr-nodes and used the collection api 
> to crate a new collection. I get error message when specify the nodes with 
> the createNodeSet param. When I don't use createNodeSet param the collection 
> gets created without error on random nodes. Could this be a bug related to 
> the createNodeSet param?
> 
> 
> 
> (status=0, QTime=185) failure: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://vt-searchln04:443/solr
> 



RE: NullPointerException for ExternalFileField when key field has no terms

2014-10-08 Thread Markus Jelsma
Hi - yes it is worth a ticket as the javadoc says it is ok:
http://lucene.apache.org/solr/4_10_1/solr-core/org/apache/solr/schema/ExternalFileField.html
 
 
-Original message-
> From:Matthew Nigl 
> Sent: Wednesday 8th October 2014 14:48
> To: solr-user@lucene.apache.org
> Subject: NullPointerException for ExternalFileField when key field has no 
> terms
> 
> Hi,
> 
> I use various ID fields as the keys for various ExternalFileField fields,
> and I have noticed that I will sometimes get the following error:
> 
> ERROR org.apache.solr.servlet.SolrDispatchFilter -
> null:java.lang.NullPointerException
> at
> org.apache.solr.search.function.FileFloatSource.getFloats(FileFloatSource.java:273)
> [the rest of the quoted message is identical to the original post below]

NullPointerException for ExternalFileField when key field has no terms

2014-10-08 Thread Matthew Nigl
Hi,

I use various ID fields as the keys for various ExternalFileField fields,
and I have noticed that I will sometimes get the following error:

ERROR org.apache.solr.servlet.SolrDispatchFilter -
null:java.lang.NullPointerException
at
org.apache.solr.search.function.FileFloatSource.getFloats(FileFloatSource.java:273)
at
org.apache.solr.search.function.FileFloatSource.access$000(FileFloatSource.java:51)
at
org.apache.solr.search.function.FileFloatSource$2.createValue(FileFloatSource.java:147)
at
org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:190)
at
org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:141)
at
org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:84)
at
org.apache.solr.response.transform.ValueSourceAugmenter.transform(ValueSourceAugmenter.java:95)
at
org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:252)
at
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:170)
at
org.apache.solr.response.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:184)
at
org.apache.solr.response.JSONWriter.writeNamedList(JSONResponseWriter.java:300)
at
org.apache.solr.response.JSONWriter.writeResponse(JSONResponseWriter.java:96)
at
org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:61)
at
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:765)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:426)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Unknown Source)



The source code referenced in the error is below (FileFloatSource.java:273):

TermsEnum termsEnum = MultiFields.getTerms(reader, idName).iterator(null);

So if there are no terms in the index for the key field, then getTerms will
return null, and of course trying to call iterator on null will cause the
exception.
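
A minimal null-safe variant of that line (only a sketch against the Lucene 4.x
org.apache.lucene.index Terms/MultiFields/TermsEnum API, not a tested patch) would
fall back to the configured default value when the key field has no terms yet:

    Terms terms = MultiFields.getTerms(reader, idName);
    if (terms == null) {
      // No terms indexed for the key field yet; every document keeps the
      // default value the float array was pre-filled with.
      return vals;
    }
    TermsEnum termsEnum = terms.iterator(null);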

For my use-case, it makes sense that the key field may have no terms
(initially) because there are various types of documents sharing the index,
and they will not all exist at the onset. The default value for the EFF
would suffice in those cases.

Is this worthy of a JIRA? I have gone through whatever documentation I can
find for ExternalFileField and I can't seem to find anything about
requiring key terms first. It seems that this error is not encountered
often because users generally set the unique key field as the external file
key field, so it always exists.

The workaround is to ensure at least one document with a value in the key field is
indexed before the external file is used.

How to link tables based on range values solr data-config

2014-10-08 Thread madhav bahuguna
Hi,

 Businessmasters
 Business_id   Business_point
 1             3.4
 2             2.8
 3             8.0

 Business_Colors
 business_colors_id   business_rating_from   business_rating_to   rating
 1                    2                      5                    OK
 2                    5                      10                   GOOD
 3                    10                     15                   Excellent

I want to link the two tables based on business_rating_from and
business_rating_to, like:

SELECT business_colors_id, business_rating_from, business_rating_to, rating
FROM Business_Colors
WHERE business_rating_from >= 2 AND business_rating_to < 5;

Now I want to index them into Solr. This is how my data-config file looks:

[the data-config.xml entity definitions were stripped from the archived message]
When I run a full import, the data does not get indexed and no error is shown.
What is wrong with this? Can anyone help and advise how I can achieve what
I want to do?
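
For reference, this kind of range join is usually expressed in a DIH data-config
as a nested entity whose query refers to the parent row. The snippet below is only
a sketch built from the table and column names above (the Solr field names and the
omitted dataSource definition are made-up assumptions), not the configuration that
was actually tried:

<dataConfig>
  <document>
    <entity name="business"
            query="SELECT Business_id, Business_point FROM Businessmasters">
      <field column="Business_id" name="business_id"/>
      <field column="Business_point" name="business_point"/>
      <!-- pick the rating whose range contains this business's points -->
      <entity name="color"
              query="SELECT rating FROM Business_Colors
                     WHERE business_rating_from &lt;= ${business.Business_point}
                       AND business_rating_to &gt; ${business.Business_point}">
        <field column="rating" name="business_rating"/>
      </entity>
    </entity>
  </document>
</dataConfig>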

I also have this question posted on Stack Overflow:
http://stackoverflow.com/questions/26256344/how-to-link-tables-based-on-range-values-solr-data-config
-- 
Regards
Madhav Bahuguna


RE: Solr configuration, memory usage and MMapDirectory

2014-10-08 Thread Simon Fairey
Hi

I'm currently setting up jconsole, but as I have to monitor remotely (no GUI 
capability on the server) I have to wait before I can restart Solr with a JMX port 
configured. In the meantime I looked at top and, applying the calculations you 
described to this top output of my Java process from the node that handles the 
querying (the indexing node has a similar memory profile):

https://www.dropbox.com/s/pz85dm4e7qpepco/SolrTop.png?dl=0
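
(For reference, exposing JMX remotely usually just needs the standard JVM flags
along these lines added to the Solr start parameters - the port number here is an
arbitrary example, and disabling auth/SSL is only sensible on a trusted network:

    -Dcom.sun.management.jmxremote
    -Dcom.sun.management.jmxremote.port=18983
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.ssl=false
)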

It would seem I need a monstrously large heap in the 60GB region?

We do use a lot of facet navigators/filters, so I have set the caches to be quite 
large for these - is that what is using up the memory?

Thanks

Si

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: 06 October 2014 16:56
To: solr-user@lucene.apache.org
Subject: Re: Solr configuration, memory usage and MMapDirectory

On 10/6/2014 9:24 AM, Simon Fairey wrote:
> I've inherited a Solr config and am doing some sanity checks before 
> making some updates, I'm concerned about the memory settings.
>
> System has 1 index in 2 shards split across 2 Ubuntu 64 bit nodes, 
> each node has 32 CPU cores and 132GB RAM, we index around 500k files a 
> day spread out over the day in batches every 10 minutes, a portion of 
> these are updates to existing content, maybe 5-10%. Currently 
> MergeFactor is set to 2 and commit settings are:
>
> 
>     <autoCommit>
>       <maxTime>6</maxTime>
>       <openSearcher>false</openSearcher>
>     </autoCommit>
> 
>     <autoSoftCommit>
>       <maxTime>90</maxTime>
>     </autoSoftCommit>
> 
>
> Currently each node has around 25M docs in with an index size of 45GB, 
> we prune the data every few weeks so it never gets much above 35M docs 
> per node.
>
> On reading I've seen a recommendation that we should be using 
> MMapDirectory, currently it's set to NRTCachingDirectoryFactory.
> However currently the JVM is configured with -Xmx131072m, and for 
> MMapDirectory I've read you should use less memory for the JVM so 
> there is more available for the OS caching.
>
> Looking at the dashboard in the JVM memory usage I see:
>
> [screenshot of the admin dashboard JVM-memory graph; image not included]
>
> Not sure I understand the 3 bands, assume 127.81 is Max, dark grey is 
> in use at the moment and the light grey is allocated as it was used 
> previously but not been cleaned up yet?
>
> I'm trying to understand if this will help me know how much would be a 
> good value to change Xmx to, i.e. say 64GB based on light grey?
>
> Additionally once I've changed the max heap size is it a simple case 
> of changing the config to use MMapDirectory or are there things i need 
> to watch out for?
>

NRTCachingDirectoryFactory is a wrapper directory implementation. The wrapped 
Directory implementation is used with some code between that implementation and 
the consumer (Solr in this case) that does caching for NRT indexing.  The 
wrapped implementation is MMapDirectory, so you do not need to switch, you ARE 
using MMap.

Attachments rarely make it to the list, and that has happened in this case, so 
I cannot see any of your pictures.  Instead, look at one of mine, and the 
output of a command from the same machine, running Solr
4.7.2 with Oracle Java 7:

https://www.dropbox.com/s/91uqlrnfghr2heo/solr-memory-sorted-top.png?dl=0

[root@idxa1 ~]# du -sh /index/solr4/data/
64G /index/solr4/data/

I've got 64GB of index data on this machine, used by about 56 million 
documents.  I've also got 64GB of RAM.  The solr process shows a virtual memory 
size of 54GB, a resident size of 16GB, and a shared size of 11GB.  My max heap 
on this process is 6GB.  If you deduct the shared memory size from the resident 
size, you get about 5GB.  The admin dashboard for this machine says the current 
max heap size is 5.75GB, so that 5GB is pretty close to that, and probably 
matches up really well when you consider that the resident size may be 
considerably more than 16GB and the shared size may be just barely over 11GB.

My system has well over 9GB free memory and 44GB is being used for the OS disk 
cache.  This system is NOT facing memory pressure.  The index is well-cached 
and there is even memory that is not used *at all*.

With an index size of 45GB and 132GB of RAM, you're unlikely to be having 
problems with memory unless your heap size is *ENORMOUS*.  You
*should* have your garbage collection highly tuned, especially if your max heap is 
larger than 2 or 3GB. I would guess that a 4 to 6GB heap is probably enough 
for your needs, unless you're doing a lot with facets, sorting, or Solr's 
caches, then you may need more.  Here's some info about heap requirements, 
followed by information about garbage collection tuning:

http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
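
As a rough illustration of the kind of tuning that page walks through (an example
starting point only, not a recommendation for your exact workload):

    -Xms6g -Xmx6g -XX:+UseG1GC -XX:MaxGCPauseMillis=250 -XX:+ParallelRefProcEnabled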

Your automatic commit settings do not raise any red flags with me. 
Those are sensible settings.

Thanks,
Shawn




Re: dismax query does not match with additional field in qf

2014-10-08 Thread Andreas Hubold
The query is not from a real use-case. We used it to test edge cases. I 
just asked to better understand the parser as its behavior did not match 
my expectations.


Anyway, one use-case I can think of is a free search field for end-users 
where they can search in both ID and text fields including phrases - 
without specifying whether their query is an ID or full-text. Users 
typically just expect the "right thing" to happen. So application 
developers have to be aware of such effects. Maybe the newer simple 
query parser would be a better fit for us.


There were also some good comments in SOLR-6602, especially a link to 
SOLR-3085 which describes a more realistic case with stopword removal.


Thanks everybody!

Regards,
Andreas

Jack Krupansky wrote on 10/07/2014 06:16 PM:
Your query term seems particularly inappropriate for dismax - think 
simple keyword queries.


Also, don't confuse dismax and edismax - maybe you want the latter. 
The former is for... simple keyword queries.


I'm still not sure what your actual use case really is. In particular, 
are you trying to do a full, exact match on the string field, or a 
substring match? You can do the latter with wildcards or regex, but 
normally the former (exact match) is used.


Maybe simply enclosing the complex term in quotes to make it a phrase 
query is what you need - that would do an exact match on the string 
field, but a tokenized phrase match on the text field, and support 
partial matches on the text field as a phrase of contiguous terms.


-- Jack Krupansky

-Original Message- From: Andreas Hubold
Sent: Tuesday, October 7, 2014 12:08 PM
To: solr-user@lucene.apache.org
Subject: Re: dismax query does not match with additional field in qf

Okay, sounds reasonable. However I didn't expect this when reading the
documentation of the dismax query parser.

Especially the need to escape special characters (and which ones) was
not clear to me as the dismax query parser "is designed to process
simple phrases (without complex syntax) entered by users" and "special
characters (except AND and OR) are escaped" by the parser - as written
on 
https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser


Do you know if the new Simple Query Parser has the same behaviour when
searching across multiple fields? Or could it be used instead to search
across "text_general" and "string" fields of arbitrary content without
additional query preprocessing to get results for matches in any of
these fields (as in field1:STUFF OR field2:STUFF).

Thank you,
Andreas

Jack Krupansky wrote on 10/07/2014 05:24 PM:
I think what is happening is that your last term, the naked 
apostrophe, analyzes to zero terms and is simply ignored, but 
when you add the extra field, a string field, you now have another 
term in the query, and you have mm set to 100%, so that "new" term 
must match. It probably fails because you have no naked apostrophe 
term in that field in the index.


Probably none of your string field terms were matching before, but 
that wasn't apparent since the tokenized text matched. But with this 
naked apostrophe term, there is no way to tell Lucene to match "no" 
term, so it required the string term to match, which won't happen 
since only the full string is indexed.


Generally, you need to escape all special characters in a query. Then 
hopefully your string field will match.
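
If the query string is built programmatically, SolrJ ships a helper for exactly
this; a minimal sketch (how the escaped value is wired into your request is left
to your application):

    import org.apache.solr.client.solrj.util.ClientUtils;

    String raw = "abc_<iframe src='loadLocale.js' onload='javascript:document.XSSed=\"name\"'";
    // Backslash-escapes Lucene query-syntax characters (:, ", -, etc.) so the
    // whole value reaches the query parser as literal text.
    String escaped = ClientUtils.escapeQueryChars(raw);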


-- Jack Krupansky

-Original Message- From: Andreas Hubold
Sent: Tuesday, September 30, 2014 11:14 AM
To: solr-user@lucene.apache.org
Subject: dismax query does not match with additional field in qf

Hi,

I ran into a problem with the Solr dismax query parser. We're using Solr
4.10.0 and the field types mentioned below are taken from the example
schema.xml.

In a test we have a document with rather strange content in a field
named "name_tokenized" of type "text_general":

abc_<iframe src='loadLocale.js' onload='javascript:document.XSSed="name"' width=0 height=0>


(It's a test for XSS bug detection, but that doesn't matter here.)

I can find the document when I use the following dismax query with qf
set to field "name_tokenized" only:

http://localhost:44080/solr/studio/editor?deftype=dismax&q=abc_%3Ciframe+src%3D%27loadLocale.js%27+onload%3D%27javascript%3Adocument.XSSed%3D%22name%22%27&debug=true&echoParams=all&qf=name_tokenized^2 



If I submit exactly the same query but add another field "feederstate"
to the qf parameter, I don't get any results anymore. The field is of
type "string".

http://localhost:44080/solr/studio/editor?deftype=dismax&q=abc_%3Ciframe+src%3D%27loadLocale.js%27+onload%3D%27javascript%3Adocument.XSSed%3D%22name%22%27&debug=true&echoParams=all&qf=name_tokenized^2%20feederstate 



The decoded value of q is: abc_<iframe src='loadLocale.js' onload='javascript:document.XSSed="name"'

The parsed query reported by debug starts with:

DisjunctionMaxQuery((feederstate:abc_<iframe | ((name_tokenized:abc_ name_tokenized:iframe)^2.0))~0.1)
DisjunctionMaxQuery((feederstate:src='loadLocale.js' | ((name_tokenized:src name_tokenized:loadlocale.js)^2.0))~0.1)
DisjunctionMaxQuery((feederstate:onlo

Re: eDisMax parser and special characters

2014-10-08 Thread Aman Tandon
Hi,

It seems to me like there is a difference between the tokens generated at query
time and at index time. Can you tell us the field type and the analyzers you
are using to index that field?
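
(As a quick check on the query-parser side only - the analyzer chain is what
really decides whether the terms match - escaping or phrase-quoting the hyphen
usually stops it from being interpreted as an operator, e.g.:

    q=red \- yellow
    q="red - yellow"
)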

With Regards
Aman Tandon

On Wed, Oct 8, 2014 at 11:09 AM, Lanke,Aniruddha  wrote:

> We are using a eDisMax parser in our configuration. When we search using
> the query term that has a ‘-‘ we don’t get any results back.
>
> Search term: red - yellow
> This doesn’t return any data back but
>
> Search term: red yellow
> Will give back result ‘red - yellow’
>
> How does eDisMax treat special characters?
> What tweaks do we need to do, so when a user enters a ‘-‘ in the query
> e.g. red - yellow, we
> get the appropriate result back?
>
> Thanks,
>
> CONFIDENTIALITY NOTICE This message and any included attachments are from
> Cerner Corporation and are intended only for the addressee. The information
> contained in this message is confidential and may constitute inside or
> non-public information under international, federal, or state securities
> laws. Unauthorized forwarding, printing, copying, distribution, or use of
> such information is strictly prohibited and may be unlawful. If you are not
> the addressee, please promptly delete this message and notify the sender of
> the delivery error by e-mail or you may call Cerner's corporate offices in
> Kansas City, Missouri, U.S.A at (+1) (816)221-1024.
>


Re: solr suggester not working with shards

2014-10-08 Thread rsi...@ambrac.nl
One more thing:

The suggester is not working with multiple cores using shards, but 'did you mean'
(spell check) is working fine with multiple cores.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-suggester-not-working-with-shards-tp4163261p4163265.html
Sent from the Solr - User mailing list archive at Nabble.com.