Re: Issues sending mail to the list

2015-07-23 Thread Upayavira
Be sure to be sending plain text emails, not HTML, and watch out for things that could be considered spam. Apache mail servers do receive a LOT of spam, so need to have relatively aggressive spam filters in place. Upayavira On Thu, Jul 23, 2015, at 07:29 PM, Steven White wrote: Hi Everyone,

Re: Using payloads and user provided data in score

2015-07-23 Thread Jamie Johnson
Well you've at least confirmed what I was thinking :). I am using payloads now for this and I think I have something very basic working. The results don't get dropped out when the scores are 0 so I had to also write a custom collector that could be plugged into the AnalyticQueryAPI (maybe there

Re: Running SOLR 5.2.1 on Embedded Jetty

2015-07-23 Thread Shawn Heisey
On 7/23/2015 3:14 PM, Darin Amos wrote: I have been trying to run the SOLR war with embedded Jetty and can’t seem to get the config quite right. Is there any known documentation on this or is someone else doing this? I seem to just be setting up a document server at my solr.home directory.

How to connect Solr with Impala?

2015-07-23 Thread Rex X
Given the following Impala query: SELECT date, SUM(CAST(price AS DOUBLE)) AS price FROM table WHERE date='2014-01-01' AND store_id IN(1,2,3) GROUP BY date; To work with Solr 1. Will it be more efficient to directly use equivalent Solr query? Any curl command equivalent to the

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Aaron Gibbons
Ah, now we're on to something! Solr 4.10.0 is also using the same zookeepers, and both are using Oracle Java 8 JRE. Did some research and uploaded a new config to zookeeper using chroot to isolate them. Changed the init script to have ZK_Host=zk1,zk2,zk3/DevConfigs. I did see that you should
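A minimal sketch of how a ZooKeeper connection string with a chroot is put together, as an illustration of the isolation step described in this message. The function names are hypothetical; the only behavior assumed is that the chroot suffix is appended once, after the last host, and applies to the whole ensemble.

```python
def build_zk_host(hosts, chroot=""):
    """Join ZooKeeper hosts and append one chroot suffix for the whole ensemble."""
    joined = ",".join(hosts)
    if not chroot:
        return joined
    # The chroot is written once, after the final host, not per host.
    return joined + "/" + chroot.strip("/")


def split_zk_host(zk_host):
    """Split a connection string back into (hosts, chroot)."""
    hosts_part, slash, chroot = zk_host.partition("/")
    return hosts_part.split(","), ("/" + chroot if slash else "")


print(build_zk_host(["zk1", "zk2", "zk3"], "DevConfigs"))
```

Two Solr clusters that share one ZooKeeper ensemble would each use a different chroot so that their configs and cluster state do not collide.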

Running SOLR 5.2.1 on Embedded Jetty

2015-07-23 Thread Darin Amos
Hello, I have been trying to run the SOLR war with embedded Jetty and can’t seem to get the config quite right. Is there any known documentation on this or is someone else doing this? I seem to just be setting up a document server at my solr.home directory. The code snippet below seems

RE: cache implementation?

2015-07-23 Thread cbuxbaum
Hi Shawn, Thanks for your help. I settled on the following solution, which I am in the process of testing out: <entity name="LEAP_PARTY" pk="LEAP_PARTY_ID" query="SELECT DISTINCT 'LEAP_PARTY' AS DOCUMENT_TYPE, VPARTY.OWNER AS PARTY_OWNER,

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Upayavira
Yay! On Thu, Jul 23, 2015, at 10:13 PM, Aaron Gibbons wrote: Ah, now we're on to something! Solr 4.10.0 is also using the same zookeepers, and both are using Oracle Java 8 JRE. Did some research and uploaded a new config to zookeeper using chroot to isolate them. Changed the init script to

Re: Issues sending mail to the list

2015-07-23 Thread Erick Erickson
sometimes when echoing back the whole thread it looks like spam On Thu, Jul 23, 2015 at 1:42 PM, Steven White swhite4...@gmail.com wrote: Three emails to the existing subject of Basic auth didn't make it. As you may have seen, I started a new email thread on this subject under Basic Auth

Re: cache implementation?

2015-07-23 Thread Shawn Heisey
On 7/23/2015 10:55 AM, cbuxbaum wrote: Say we have 100 party records. Then the child SQL will be run 100 times (once for each party record). Isn't there a way to just run the child SQL on all of the party records at once with a join, using a GROUP BY and ORDER BY on the PARTY_ID?
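The trade-off raised in this question (one child query per party record versus a single joined query) can be sketched outside Solr. This is only an illustration using an in-memory SQLite database; the `party` and `party_contact` tables and their columns are hypothetical, and the grouping is done in client code after an ORDER BY rather than with SQL GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE party (party_id INTEGER PRIMARY KEY, owner TEXT);
CREATE TABLE party_contact (party_id INTEGER, contact TEXT);
INSERT INTO party VALUES (1, 'acme'), (2, 'globex');
INSERT INTO party_contact VALUES (1, 'alice'), (1, 'bob'), (2, 'carol');
""")


def children_per_parent(conn):
    """Run the child query once per parent row (N parents -> N child queries)."""
    parents = conn.execute("SELECT party_id FROM party ORDER BY party_id").fetchall()
    result = {}
    for (party_id,) in parents:
        rows = conn.execute(
            "SELECT contact FROM party_contact WHERE party_id = ? ORDER BY contact",
            (party_id,),
        ).fetchall()
        result[party_id] = [contact for (contact,) in rows]
    return result


def children_joined(conn):
    """Fetch all children in one joined query, then group rows by parent id."""
    rows = conn.execute(
        "SELECT p.party_id, c.contact "
        "FROM party p JOIN party_contact c ON c.party_id = p.party_id "
        "ORDER BY p.party_id, c.contact"
    ).fetchall()
    result = {}
    for party_id, contact in rows:
        result.setdefault(party_id, []).append(contact)
    return result


print(children_per_parent(conn) == children_joined(conn))
```

Both functions return the same mapping; the second issues a single statement regardless of how many party records exist.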

Re: Basic Auth (again)

2015-07-23 Thread Peter Sturge
Hi Steve, We've not yet moved to Solr 5, but we do use Jetty 9. In any case, Basic Auth is a Jetty thing, not a Solr thing. We do use this mechanism to great effect to secure things like index writers and such, and it does work well once it's setup. Jetty, as with all containers, is a bit fussy

Re: serious JSON Facet bug

2015-07-23 Thread Harry Yoo
Is there a way to patch? I am using 5.2.1 and using json facet in production. On Jul 16, 2015, at 1:43 PM, Yonik Seeley ysee...@gmail.com wrote: To anyone using the JSON Facet API in released Solr versions: I discovered a serious memory leak while doing performance benchmarks (see

Re: Different scores for the same search

2015-07-23 Thread Upayavira
There's a few odd things here, looking at your explains output: First search, first result: 6.357613 = (MATCH) weight(description:jackshaft in 3339) [DefaultSimilarity], result of: 6.357613 = fieldWeight in 3339, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0

Re: Inconsistent Solr Search Results

2015-07-23 Thread Erick Erickson
Same issue really as long as there's more than one replica/shard. Tied scores are broken by internal ID, specifying a secondary sort should regularize things. Best, Erick On Thu, Jul 23, 2015 at 10:01 AM, Tarala, Magesh mtar...@bh.com wrote: Erick, The 3 node cluster is setup to use 3 shards

Re: serious JSON Facet bug

2015-07-23 Thread Yonik Seeley
On Thu, Jul 23, 2015 at 5:00 PM, Harry Yoo hyunat...@gmail.com wrote: Is there a way to patch? I am using 5.2.1 and using json facet in production. First you should see if your queries tickle the bug... check the size of the filter cache from the admin screen (under plugins, filterCache) and see

Re: serious JSON Facet bug

2015-07-23 Thread Nagasharath
I don't have this issue. I have tried with various json facet queries and my filter cache always comes down to the 'minsize' (never exceeds configured) with solr version 5.2.1, and all my queries are json nested faceted. On 23-Jul-2015, at 7:43 pm, Yonik Seeley ysee...@gmail.com wrote: On

Re: Unexpected docvalues type error using result grouping - Use UninvertingReader or index with docvalues

2015-07-23 Thread William Bell
You could try stopping SOLR, going into the data directory and rm -rf * and starting SOLR again. Did you use the schema REST api? Residual ? On Thu, Jul 23, 2015 at 6:57 PM, Shamik Bandopadhyay sham...@gmail.com wrote: Hi, I'm facing this weird error while running result grouping queries.

Unexpected docvalues type error using result grouping - Use UninvertingReader or index with docvalues

2015-07-23 Thread Shamik Bandopadhyay
Hi, I'm facing this weird error while running result grouping queries. This started when I turned on docvalues for an existing facet field and indexed the documents. Looking at the exception, I reverted back the change and re-indexed the documents again. But I'm still getting the exception,

Re: Different scores for the same search

2015-07-23 Thread Upayavira
What it looks like is kinda as Erick suggested - the scores are the same for some docs, so it probably depends upon which order they come back from the shards as to which will be shown first. If the issue is that the score is the same for some docs, try adding a deliberate sort: sort=score
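The effect of a deliberate secondary sort can be sketched in a few lines. This is a toy model, not Solr code: the document list and the `stable_order` helper are made up for illustration, and the ordering mimics `sort=score desc,id asc`.

```python
def stable_order(docs):
    """Order by score descending, then by id ascending to break ties."""
    return sorted(docs, key=lambda d: (-d["score"], d["id"]))


# Two documents share the same score, so arrival order alone is ambiguous.
docs = [
    {"id": "c", "score": 6.36},
    {"id": "a", "score": 6.36},
    {"id": "b", "score": 7.10},
]

first_run = [d["id"] for d in stable_order(docs)]
second_run = [d["id"] for d in stable_order(list(reversed(docs)))]
print(first_run == second_run)
```

Without the tie-breaker, the relative order of the two tied documents would depend on the order in which they arrive, which is the behavior described in this thread.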

RE: Different scores for the same search

2015-07-23 Thread Tarala, Magesh
I added the explicit sort: http://server1.domain.com:8983/solr/serviceorder_shard1_replica2/select?q=description%3Ajackshaft&fl=service_order&wt=json&indent=true&debugQuery=true&sort=score%20desc,id%20asc Still seeing the same behavior - inconsistent results: First time I run:

How to:- Extending Tika within Solr

2015-07-23 Thread Aditya Dhulipala
Hi, I have implemented a new file-type parser for Tika. It parses a custom filetype (*.mx). I would like my Solr instance to use my version of Tika with the mx parser. I found this by a google search https://lucidworks.com/blog/extending-apache-tika-capabilities/ But it seems to be over 5

Re: Basic Auth (again)

2015-07-23 Thread Steven White
Hi Peter, I'm on Solr 5.2.1 which comes with Jetty 9.2. I'm setting this up on Windows 2012 but will need to do the same on Linux too. I followed the steps in this link: https://wiki.apache.org/solr/SolrSecurity#Jetty_realm_example very much by the book. Here are the changes I made: File:

Re: Issues sending mail to the list

2015-07-23 Thread Steven White
Three emails to the existing subject of Basic auth didn't make it. As you may have seen, I started a new email thread on this subject under Basic Auth (again) and now they are making it to the list. I don't know what to make of this. Steve On Thu, Jul 23, 2015 at 4:31 PM, Upayavira

Re: Using payloads and user provided data in score

2015-07-23 Thread Jamie Johnson
Sorry for being vague, I'll try to explain more. In my use case a particular field does not have a security control, it's the data in the field. So for instance if I had a schema with a field called name, there could be data that should be secured at A, B, AB, A|B, etc within that field. So

Re: Solr Clustering Issue

2015-07-23 Thread Upayavira
I've seen something like this on another system - where the OR is consumed as a query term rather than an operator. Remember that Edismax will use the Lucene query parser (which supports OR, etc) unless there is an exception, and defer to dismax if there is a syntax error. What I'd suggest here

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Upayavira
I'd still like to just confirm that you're using the same Java for running Solr and for running bin/solr. When you run bin/solr you are doing that on the instance itself? You show a collections API URL below. Does that fail the same way? Basically, the exception you showed was a SolrJ error.

Re: Solr Clustering Issue

2015-07-23 Thread Joseph Obernberger
Hi Upayavira - the URL was: http://server1:9100/solr/MYCOL1/clustering?q=Collection:(COLLECT1008+OR+COLLECT2587)+AND+(amazon+AND+soap)&wt=json&indent=true&clustering=true&rows=1&df=FULL_DOCUMENT&debugQuery=true Here is the relevant part of the response - notice that the default field (FULL_DOCUMENT)

Re: Solr Clustering Issue

2015-07-23 Thread Shawn Heisey
On 7/23/2015 7:51 AM, Joseph Obernberger wrote: Hi Upayavira - the URL was: http://server1:9100/solr/MYCOL1/clustering?q=Collection:(COLLECT1008+OR+COLLECT2587)+AND+(amazon+AND+soap)&wt=json&indent=true&clustering=true&rows=1&df=FULL_DOCUMENT&debugQuery=true Here is the relevant part of the

Re: solr blocking and client timeout issue

2015-07-23 Thread Jeremy Ashcraft
A quick follow up, after finding and eliminating some code that was generating multiple update requests per second, applying the CMS GC tuning options, and upgrading to Java 8, we've not experienced a single long term GC pause. The java 8 upgrade got rid of the final couple of pauses during

Re: Per-document and per-query analysis

2015-07-23 Thread Alessandro Benedetti
Markus, the first idea that comes to my mind is this: 1) you configure your schema, creating your field types, and if necessary fields associated 2) you build an UpdateRequestProcessor that does a conditional check per document, and creates the proper fields starting from one input field. In this

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Aaron Gibbons
*When you run bin/solr you are doing that on the instance itself? * Yes *You show a collections API URL below. Does that fail the same way?* Error from API: 50042java.io.InvalidClassException: org.apache.solr.client.solrj.SolrResponse; local class incompatible: stream classdesc serialVersionUID =

Per-document and per-query analysis

2015-07-23 Thread Markus Jelsma
Hello - the title says it all. When indexing a document, we need to run one or more additional filters depending on the value of a specific field. Likewise, we need to run that same filter over the already analyzed tokens when querying. This is not going to work if i extend TextField, at all.

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Aaron Gibbons
I originally started using Ansible playbooks which did install the JDK (with the same error), but have been doing manual installs to take Ansible completely out of the equation. Safari wasn't showing the XML response so I ran this in Chrome..

Inconsistent Solr Search Results

2015-07-23 Thread Tarala, Magesh
I have about 15K documents in a 3 node solr cluster. When I execute a simple search, I get the results in different order every time I search. But the number of records is the same. Here's the definition for the field. Any ideas, suggestions would be greatly appreciated. fieldType

Re: solr blocking and client timeout issue

2015-07-23 Thread Erick Erickson
Thanks for letting us know how it turned out. Too often I'm never sure what actually _worked_ Erick On Thu, Jul 23, 2015 at 8:56 AM, Jeremy Ashcraft jashcr...@edgate.com wrote: A quick follow up, after finding and eliminating some code that was generating multiple update requests per

cache implementation?

2015-07-23 Thread cbuxbaum
Hi, We are trying to improve the performance of our data import. We tried using the CachedSqlEntityProcessor implementation, but that is apparently broken. I am looking at the workaround described below: https://issues.apache.org/jira/browse/SOLR-3857

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Upayavira
Have you tried it with a JDK? I tend to use JDK rather than JRE, but don't recall whether this is a specific requirement for Solr. Can you show the URL you use for the API, and the JSON/XML response you get? I wouldn't expect to see mention of solrj in the API because it isn't used. Just for the

Re: XSLT with maps

2015-07-23 Thread Sreekant Sreedharan
That worked for most of my attributes. I have only one issue to fix. How would I convert boolean values to integers? For example: <doc> ... <bool name="pr">false</bool> </doc> to <ID pr="0"></ID> Is that possible as well? On that note, what version of XSLT should I assume SOLR supports?

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Aaron Gibbons
I've mainly used Oracle Java 8, but tested 7 also. Typically I'll wipe the machines and start from scratch before installing a different version. The latest attempt followed these steps exactly on each machine: - sudo apt-get install python-software-properties - sudo add-apt-repository

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Upayavira
Hmmm, what other Solr nodes do you have connected to Zookeeper? Are any of them running a different Java or Solr version? It looks like you have another node connected to your Zookeeper that has taken the role of overseer and it is sending back serialized java objects that your own node cannot

Re: XSLT with maps

2015-07-23 Thread Upayavira
<xsl:template match="doc"> <ID NewID="..."> <xsl:apply-templates select="pr"/> </ID> </xsl:template> <xsl:template match="bool[.='false']"> <xsl:attribute name="{@name}">0</xsl:attribute> </xsl:template> <xsl:template match="bool[.='true']"> <xsl:attribute name="{@name}">1</xsl:attribute> </xsl:template> Note, if you find XSLT
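For readers who want to check the intended result outside an XSLT processor, the same boolean-to-integer mapping can be sketched in Python. This is not the stylesheet from the thread; `bools_to_attributes` is a hypothetical helper, and the input is assumed to be a single Solr `doc` element with `bool` children.

```python
import xml.etree.ElementTree as ET


def bools_to_attributes(doc_xml):
    """Turn each <bool name="x">true|false</bool> child into an x="1|0" attribute on <ID>."""
    doc = ET.fromstring(doc_xml)
    out = ET.Element("ID")
    for node in doc.findall("bool"):
        value = (node.text or "").strip()
        out.set(node.get("name"), "1" if value == "true" else "0")
    return ET.tostring(out, encoding="unicode")


sample = '<doc><str name="id">589587B2</str><bool name="pr">false</bool></doc>'
print(bools_to_attributes(sample))
```

The helper only looks at direct `bool` children, which mirrors the two `bool` templates: one value maps to 1, the other to 0, and the attribute name comes from the element's `name` attribute.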

Different scores for the same search

2015-07-23 Thread Tarala, Magesh
I'm executing a very simple search in a 3 node cluster - 3 shards with 1 replica each. Solr version 4.10.2: http://server1.domain.com:8983/solr/serviceorder_shard1_replica2/select?q=description%3Ajackshaft&fl=service_order&wt=json&indent=true&debugQuery=true I'm getting different scores when I run
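A select URL of this kind can be assembled programmatically so that parameter separators and escaping are handled consistently. The sketch below uses only the Python standard library; `build_select_url` is a hypothetical helper, and the host and core names are simply the ones quoted in the message.

```python
from urllib.parse import urlencode, parse_qs


def build_select_url(base, params):
    """Append URL-encoded query parameters to a Solr select handler URL."""
    return base + "?" + urlencode(params)


params = {
    "q": "description:jackshaft",
    "fl": "service_order",
    "wt": "json",
    "indent": "true",
    "debugQuery": "true",
}
url = build_select_url(
    "http://server1.domain.com:8983/solr/serviceorder_shard1_replica2/select",
    params,
)
print(url)
```

`urlencode` joins the pairs with `&` and percent-encodes reserved characters such as the colon in the `q` value.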

Issues sending mail to the list

2015-07-23 Thread Steven White
Hi Everyone, I'm seeing that some of my emails are not making it to the mailing list and I confirmed that I'm subscribed: Hi! This is the ezmlm program. I'm managing the solr-user@lucene.apache.org mailing list. I'm working for my owner, who can be reached at

Basic Auth (again)

2015-07-23 Thread Steven White
(re-posting as new email thread to see if this will make it to the list) That didn't help. I still get the same result and virtually no log to help me figure out where / what things are going wrong. Here is all that I see in C:\Solr\solr-5.2.1\server\logs\solr.log: INFO - 2015-07-23

Re: Basic Auth (again)

2015-07-23 Thread Peter Sturge
Hi Steve, What version of Jetty are you using? Have you got a webdefault.xml in your etc folder? If so, does it have an entry like this: <login-config> <auth-method>BASIC</auth-method> <realm-name>Realm Name as specified in jetty.xml</realm-name> </login-config> It's been a few years since I
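On the client side, HTTP Basic authentication amounts to sending a base64-encoded `user:password` pair in an `Authorization` header. The sketch below shows only that encoding step; `basic_auth_header` is a hypothetical helper and the credentials are placeholders, not values from this thread.

```python
import base64


def basic_auth_header(user, password):
    """Build the Authorization header used by HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": "Basic " + token}


headers = basic_auth_header("user", "pass")
print(headers)
```

Because the token is only encoded and not encrypted, Basic auth is normally paired with TLS when credentials cross a network.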

Re: cache implementation?

2015-07-23 Thread Mikhail Khludnev
Hello, I briefly described the similar problems at http://blog.griddynamics.com/2015/07/how-to-import-structured-data-into-solr.html Let me know if you have further questions On Thu, Jul 23, 2015 at 7:55 PM, cbuxbaum cbuxb...@tradestonesoftware.com wrote: That's OK, I have determined that

Re: Performance of facet contain search in 5.2.1

2015-07-23 Thread Erick Erickson
bq: I do not understand why anyone would ever use facet.prefix or facet.contains for any use other than a development... Gotta disagree a bit here. AFAIK, it depends on the number of unique terms in the field. How would either one be worse than facet.field? And you can freely use facet.field on a

Re: Using payloads and user provided data in score

2015-07-23 Thread Erick Erickson
bq: Your ugly problem is my situation I think ;) No, your problem is much worse ;( The _contents_ of fields are restricted, which is horrible. OK, here's another idea out of waaay left field: Payloads. It hinges on there being an OK number of possible combinations which seems to be the

Re: Inconsistent Solr Search Results

2015-07-23 Thread Erick Erickson
The query you're running would help. But here's a guess: You say you have a 3 node Solr cluster. By that I'm guessing you mean a single shard with 1 leader and 2 replicas. when the primary sort criteria (score by default) is tied between two documents, the internal Lucene doc ID is used as a

Re: cache implementation?

2015-07-23 Thread Erick Erickson
Your version of the config didn't come through, the mail program is pretty aggressive about stripping attachments and things. Best, Erick On Thu, Jul 23, 2015 at 8:20 AM, cbuxbaum cbuxb...@tradestonesoftware.com wrote: Hi, We are trying to improve the performance of our data import. We tried

Re: cache implementation?

2015-07-23 Thread cbuxbaum
That's OK, I have determined that caching is not relevant to our use case. However, I have a question about the full import queries that we are using: Here is the SQL from the top level entity: query=SELECT DISTINCT 'LEAP_PARTY' AS DOCUMENT_TYPE, VPARTY.OWNER AS

RE: Inconsistent Solr Search Results

2015-07-23 Thread Tarala, Magesh
Erick, The 3 node cluster is setup to use 3 shards each with 1 replica. So, the index is split on 3 servers. Another piece of info - I think the issue happens only when I use pagination. Verifying if that's the case.. Here's a query from the solr log on the server I'm pointing the query to:

Re: Performance of facet contain search in 5.2.1

2015-07-23 Thread Alessandro Benedetti
Hi Dave and Markus, I would definitely suggest using the *Suggester Component*. In particular, for your use case I suggest the AnalyzingInfixLookup strategy. As usual I suggest: Erick's post - http://lucidworks.com/blog/solr-suggester/ My post -

Re: XSLT with maps

2015-07-23 Thread Sreekant Sreedharan
Well, if you had a result say: ... <doc> <str name="id">589587B2B1CA4C4683FC106967E7C326</str> <str name="ar">EE3YYK</str> <int name="age">31034</int> </doc> ... applying the template: <xsl:template match="doc"> <ID NewID="{@id}" ... /> </xsl:template> would result in the following XML: <IMAGES> <ID NewID=""/> </IMAGES>

Re: XSLT with maps

2015-07-23 Thread Upayavira
you are correct, I should have said: <xsl:template match="doc"> <ID NewID="{str[@name='id']}".../> </xsl:template> On Thu, Jul 23, 2015, at 10:15 AM, Sreekant Sreedharan wrote: Well, if you had a result say: ... <doc> <str name="id">589587B2B1CA4C4683FC106967E7C326</str> <str name="ar">EE3YYK</str> <int

RE: Performance of facet contain search in 5.2.1

2015-07-23 Thread Markus Jelsma
Hello - You should index your terms as n-grams indeed, especially for autocompletion. I do not understand why anyone would ever use facet.prefix or facet.contains for any use other than a development tool. It won't perform on any index larger than small. Jan Høydahl has put up a thorough
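As a rough illustration of what indexing terms as n-grams means, the sketch below generates the substrings that would be stored for one term. It is plain Python, not a Solr analyzer; `edge_ngrams` and `ngrams` are hypothetical names, and the minimum and maximum lengths are arbitrary choices.

```python
def edge_ngrams(term, min_len=1, max_len=None):
    """Prefixes of the term, useful for prefix-style autocompletion."""
    max_len = len(term) if max_len is None else min(max_len, len(term))
    return [term[:n] for n in range(min_len, max_len + 1)]


def ngrams(term, min_len=2, max_len=3):
    """All substrings of the given lengths, useful for contains-style matching."""
    grams = []
    for n in range(min_len, min(max_len, len(term)) + 1):
        for start in range(len(term) - n + 1):
            grams.append(term[start:start + n])
    return grams


print(edge_ngrams("solr", 2))
print(ngrams("solr", 2, 3))
```

With the grams stored at index time, a prefix or substring lookup becomes an exact term match, which is why this approach tends to scale better than scanning every facet value at query time.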